How to Choose the Right Data Structure for a Problem

Choosing the right data structure is often the single most impactful decision you make when solving a coding problem. The same algorithm with a HashMap vs an Array can be the difference between O(n²) and O(n). This guide gives you a systematic framework for picking the right data structure every time — covering when to use arrays, hash maps, stacks, queues, heaps, trees, and graphs, with real code examples and interview problem patterns.

At a glance:
O(1): HashMap lookup — the most powerful optimization.
O(log n): sorted-array binary search or BST operations.
O(n): unsorted-array search — often avoidable with the right structure.
Heap: the answer when you need repeated min/max in O(log n).

1. The Decision Framework — Questions to Ask First

Rather than memorizing which structure to use for each problem type, train yourself to ask these questions in order. Each question narrows down the candidates significantly.

Decision questions in order

Ask these questions in order:
1) Do I need O(1) lookups by key? → HashMap or Set.
2) Do I need sorted order? → Sorted Array or TreeMap.
3) Do I process in LIFO order? → Stack.
4) Do I process in FIFO order? → Queue.
5) Do I need to find min/max repeatedly? → Heap.
6) Is the data hierarchical? → Tree.
7) Is it a network/graph problem? → Graph.
When none of the above applies, default to an Array.

2. Data Structure Time Complexity Reference

Data Structure | Key Operations | Best Use
Array | Access: O(1), Search: O(n), Insert/Delete middle: O(n) | Indexed access, iteration, sliding window, prefix sums
HashMap / HashSet | Get/Set/Delete/Contains: O(1) avg | Key-value lookups, frequency counting, deduplication
Stack | Push/Pop/Peek: O(1) | LIFO processing, undo history, DFS, bracket matching
Queue / Deque | Enqueue/Dequeue: O(1) | FIFO processing, BFS, sliding window max, task scheduling
Min/Max Heap | Insert: O(log n), Peek min/max: O(1), Poll: O(log n) | K-th largest/smallest, Dijkstra, merge sorted lists
Sorted Set / BST | Insert/Search/Delete: O(log n) | Range queries, ordered iteration, predecessor/successor
Trie | Insert/Search: O(m), where m = word length | Prefix matching, autocomplete, word dictionaries
Graph | BFS/DFS: O(V + E) | Connectivity, shortest path, network flow, dependency resolution
3. Pattern Recognition — When to Use What

Use HashMap when...

"Two Sum" pattern, counting frequencies, grouping elements by key, memoization cache, detecting duplicates. Any time you write "if x in array" repeatedly — convert to a Set for O(1) membership. Anagram detection, character frequency, prefix sum + complement lookup.
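
The "Two Sum" pattern mentioned above is the canonical one-pass HashMap example; a minimal sketch (the function name `two_sum` is illustrative):

```python
def two_sum(nums, target):
    """Return indices of the two numbers summing to target, else None. O(n)."""
    seen = {}                       # value -> index
    for i, n in enumerate(nums):
        complement = target - n
        if complement in seen:      # O(1) average lookup
            return [seen[complement], i]
        seen[n] = i
    return None
```

Each element is visited once, and the complement lookup replaces the inner loop that a brute-force O(n²) solution would need.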

Use Stack when...

Matching brackets/parentheses, evaluating expressions left-to-right, "next greater element" problems, DFS traversal without recursion, backtracking implementations, undo/redo, monotonic stack for span problems.
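
A "next greater element" sketch shows the monotonic-stack idea in its simplest form (function name `next_greater` is illustrative):

```python
def next_greater(nums):
    """For each element, the next element to its right that is larger, or -1. O(n)."""
    result = [-1] * len(nums)
    stack = []                          # indices whose values strictly decrease
    for i, n in enumerate(nums):
        while stack and nums[stack[-1]] < n:
            result[stack.pop()] = n     # n is the next greater for the popped index
        stack.append(i)
    return result
```

Every index is pushed and popped at most once, which is why the total work is O(n) despite the nested loop.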

Use Queue (Deque) when...

BFS traversal (level-order), sliding window maximum (monotonic deque), task scheduling, shortest path in unweighted graphs. Python: collections.deque. Java: ArrayDeque. JavaScript: no built-in deque — use an array with a head index rather than shift().

Use Heap when...

"Top K" problems (k largest/smallest elements), merging k sorted lists, median from a data stream, Dijkstra shortest path, Prim's MST, scheduling jobs with priorities. Min-heap for k largest; max-heap for k smallest (invert values).
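
"Merging k sorted lists" is worth sketching too, since it shows why the heap is sized k rather than N (function name `merge_k_sorted` is illustrative):

```python
import heapq

def merge_k_sorted(lists):
    """Merge k sorted lists into one sorted list. O(N log k), N = total elements."""
    # Heap of (value, list index, element index); the list index breaks ties.
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)
    merged = []
    while heap:
        val, i, j = heapq.heappop(heap)
        merged.append(val)
        if j + 1 < len(lists[i]):       # advance within the list we popped from
            heapq.heappush(heap, (lists[i][j + 1], i, j + 1))
    return merged
```

The heap never holds more than k entries, so each of the N pops and pushes costs O(log k).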

Use Sorted Set/TreeMap when...

Range queries with insertion/deletion, floor/ceiling lookups (find nearest value), sliding window where you need ordered access to elements, implementing a leaderboard with score updates.
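
Python has no built-in TreeMap (the third-party `sortedcontainers` package fills that gap), but for floor/ceiling lookups on a sorted list the standard-library `bisect` module gives the O(log n) search half of the story — note that inserting into a Python list is still O(n). A sketch (function name `floor_ceiling` is illustrative):

```python
import bisect

def floor_ceiling(sorted_vals, x):
    """Largest value <= x and smallest value >= x in a sorted list, O(log n) each."""
    i = bisect.bisect_right(sorted_vals, x)    # first index with value > x
    floor = sorted_vals[i - 1] if i > 0 else None
    j = bisect.bisect_left(sorted_vals, x)     # first index with value >= x
    ceiling = sorted_vals[j] if j < len(sorted_vals) else None
    return floor, ceiling
```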

Use Trie when...

Prefix searching, autocomplete systems, word validation with prefix constraints. For simple exact-match lookups a plain HashMap is enough; a Trie pays off only when prefixes matter. Key advantage: O(m) search where m is the word length, independent of the total number of words.
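
A minimal Trie sketch with insert and prefix search (the class shape here is one common layout, not the only one):

```python
class Trie:
    """Minimal trie supporting insert and prefix search, O(m) per operation."""

    def __init__(self):
        self.children = {}              # char -> child Trie node
        self.is_word = False

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def starts_with(self, prefix):
        node = self
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True
```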

4. Code Examples by Problem Pattern

Common patterns with correct data structures (Python):
from collections import defaultdict, deque, Counter
import heapq

# ─── Pattern 1: Frequency counting (HashMap / Counter) ───────────────────────
def most_common_element(nums):
    count = Counter(nums)           # O(n) — HashMap under the hood
    return count.most_common(1)[0]  # O(n log n) for full sort, O(n) for single max

def group_anagrams(words):
    groups = defaultdict(list)
    for w in words:
        key = tuple(sorted(w))      # sort chars as the HashMap key
        groups[key].append(w)
    return list(groups.values())    # O(n * m log m) where m = avg word length

# ─── Pattern 2: Bracket matching (Stack) ─────────────────────────────────────
def is_valid_brackets(s):
    stack = []
    pairs = {')': '(', '}': '{', ']': '['}
    for char in s:
        if char in '({[':
            stack.append(char)
        elif not stack or stack[-1] != pairs[char]:
            return False
        else:
            stack.pop()
    return not stack  # O(n) time, O(n) space

# ─── Pattern 3: K largest elements (Min-Heap of size K) ──────────────────────
def k_largest(nums, k):
    """O(n log k) — much better than O(n log n) sort when k << n."""
    heap = []
    for n in nums:
        heapq.heappush(heap, n)
        if len(heap) > k:
            heapq.heappop(heap)      # remove smallest, keeping k largest
    return sorted(heap, reverse=True)

# ─── Pattern 4: BFS shortest path (Queue) ────────────────────────────────────
def bfs_shortest_path(graph, start, end):
    """O(V + E) — queue ensures shortest path in unweighted graph."""
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        node, dist = queue.popleft()
        if node == end:
            return dist
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return -1  # no path found

# ─── Pattern 5: Sliding window max (Monotonic Deque) ─────────────────────────
def sliding_window_max(nums, k):
    """O(n) — deque maintains decreasing order within window."""
    dq = deque()  # stores indices, values decrease from front to back
    result = []
    for i, n in enumerate(nums):
        # Remove elements outside window
        while dq and dq[0] < i - k + 1:
            dq.popleft()
        # Remove smaller elements (they can never be the max)
        while dq and nums[dq[-1]] < n:
            dq.pop()
        dq.append(i)
        if i >= k - 1:
            result.append(nums[dq[0]])  # front is always the max
    return result

5. Common Interview Mistakes in Structure Choice

Array when HashSet is better

Any time you write "if x in list" in a loop, you have O(n²). Convert list to set first: seen = set(arr). Now membership check is O(1). Classic: "contains duplicate" solved in O(n) with a set, O(n²) with nested loops.
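
The "contains duplicate" classic in its set-based O(n) form (function name `contains_duplicate` is illustrative):

```python
def contains_duplicate(nums):
    """O(n) with a set; the nested-loop version would be O(n^2)."""
    seen = set()
    for n in nums:
        if n in seen:               # O(1) average membership check
            return True
        seen.add(n)
    return False
```

The one-liner `len(set(nums)) != len(nums)` works too, but the loop version short-circuits as soon as a duplicate appears.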

Sort when Heap is better

For "k largest" problems, sorting is O(n log n) and uses O(n) extra space. A min-heap of size k is O(n log k) and O(k) space. When k << n, the heap is significantly faster. Never sort to get only k elements from a large array.

Queue with array.shift()

JavaScript array.shift() is O(n) — it moves all elements. Use a head pointer trick or a proper deque structure. For BFS, using array with shift() makes BFS O(n²) instead of O(n). Python: use collections.deque; JavaScript: maintain a head index.
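
The same head-pointer trick, sketched here in Python to match the other examples (in Python you would normally just use collections.deque; the function name `bfs_order` is illustrative):

```python
def bfs_order(graph, start):
    """BFS using a plain list plus a head index instead of pop(0) / shift().
    Each dequeue is O(1) because the list is never shifted, so BFS stays O(V + E)."""
    queue, head = [start], 0
    visited = {start}
    order = []
    while head < len(queue):
        node = queue[head]
        head += 1                   # O(1) "dequeue": advance the pointer
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order
```

The visited list grows but is never mutated at the front, which is exactly what a JavaScript array with a head index gives you.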

Recursive DFS without memoization

Many tree/graph problems look like DFS but have overlapping subproblems. Without memoization (a HashMap of computed results), you recompute the same paths exponentially. Always ask: "have I seen this state before?" and cache results in a HashMap.
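
A small sketch of caching a DFS state: counting unique right/down paths in a grid. Without the cache the recursion is exponential; with it, each (row, col) state is computed once, so the whole thing is O(rows × cols). The function name `count_paths` is illustrative:

```python
from functools import lru_cache

def count_paths(rows, cols):
    """Unique paths from top-left to bottom-right moving only right or down."""
    @lru_cache(maxsize=None)        # HashMap of (r, c) -> result under the hood
    def go(r, c):
        if r == rows - 1 or c == cols - 1:
            return 1                # only one way along the last row/column
        return go(r + 1, c) + go(r, c + 1)
    return go(0, 0)
```

`lru_cache` is exactly the "HashMap of computed results" the text describes, keyed on the call arguments.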

Two-pointer + sorted array vs HashMap — space trade-off

For "find pairs that sum to target", two approaches: 1) Sort + two pointers: O(n log n) time, O(1) extra space. 2) HashMap: O(n) time, O(n) space. If the problem requires returning indices (not just values), HashMap is necessary. If only the existence of the pair matters and space is constrained, sort + two pointers wins. Always consider the space constraint when choosing between these approaches.
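
The existence-only variant of approach 1 looks like this (function name `has_pair_sum` is illustrative; `sorted()` copies the input here, so sort in place on a mutable input if you need true O(1) extra space):

```python
def has_pair_sum(nums, target):
    """Sort + two pointers: O(n log n) time. Returns True if any pair sums to target."""
    nums = sorted(nums)
    lo, hi = 0, len(nums) - 1
    while lo < hi:
        s = nums[lo] + nums[hi]
        if s == target:
            return True
        if s < target:
            lo += 1                 # need a bigger sum: move left pointer right
        else:
            hi -= 1                 # need a smaller sum: move right pointer left
    return False
```

Note that sorting destroys the original indices, which is why the HashMap approach is the one to reach for when indices must be returned.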

Frequently Asked Questions