Data structures are the building blocks of efficient programs. Understanding how to organize and store data effectively is crucial for writing fast, scalable software and acing technical interviews.
Why Data Structures Matter
Choosing the right data structure can mean the difference between a program that runs in milliseconds and one that takes hours. Data structures determine:
- Speed: How fast can we access, insert, or delete data?
- Memory: How efficiently do we use computer memory?
- Simplicity: How easy is the code to write and maintain?
Arrays
Arrays store elements in contiguous memory locations, allowing instant access to any element by its index.
Characteristics
- Access: O(1) - instant by index
- Search: O(n) - must check each element
- Insert/Delete: O(n) - may need to shift elements
# Python array (list) operations
arr = [10, 20, 30, 40, 50]
# Access by index - O(1)
print(arr[2]) # Output: 30
# Append to end - O(1) amortized
arr.append(60)
# Insert at position - O(n)
arr.insert(1, 15) # [10, 15, 20, 30, 40, 50, 60]
# Delete by index - O(n)
del arr[3] When to Use Arrays
- You need fast access by index
- Data size is known or changes infrequently
- Memory locality matters (cache-friendly)
Linked Lists
Linked lists store elements in nodes, where each node points to the next. Unlike arrays, elements aren't stored contiguously in memory.
Types of Linked Lists
- Singly Linked: Each node points to the next
- Doubly Linked: Nodes point to both next and previous
- Circular: Last node points back to the first
class Node:
def __init__(self, data):
self.data = data
self.next = None
class LinkedList:
def __init__(self):
self.head = None
def append(self, data):
new_node = Node(data)
if not self.head:
self.head = new_node
return
current = self.head
while current.next:
current = current.next
current.next = new_node
def prepend(self, data):
new_node = Node(data)
new_node.next = self.head
self.head = new_node Complexity
- Access: O(n) - must traverse from head
- Insert at head: O(1)
- Insert at tail: O(n) or O(1) with tail pointer
- Delete: O(n) to find, O(1) to remove
Stacks
A stack is a Last-In-First-Out (LIFO) data structure. Think of a stack of plates - you can only add or remove from the top.
Operations
push- Add element to top - O(1)pop- Remove element from top - O(1)peek- View top element - O(1)
class Stack:
def __init__(self):
self.items = []
def push(self, item):
self.items.append(item)
def pop(self):
if not self.is_empty():
return self.items.pop()
def peek(self):
if not self.is_empty():
return self.items[-1]
def is_empty(self):
return len(self.items) == 0 Use Cases
- Function call stack (recursion)
- Undo/redo operations
- Expression evaluation
- Backtracking algorithms
Queues
A queue is a First-In-First-Out (FIFO) data structure. Like a line at a store - the first person in line is served first.
from collections import deque
class Queue:
def __init__(self):
self.items = deque()
def enqueue(self, item):
self.items.append(item)
def dequeue(self):
if not self.is_empty():
return self.items.popleft()
def front(self):
if not self.is_empty():
return self.items[0]
def is_empty(self):
return len(self.items) == 0 Use Cases
- Task scheduling
- Breadth-first search
- Print job management
- Message queues in distributed systems
Hash Tables
Hash tables (dictionaries in Python) provide near-instant lookup by converting keys into array indices using a hash function.
Complexity (Average Case)
- Access/Search: O(1)
- Insert: O(1)
- Delete: O(1)
# Python dictionary (hash table)
user = {
"name": "Alice",
"age": 30,
"email": "alice@example.com"
}
# Access - O(1)
print(user["name"])
# Insert/Update - O(1)
user["city"] = "New York"
# Delete - O(1)
del user["age"]
# Check existence - O(1)
if "email" in user:
print(user["email"]) Collision Handling
When two keys hash to the same index (collision), common strategies include:
- Chaining: Store colliding elements in a linked list
- Open addressing: Find the next available slot
Trees
Trees are hierarchical data structures with a root node and child nodes. They're fundamental for organizing hierarchical data and enabling efficient operations.
Binary Search Tree (BST)
A BST maintains the property: left children < parent < right children.
class TreeNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
class BST:
def __init__(self):
self.root = None
def insert(self, value):
if not self.root:
self.root = TreeNode(value)
else:
self._insert_recursive(self.root, value)
def _insert_recursive(self, node, value):
if value < node.value:
if node.left is None:
node.left = TreeNode(value)
else:
self._insert_recursive(node.left, value)
else:
if node.right is None:
node.right = TreeNode(value)
else:
self._insert_recursive(node.right, value)
def search(self, value):
return self._search_recursive(self.root, value)
def _search_recursive(self, node, value):
if node is None or node.value == value:
return node
if value < node.value:
return self._search_recursive(node.left, value)
return self._search_recursive(node.right, value) BST Complexity (Balanced)
- Search: O(log n)
- Insert: O(log n)
- Delete: O(log n)
Other Tree Types
- AVL Trees: Self-balancing BST
- Red-Black Trees: Balanced with color properties
- B-Trees: Used in databases and file systems
- Heaps: Complete binary tree for priority queues
Graphs
Graphs consist of vertices (nodes) connected by edges. They model relationships between objects - social networks, maps, dependencies, etc.
Representations
# Adjacency List (most common)
graph = {
'A': ['B', 'C'],
'B': ['A', 'D', 'E'],
'C': ['A', 'F'],
'D': ['B'],
'E': ['B', 'F'],
'F': ['C', 'E']
}
# Adjacency Matrix
# A B C D E F
# A [[0, 1, 1, 0, 0, 0],
# B [1, 0, 0, 1, 1, 0],
# C [1, 0, 0, 0, 0, 1],
# D [0, 1, 0, 0, 0, 0],
# E [0, 1, 0, 0, 0, 1],
# F [0, 0, 1, 0, 1, 0]] Graph Traversal
# Breadth-First Search (BFS)
from collections import deque
def bfs(graph, start):
visited = set([start])
queue = deque([start])
while queue:
vertex = queue.popleft()
print(vertex, end=' ')
for neighbor in graph[vertex]:
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
# Depth-First Search (DFS)
def dfs(graph, start, visited=None):
if visited is None:
visited = set()
visited.add(start)
print(start, end=' ')
for neighbor in graph[start]:
if neighbor not in visited:
dfs(graph, neighbor, visited) Choosing the Right Data Structure
| Use Case | Best Choice | Why |
|---|---|---|
| Fast lookup by key | Hash Table | O(1) average access |
| Ordered data with fast search | BST / Balanced Tree | O(log n) operations, maintains order |
| LIFO operations | Stack | O(1) push/pop |
| FIFO operations | Queue | O(1) enqueue/dequeue |
| Frequent insertions/deletions | Linked List | O(1) insert/delete at known position |
| Fast access by index | Array | O(1) random access |
| Relationships/connections | Graph | Models connections naturally |
Next Steps
- Algorithms - Learn sorting, searching, and problem-solving techniques
- Interview Prep - Practice data structure problems