Choosing the Right Container
What You'll Learn
How to choose the right data structure for any task — and why the choice matters for performance, safety, and readability.
The Decision Guide
Do you need key-value pairs?
YES → dict (or defaultdict, Counter, OrderedDict)
NO ↓
Do values need to be unique?
YES → set (or frozenset if immutable)
NO ↓
Is the order fixed and the data constant?
YES → tuple
NO → list
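The guide above can also be read as a small function. This is only an illustrative sketch: `suggest_container` and its three boolean parameters are hypothetical names invented here, not part of any library.

```python
def suggest_container(needs_key_value: bool, needs_unique: bool, is_fixed: bool) -> str:
    """Walk the decision guide top to bottom and return a container name.

    Hypothetical helper, for illustration only.
    """
    if needs_key_value:
        return "dict"
    if needs_unique:
        # frozenset is the immutable variant of set
        return "frozenset" if is_fixed else "set"
    return "tuple" if is_fixed else "list"


print(suggest_container(needs_key_value=False, needs_unique=True, is_fixed=False))  # set
print(suggest_container(needs_key_value=False, needs_unique=False, is_fixed=True))  # tuple
```

Real decisions usually weigh more than three yes/no questions, so treat this as a starting point rather than a rule.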
Side-by-Side Comparison
| Feature | list | tuple | set | dict |
|---|---|---|---|---|
| Ordered | ✅ | ✅ | ❌ | ✅ (Python 3.7+) |
| Mutable | ✅ | ❌ | ✅ | ✅ |
| Allows duplicates | ✅ | ✅ | ❌ | keys: ❌, values: ✅ |
| Fast lookup (in) | ❌ O(n) | ❌ O(n) | ✅ O(1) | ✅ O(1) keys |
| Use as dict key | ❌ | ✅ (if all items are hashable) | ❌ (frozenset: ✅) | ❌ |
| Indexable | ✅ | ✅ | ❌ | ✅ by key |
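The "Use as dict key" row comes down to hashability. The short sketch below shows which containers Python accepts as keys; the sample data (`grid`, `teams`) is made up for illustration.

```python
# Tuples of hashable items can be dict keys; lists cannot.
grid = {(0, 0): "start", (2, 3): "goal"}
print(grid[(2, 3)])  # goal

try:
    bad = {[0, 0]: "start"}
except TypeError as exc:
    print(f"TypeError: {exc}")  # lists are unhashable

# A set cannot be a key either, but its immutable sibling frozenset can.
teams = {frozenset({"alice", "bob"}): "pair A"}
print(teams[frozenset({"bob", "alice"})])  # pair A (element order is irrelevant)

# A tuple that contains a list is not hashable, so it cannot be a key.
try:
    hash((1, [2, 3]))
    tuple_with_list_hashable = True
except TypeError:
    tuple_with_list_hashable = False
print(tuple_with_list_hashable)  # False
```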
When to Use Each
Use a list when:
- Order matters and you'll add, remove, or change items
- You need indexing: items[0], items[-1]
- You have duplicate values you want to keep
# Shopping cart — order and duplicates matter
cart = ["apple", "bread", "apple", "milk"]
cart.append("eggs")
cart.remove("apple") # removes first occurrence
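Indexing and in-place changes are the other half of what makes a list useful. A brief sketch, reusing the shopping cart data:

```python
cart = ["apple", "bread", "apple", "milk"]

first = cart[0]       # "apple"
last = cart[-1]       # "milk"
middle = cart[1:3]    # ["bread", "apple"]

cart[1] = "bagel"     # lists are mutable, so items can be replaced in place
apple_count = cart.count("apple")  # duplicates are kept, so this is 2

print(first, last, middle, cart, apple_count)
```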
Use a tuple when:
- Data is fixed and shouldn't change
- Returning multiple values from a function
- Storing coordinate pairs, records, or constants
# Fixed record
point = (37.7749, -122.4194) # (latitude, longitude)
# Return multiple values
def get_bounds(data):
    return min(data), max(data)
low, high = get_bounds([3, 1, 4, 1, 5])
# Named tuples for clarity
from collections import namedtuple
Point = namedtuple("Point", ["x", "y"])
p = Point(10, 20)
print(p.x, p.y) # 10 20
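To see what "fixed" means in practice, the sketch below tries to modify a tuple. It also shows a caveat worth knowing: immutability is shallow, so a mutable object stored inside a tuple can still change.

```python
point = (37.7749, -122.4194)

try:
    point[0] = 40.0  # tuples do not support item assignment
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# To "change" a tuple, build a new one.
moved = (40.0, point[1])
print(moved)  # (40.0, -122.4194)

# Caveat: a list stored inside a tuple can still be mutated.
record = ("alice", [90, 85])
record[1].append(77)
print(record)  # ('alice', [90, 85, 77])
```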
Use a set when:
- You need unique values only
- You need fast membership testing (in)
- You're doing mathematical set operations (union, intersection)
# Remove duplicates from a list (note: the original order is not preserved)
items = [3, 1, 4, 1, 5, 9, 2, 6, 5]
unique = list(set(items))
# Fast membership test on large data
valid_ids = {1001, 1002, 1003, 1004, 1005}
user_id = 1003  # example value
if user_id in valid_ids:  # O(1) on average, regardless of set size
    print("access granted")
# Find common elements
python_users = {"alice", "bob", "charlie"}
js_users = {"bob", "diana", "charlie"}
both = python_users & js_users # {'bob', 'charlie'}
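Intersection is only one of the set operations mentioned above. The sketch below adds union, difference, symmetric difference, and a subset test on the same sample data.

```python
python_users = {"alice", "bob", "charlie"}
js_users = {"bob", "diana", "charlie"}

either = python_users | js_users       # union
only_python = python_users - js_users  # difference
exactly_one = python_users ^ js_users  # symmetric difference

# sorted() gives a stable display order, since sets are unordered.
print(sorted(either))           # ['alice', 'bob', 'charlie', 'diana']
print(sorted(only_python))      # ['alice']
print(sorted(exactly_one))      # ['alice', 'diana']
print({"bob"} <= python_users)  # True: subset test
```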
Use a dict when:
- You need to look up values by a key (name, ID, config key)
- You're building lookup tables, counters, or config objects
- You're grouping data by a category
# Lookup table
country_codes = {"US": "United States", "GB": "United Kingdom", "DE": "Germany"}
print(country_codes["US"])  # United States
# Counter
from collections import Counter
word_freq = Counter("the quick brown fox jumps over the lazy dog".split())
# Config
config = {
    "host": "localhost",
    "port": 5432,
    "timeout": 30,
}
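Two everyday dict techniques round out the examples above: reading a key that may be missing, and grouping by category. This is a minimal sketch, and the `products` rows are made-up sample data.

```python
country_codes = {"US": "United States", "GB": "United Kingdom", "DE": "Germany"}

# country_codes["FR"] would raise KeyError; get() returns a default instead.
name = country_codes.get("FR", "unknown")
print(name)  # unknown

# Grouping by category with a plain dict and setdefault().
products = [
    {"name": "apple", "category": "fruit"},
    {"name": "bread", "category": "bakery"},
    {"name": "pear", "category": "fruit"},
]
by_category = {}
for product in products:
    by_category.setdefault(product["category"], []).append(product["name"])

print(by_category)  # {'fruit': ['apple', 'pear'], 'bakery': ['bread']}
```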
Collections Module — Specialized Containers
namedtuple — Readable Records
from collections import namedtuple
# Instead of (95, "Alice", "A")
Record = namedtuple("Record", ["score", "name", "grade"])
r = Record(score=95, name="Alice", grade="A")
print(r.name) # Alice
print(r.score) # 95
print(r) # Record(score=95, name='Alice', grade='A')
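A namedtuple is still a tuple, so it stays immutable and supports indexing and unpacking. The sketch below shows this along with `_replace()` and `_asdict()`, two methods every namedtuple class provides.

```python
from collections import namedtuple

Record = namedtuple("Record", ["score", "name", "grade"])
r = Record(score=95, name="Alice", grade="A")

# Still a tuple: indexing and unpacking work.
score, name, grade = r
print(r[0], name)  # 95 Alice

# Fields cannot be reassigned; _replace() returns a modified copy.
r2 = r._replace(score=98)
print(r2)  # Record(score=98, name='Alice', grade='A')

# _asdict() converts the record to a mapping, e.g. for serialization.
as_dict = dict(r._asdict())
print(as_dict)  # {'score': 95, 'name': 'Alice', 'grade': 'A'}
```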
deque — Efficient Queue
from collections import deque
# List: O(n) to insert at front — use deque instead
queue = deque()
queue.append("task1") # add to right
queue.append("task2")
queue.appendleft("urgent") # add to left (O(1))
print(queue.popleft()) # remove from left (O(1)): "urgent"
# Fixed-size buffer (last N items)
recent = deque(maxlen=5)
for i in range(10):
    recent.append(i)
print(list(recent)) # [5, 6, 7, 8, 9]
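Putting the queue operations together, a first-in, first-out processing loop might look like the sketch below. The task names are placeholders.

```python
from collections import deque

queue = deque(["task1", "task2"])
queue.appendleft("urgent")

processed = []
while queue:
    # popleft() is O(1); list.pop(0) would shift every remaining item, O(n)
    processed.append(queue.popleft())

print(processed)  # ['urgent', 'task1', 'task2']

# extend() respects maxlen too, so old items fall off the left.
recent = deque([1, 2, 3], maxlen=3)
recent.extend([4, 5])
print(list(recent))  # [3, 4, 5]
```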
OrderedDict — Guaranteed Order (Pre-3.7)
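Regular dicts have preserved insertion order as a language guarantee since Python 3.7. Reach for OrderedDict only when you need its extras, such as move_to_end(), popitem(last=False), or order-sensitive equality comparisons: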
from collections import OrderedDict
cache = OrderedDict()
cache["a"] = 1
cache["b"] = 2
cache.move_to_end("a") # move to end (for LRU cache pattern)
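The LRU cache pattern mentioned in the comment can be sketched as follows. `LRUCache` is a hypothetical class written for this example, not a standard library type. It is deliberately minimal and not thread-safe.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache (illustrative sketch only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry

    def keys(self):
        return list(self._data)


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now the most recently used
cache.put("c", 3)    # over capacity, so "b" is evicted
print(cache.keys())  # ['a', 'c']
```

For memoizing function calls specifically, the standard library's `functools.lru_cache` decorator is usually the simpler choice.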
Performance Summary
| Operation | list | tuple | set | dict |
|---|---|---|---|---|
| x in items | O(n) slow | O(n) slow | O(1) fast | O(1) fast (keys) |
| Append | O(1) amortized | N/A | O(1) average (add) | O(1) average (new key) |
| Insert at front | O(n) slow | N/A | N/A | N/A |
| Index access | O(1) | O(1) | N/A | O(1) average (by key) |
| Memory | Medium | Low | Medium | High |
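The membership row is easy to check for yourself with `timeit`. The sketch below looks up the last element, which is the worst case for a list. The size and repeat count are arbitrary choices, and exact timings depend on the machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# The list scans element by element; the set does a single hash lookup.
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_time:.5f}s  set: {set_time:.5f}s")
```

On typical hardware the set lookup should be faster by several orders of magnitude, and the gap widens as `n` grows.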
Common Patterns
# Deduplicate while preserving order
# (set.add() returns None, so "not seen.add(x)" is always True; it only records x as seen)
seen = set()
unique = [x for x in items if x not in seen and not seen.add(x)]
# Frequency count
from collections import Counter
freq = Counter(items)
top3 = freq.most_common(3)
# Group by key
from collections import defaultdict
groups = defaultdict(list)
for item in items:  # here each item is assumed to be a dict with a "category" key
    groups[item["category"]].append(item)
# Fixed-size sliding window
from collections import deque
window = deque(maxlen=10)
# Named record
from collections import namedtuple
User = namedtuple("User", ["id", "name", "email"])
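As one concrete use of the sliding-window pattern above, a moving average can be sketched with a bounded deque. `moving_average` is a hypothetical helper written for this example.

```python
from collections import deque


def moving_average(values, size):
    """Yield the mean of the most recent `size` values (hypothetical helper)."""
    window = deque(maxlen=size)
    for value in values:
        window.append(value)  # once full, the oldest value is dropped automatically
        yield sum(window) / len(window)


averages = list(moving_average([10, 20, 30, 40], size=2))
print(averages)  # [10.0, 15.0, 25.0, 35.0]
```

Note that the first results average fewer than `size` values, because the window is still filling up.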
Quick Cheatsheet
list → ordered, mutable, duplicates OK
tuple → ordered, immutable, use for fixed records
set → unordered, unique, fast membership
dict → key → value, fast lookup
# Specialized
from collections import (
namedtuple, # readable records
deque, # fast queue/deque
defaultdict, # auto-default values
Counter, # count occurrences
OrderedDict, # ordered + move_to_end
)