Effective Python: zip, enumerate, and iter
This short post looks at three built-in Python functions: zip, enumerate, and iter.
Let’s make an iterable, specifically a list, that we can use in our examples.
lst = [chr(65+i) for i in range(26)]
lst
# A, B, ..., Z
In older programming languages we would iterate over lst with an index loop, accessing each element by position.
for i in range(len(lst)):
    print(i, lst[i])
A modern, Pythonic approach iterates directly over the list.
i = 0
for x in lst:
    print(i, x)
    i += 1
Keeping track of the index manually is ugly. Is there a better way? One straw-man approach takes pairs from a range and lst. The zip function does just that: it produces an iterable yielding pairs (or, more generally, tuples if you pass more arguments) until the shortest argument is exhausted. We can write:
for i, x in zip(range(len(lst)), lst):
    print(i, x)
To see what zip does, evaluate it into a list:
list(zip(range(4), lst[:4]))
# [(0, 'A'), (1, 'B'), (2, 'C'), (3, 'D')]
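To make the "shortest argument" rule concrete, here is a small sketch with three arguments of different lengths. The names letters, numbers, and flags are made up for illustration and do not appear elsewhere in this post.

```python
# Three iterables of different lengths (illustrative names only).
letters = ['A', 'B', 'C', 'D']
numbers = range(3)                        # the shortest argument: 0, 1, 2
flags = [True, False, True, False, True]

# zip yields 3-tuples and stops as soon as `numbers` runs out.
triples = list(zip(letters, numbers, flags))
print(triples)
# [('A', 0, True), ('B', 1, False), ('C', 2, True)]
```

The leftover 'D' and the last two flags are silently ignored, which is worth keeping in mind when the inputs are expected to have equal length.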
This pattern is so common that Python provides the enumerate function to achieve the same thing.
for i, x in enumerate(lst):
    print(i, x)
Again, to see what’s going on:
list(enumerate(lst[:4]))
# [(0, 'A'), (1, 'B'), (2, 'C'), (3, 'D')]
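One detail that may be useful here: enumerate takes an optional start argument, so the counter need not begin at zero. A minimal sketch, rebuilding lst so the snippet runs on its own:

```python
lst = [chr(65 + i) for i in range(26)]

# The optional `start` argument sets the first value of the counter.
numbered = list(enumerate(lst[:4], start=1))
print(numbered)
# [(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D')]
```

This avoids writing i + 1 inside the loop when a 1-based count is wanted, for example when numbering lines of output.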
A handy use of zip is to create a dictionary from two iterables: one for keys and the other for values. For instance:
d = dict(zip(lst[::-1], lst))
d
# {'Z': 'A', 'Y': 'B', 'X': 'C', ... }
We can iterate over key/value pairs using the dict.items method.
for k, v in d.items():
    print(k, v)
What if we want the key/value pairs enumerated? A first guess might be:
for i, k, v in enumerate(d.items()):
    print(i, k, v)
# ValueError: not enough values to unpack (expected 3, got 2)
Again, forcing the result to a list shows what is going on:
list(enumerate(d.items()))
# [(0, ('Z', 'A')), (1, ('Y', 'B')), (2, ('X', 'C')), ...
The enumeration produces pairs of values: an int and a tuple. The correct solution unpacks the inner tuple explicitly:
for i, (k, v) in enumerate(d.items()):
    print(i, k, v)
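The same nested unpacking works whenever enumerate wraps something that yields tuples, not only dict.items. As a sketch, here it is applied directly to zip, skipping the intermediate dictionary; the rows list is just an illustrative way to collect the results.

```python
lst = [chr(65 + i) for i in range(26)]

rows = []
# enumerate yields (index, item); each item from zip is itself a pair,
# so the pair is unpacked with a parenthesised target.
for i, (k, v) in enumerate(zip(lst[::-1], lst)):
    rows.append((i, k, v))

print(rows[:3])
# [(0, 'Z', 'A'), (1, 'Y', 'B'), (2, 'X', 'C')]
```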
The iter function creates an iterator from an iterable, that is, an object on which next can be called. Iterators have many uses (this post shows how to use them to create graphics, for one). One clever application is to serve up elements of an iterator \(n\) at a time. For example, to get letters 5 at a time from lst we can use this pattern:
for v, w, x, y, z in zip(*[iter(lst)]*5):
    print(v, w, x, y, z)
How does this work? Let’s expand the code. It is equivalent to
it = iter(lst)
for v, w, x, y, z in zip(it, it, it, it, it):
    print(v, w, x, y, z)
As zip runs, it calls next on the same iterator it. The list [iter(lst)]*5 holds five references to the same iterator, not five separate iterators. Calling zip(*[iter(lst)]*5) expands the list into separate arguments for zip. We might quibble that this pattern misses Z, because zip stops as soon as an argument is exhausted, so the incomplete final group is discarded.
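The shared-iterator behaviour can be checked directly. The sketch below confirms that the list holds one iterator object five times, consumes the first group by hand with next, and then lets zip group whatever remains; the variable names its, first, and groups are illustrative.

```python
lst = [chr(65 + i) for i in range(26)]

it = iter(lst)
its = [it] * 5                      # five references to one iterator, not five copies
print(all(x is it for x in its))    # True

# Each next() call advances the single shared iterator.
first = (next(it), next(it), next(it), next(it), next(it))
print(first)
# ('A', 'B', 'C', 'D', 'E')

# zip groups the remaining 21 letters; 'Z' is dropped because the
# last group cannot be completed.
groups = list(zip(*its))
print(len(groups), groups[-1])
# 4 ('U', 'V', 'W', 'X', 'Y')
```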
The itertools library contains a version of zip, zip_longest, that continues until the longest argument is exhausted and uses fillvalue to pad out the results.
from itertools import zip_longest
for i in zip_longest(*[iter(lst)]*5, fillvalue='-'):
    print(i)
# ('A', 'B', 'C', 'D', 'E')
# ('F', 'G', 'H', 'I', 'J')
# ('K', 'L', 'M', 'N', 'O')
# ('P', 'Q', 'R', 'S', 'T')
# ('U', 'V', 'W', 'X', 'Y')
# ('Z', '-', '-', '-', '-')
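If the group size varies, the pattern can be wrapped in a small function. The helper below is a hypothetical sketch named chunks (not a built-in and not from the original post) that generalizes the group size of 5 to any n.

```python
from itertools import zip_longest

def chunks(iterable, n, fillvalue=None):
    """Return an iterator of n-tuples, padding the last one with fillvalue."""
    args = [iter(iterable)] * n      # n references to the same iterator
    return zip_longest(*args, fillvalue=fillvalue)

lst = [chr(65 + i) for i in range(26)]
result = list(chunks(lst, 10, fillvalue='-'))
print(result[-1])
# ('U', 'V', 'W', 'X', 'Y', 'Z', '-', '-', '-', '-')
```

Calling iter on the argument inside the helper means it accepts any iterable, including a list, a string, or a generator.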
I find zip, enumerate, and iter to be very handy, and use them frequently.
Happy coding!