The Easiest Explanation of Python's Itertools and its Applications.
What is an iterator?
Do not judge a book by its cover, they say…however, in this book, we will unravel as much as we can before delving into the subject's intricacies — this should be quite a lengthy read, so buckle up! 💪🏾
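According to w3schools, an iterator, as applied in Python, is an object that contains a countable number of values. This means the object can be traversed, and an action can be taken on each item as it is returned. Strictly speaking, an iterable is any object that can be traversed (a list, a string, a set), while an iterator is the object that does the traversing and hands back one item at a time. Moving forward, the term iterable will signify any object that can be traversed.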
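To make the distinction concrete, here is a minimal sketch (the variable names are only illustrative) of how a list, which is an iterable, hands out an iterator through iter(), and how next() pulls one value at a time until the iterator is exhausted.

```python
# A list is an iterable: it can hand out a fresh iterator on request.
letters = ["a", "b", "c"]

# iter() returns an iterator; next() pulls one value at a time.
letters_iterator = iter(letters)
first = next(letters_iterator)
second = next(letters_iterator)
third = next(letters_iterator)

# Once every value has been returned, next() raises StopIteration.
try:
    next(letters_iterator)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```

A for loop does the same thing behind the scenes, which is why every itertool below can be used directly in one.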
What are Python's itertools?
Itertools, as applied in Python and according to its documentation, is a module (this means it must be imported before use, conventionally at the top of the Python program) that standardizes a core set of fast, memory-efficient tools that are useful by themselves or in combination.
The itertools functions are commonly categorized into the following:
- Infinite iterators
- Iterators that terminate on the shortest input sequence
- Combinatoric iterators
Am I the only one who has a weird unstoppable feeling when using such tools? Somewhat like I am Superman!🤔💭😅
Infinite Iterators
- itertools.count(start=0, step=1): Makes an infinite iterator that returns values starting from the start argument (0 by default) and incrementing by step (1 by default).
Note: remember to provide a condition to break out of this infinite loop.
import itertools
# Example 1
fruits = ["Cassava", "Onions", "Gumbear"] # These are not actual fruits o 😂
for i in itertools.count(start=0): # If step is not provided, increments by 1
    if i > len(fruits)-1: # Condition to break from the infinite loop
        break
    else:
        print(fruits[i])
# Output:
Cassava
Onions
Gumbear
# Example 2. Ps: Don't do this, else you get stuck in an infinite loop!!!
for foo in itertools.count(5, 15): # This initiates a counter from 5, adding 15 on every count.
    print(foo)
# Output:
5
20
35
50
65
...
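As an alternative to a manual break, a counter can be paired with something finite. The sketch below is one possible approach, not the only one: zip() stops at the shortest input, and itertools.islice() takes a fixed number of values.

```python
import itertools

fruits = ["Cassava", "Onions", "Gumbear"]

# zip() stops at the shortest input, so count() never runs away.
numbered = list(zip(itertools.count(start=1), fruits))
print(numbered)

# islice() takes a fixed number of values from the infinite counter.
first_five = list(itertools.islice(itertools.count(5, 15), 5))
print(first_five)
```

Here numbered pairs each fruit with 1, 2, 3, and first_five holds the same five values shown in Example 2 above.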
- itertools.cycle(iterable): Makes an infinite iterator that returns individual elements from the iterable and saves a copy of each. When the iterable is exhausted, it returns elements from the saved copy, repeating indefinitely unless terminated. Note that this particular itertool (cycle) only accepts one argument: the iterable.
You can, however, improvise a count variable, as seen in the example below. This should help keep track of the number of cycles. Here, a complete cycle is when all elements in the iterable have been returned (usually the same as the length of the iterable).
- itertools.repeat(object, [times]): Makes an iterator that returns the whole object passed as an argument over and over. It runs indefinitely unless the times argument is specified.
itertools.cycle and itertools.repeat can be confusing, but below is a detailed look at the difference between the two itertools in use, which should help a lot.
import itertools
# Itertools.cycle
foo = ["Leah", 1, 2, 3]
# Improvised counter. An increment in value does not reflect a complete cycle.
count = 0
# Example 1
for i in itertools.cycle(foo):
    if count == 3: # Intentional break before a complete cycle (return of all elements in the iterable [list])
        break
    else:
        print(i)
        count += 1
# Output
Leah
1
2
# Example 2
foo = "Dont Fight The Night!"
count = len(foo) # Length of the iterable foo (21 characters, spaces included)...a complete cycle.
for i in itertools.cycle(foo):
    if count > 0:
        print(i)
        count -= 1
    else:
        break
# Output (the three spaces print as blank lines, not shown here)
D
o
n
t
F
i
g
h
t
T
h
e
N
i
g
h
t
!
import itertools
# Itertools.repeat
foo = ["Leah", 1, 2, 3]
for i in itertools.repeat(foo, 4):
    print(i)
# Output
['Leah', 1, 2, 3]
['Leah', 1, 2, 3]
['Leah', 1, 2, 3]
['Leah', 1, 2, 3]
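The contrast between the two can also be sketched without a manual counter. This is only an illustration under the assumption that a fixed number of items is wanted: islice() cuts cycle() off after two full cycles, while repeat() returns the same object each time instead of walking through its elements.

```python
import itertools

foo = ["Leah", 1, 2, 3]

# Take exactly two full cycles (2 * len(foo) items) without a manual counter.
two_cycles = list(itertools.islice(itertools.cycle(foo), 2 * len(foo)))
print(two_cycles)

# repeat() yields the same object each time, not its individual elements.
repeated = list(itertools.repeat("ab", 3))
print(repeated)
```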
Iterators that terminate on the shortest input sequence
- itertools.accumulate(iterable, [func=operator.add], *, [initial=None]): Makes an iterator that returns accumulated sums (by default) or accumulated results of other binary functions (specified via the optional func argument). If func is supplied, it should be a function of two arguments. Elements of the input iterable may be any type that can be accepted as arguments to func. (For example, with the default operation of addition, elements may be any addable type, including Decimal, String, or Fraction.)
Usually, the number of elements output matches the input iterable. However, if the keyword argument initial is provided, the accumulation leads off with the initial value, so the output has one more element than the input iterable. Note that in the absence of an initial argument, the first item in the iterable is returned without any action done to it. If initial is provided, it is returned untouched first and then combined with each item in turn using func (which defaults to addition if none is provided).
import itertools, operator
foo = [2,3,4,5,6]
# Example 1 using the built-in operator module
for i in itertools.accumulate(foo, operator.mul, initial=10):
    print(i)
# Output
10
20
60
240
1200
7200
# Example 2 passing a function
for i in itertools.accumulate(foo, lambda x, y: int((x * y)/2)):
    print(i)
# Output
2
3
6
15
45
# Example 3 convert iterable to list and return last element
foo = ["Miss.", "Pamela"]
print(list(itertools.accumulate(foo, lambda x, y: x + " " + y, initial="Hello"))[-1])
# Output
Hello Miss. Pamela
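Two everyday uses of accumulate are a running total and a running maximum. The sketch below uses made-up numbers (the deposits list is hypothetical) and passes the built-in max as func for the second case.

```python
import itertools

deposits = [100, 50, 25, 200]  # hypothetical data

# Default func is addition, which gives a running total.
running_total = list(itertools.accumulate(deposits))
print(running_total)

# Any two-argument function works; max gives the largest value seen so far.
running_max = list(itertools.accumulate(deposits, max))
print(running_max)
```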
- itertools.chain(*iterables): Makes an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable until all of the iterables are exhausted. Used for treating consecutive sequences as a single sequence, which means you can perform the same action on items in totally different iterables.
A shorter explanation is that this tool combines items in different iterables into a single iterable.
- itertools.chain.from_iterable(iterable): This itertool is very similar to itertools.chain. The difference, however, is that whilst itertools.chain accepts multiple iterables as arguments and combines them (it does not care about the data type of the items in each iterable), itertools.chain.from_iterable accepts a single iterable whose items must themselves be iterables. It then combines those items into a single iterable.
A shorter explanation is that this itertool demands the items to be iterable objects. It then traverses the items one after the other, returning everything in the first item before moving to the next.
Here are some code snippets showing the difference between these two itertools:
import itertools
#itertools.chain
foo = [2,4,6,8]
boo = [3,10,7,14]
chained_iterable = itertools.chain(foo, boo)
# Example 1. - itertools.chain can behave like adding iterables
print(list(chained_iterable))
# Output
[2, 4, 6, 8, 3, 10, 7, 14]
# Example 2. - taking mutual action on multiple iterable items
# list() above exhausted the iterator, so build a fresh chain first.
chained_iterable = itertools.chain(foo, boo)
for i in chained_iterable:
    if not i%2:
        print(i)
    else:
        continue
# Output
2
4
6
8
10
14
import itertools
#itertools.chain.from_iterable
# Example 1. - This will throw an error because the item(s) (int) are not iterables
foo = [1,2,3,4, "Joan"]
boo = itertools.chain.from_iterable(foo)
print(list(boo))
# Output
TypeError: 'int' object is not iterable
# Example 2. - Every item is itself an iterable (a list, a set and a string)
foo = [[1,2,3], {4,5,6}, "Nkechi"]
boo = itertools.chain.from_iterable(foo)
print(list(boo))
# Output
[1, 2, 3, 4, 5, 6, 'N', 'k', 'e', 'c', 'h', 'i']
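One way to see how the two relate: unpacking a nested list into chain() gives the same result as passing it whole to chain.from_iterable(). The second form also accepts a lazy source such as a generator, as this small sketch (with illustrative names) shows.

```python
import itertools

nested = [[1, 2], [3], [4, 5, 6]]

# Unpacking with * hands each inner list to chain() as a separate argument.
via_chain = list(itertools.chain(*nested))

# from_iterable() takes the outer iterable as-is and walks it lazily.
via_from_iterable = list(itertools.chain.from_iterable(nested))
print(via_chain, via_from_iterable)

# A generator of iterables works too, without building a list first.
ranges = (range(n) for n in range(1, 4))
flattened = list(itertools.chain.from_iterable(ranges))
print(flattened)
```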
- itertools.compress(data, selectors): Makes an iterator that filters elements from data, returning only those that have a corresponding element in selectors that evaluates to True. Stops when either the data or selectors iterables have been exhausted. This acts like a zip function, where items in the data argument map to items in the selectors argument. However, items are only returned if the corresponding items in selectors are truthy.
import itertools
# Example 1. - Using True or False keyword
foo = [1,2,3,4,5,6,7,8,9]
boo = [i % 2 == 0 for i in foo] # True for even numbers, False for odd
x = itertools.compress(foo, boo)
print(list(x))
# Output
[2, 4, 6, 8]
# Example 2. - Using truthy and falsy values alongside the True and False keywords
foo = itertools.compress(["A","B","C","D"], [1,0,True,False])
print(list(foo))
# Output
['A', 'C']
- itertools.dropwhile(predicate, iterable): With this itertool, the predicate acts as a filter and is often an anonymous function (lambda), while the iterable is iterated upon. As long as the predicate evaluates to True, items in the iterable are dropped (skipped). Once the predicate evaluates to False for an item, that item and every item following it are returned. Note that the iterator does not produce any output until the predicate first becomes False (this means it may have a lengthy start-up time).
import itertools
foo = ["Luthor", "Clark", "Rose", "Matty", "Zord"]
boo = itertools.dropwhile(lambda x: x!="Rose", foo)
print(list(boo))
# Output
['Rose', 'Matty', 'Zord']
- itertools.filterfalse(predicate, iterable): Makes an iterator that filters elements from the iterable, returning only those for which the predicate is False. If the predicate is None, it returns the items that are themselves false.
The difference between itertools.dropwhile and itertools.filterfalse is this: for the former, once the predicate is False for an item, that item is returned along with every item after it (no further check is done once this is the case). The latter returns only the items for which the predicate is False (filtering continues for the rest of the items even after the predicate is False for one item).
import itertools
# Example 1. - Returns only the elements for which the predicate gives a falsy value.
# Here x%2 is 0 (falsy) for even numbers and 1 (truthy) for odd numbers.
foo = [1,2,3,4,5,6,7,8,9]
boo = itertools.filterfalse(lambda x: x%2, foo)
print(list(boo))
# Output
[2, 4, 6, 8]
# Example 2. If the predicate is None, returns the items that are themselves falsy.
foo = [1,0,1,0,1]
boo = itertools.filterfalse(None, foo)
print(list(boo))
# Output
[0, 0]
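Running both itertools over the same data with the same predicate may make the difference easier to see. In this sketch (is_odd and numbers are illustrative names), dropwhile stops checking after the first even number, while filterfalse keeps checking every item.

```python
import itertools

def is_odd(n):
    """Illustrative predicate: True for odd numbers."""
    return n % 2 == 1

numbers = [1, 3, 4, 5, 6, 7]

# dropwhile skips the leading odd numbers, then returns everything else unchecked.
after_drop = list(itertools.dropwhile(is_odd, numbers))
print(after_drop)

# filterfalse checks every item and keeps only those where is_odd is False.
after_filter = list(itertools.filterfalse(is_odd, numbers))
print(after_filter)
```

Note that 5 and 7 survive dropwhile even though they are odd, because they come after 4.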
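- itertools.groupby(iterable, key=None): Makes an iterator that returns consecutive keys and groups from the iterable. The key argument (called key_func in the examples below) is a function that computes a key for each element. Where it is not specified or is None, key defaults to an identity function and returns the elements unchanged. Note that groupby only groups consecutive items, so the iterable generally needs to be sorted on the same key beforehand.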
import itertools
# Example 1. - Iterable that is not sorted
foo = [("Mujikeni INC", "DevOps"),("Thelms Tech", "DesignOps"),("Mujikeni INC", "DevRel"),("Thelms Tech", "Product Management")]
boo = itertools.groupby(foo, lambda x: x[0]) # The key is the first element of each tuple.
for k,g in boo:
    print({k: list(g)})
# Output
{'Mujikeni INC': [('Mujikeni INC', 'DevOps')]}
{'Thelms Tech': [('Thelms Tech', 'DesignOps')]}
{'Mujikeni INC': [('Mujikeni INC', 'DevRel')]}
{'Thelms Tech': [('Thelms Tech', 'Product Management')]}
# Example 2. - Sorted iterable
foo = [("Mujikeni INC", "DevOps"),("Thelms Tech", "DesignOps"),("Mujikeni INC", "DevRel"),("Thelms Tech", "Product Management")]
boo = itertools.groupby(sorted(foo), lambda x: x[0]) # Sorts the iterable and provides key_func
for k,g in boo:
    print({k: list(g)})
# Output
{'Mujikeni INC': [('Mujikeni INC', 'DevOps'), ('Mujikeni INC', 'DevRel')]}
{'Thelms Tech': [('Thelms Tech', 'DesignOps'), ('Thelms Tech', 'Product Management')]}
# Example 3. Key_func not provided
foo = [("Mujikeni INC", "DevOps"),("Thelms Tech", "DesignOps"),("Mujikeni INC", "DevRel"),("Thelms Tech", "Product Management")]
boo = itertools.groupby(sorted(foo))
for k,g in boo:
    print({k: list(g)})
# Output
{('Mujikeni INC', 'DevOps'): [('Mujikeni INC', 'DevOps')]}
{('Mujikeni INC', 'DevRel'): [('Mujikeni INC', 'DevRel')]}
{('Thelms Tech', 'DesignOps'): [('Thelms Tech', 'DesignOps')]}
{('Thelms Tech', 'Product Management'): [('Thelms Tech', 'Product Management')]}
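A common pattern is to sort and group with the same key function, then collect the groups into a dictionary. The sketch below is one way to do that. It uses operator.itemgetter in place of the lambda, and the names roles and by_company are only illustrative.

```python
import itertools
from operator import itemgetter

roles = [
    ("Mujikeni INC", "DevOps"),
    ("Thelms Tech", "DesignOps"),
    ("Mujikeni INC", "DevRel"),
    ("Thelms Tech", "Product Management"),
]

# Use the same key for sorting and grouping so equal keys end up adjacent.
by_company = itemgetter(0)
grouped = {
    company: [role for _, role in group]
    for company, group in itertools.groupby(sorted(roles, key=by_company), key=by_company)
}
print(grouped)
```

Because sorted() is stable, the roles inside each company keep their original relative order.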
- itertools.islice(iterable, stop) or itertools.islice(iterable, start, stop, [step]): Behaves much like Python's built-in slicing. However, this itertool does not support negative values for start, stop, or step. Note that start and stop are index values, and that a single positional argument after the iterable is treated as stop.
import itertools
foo = "Lewandowski"
# Example 1. - Only stop is given, and it is None (no upper bound)
boo = itertools.islice(foo, None)
print(list(boo))
# Output
['L', 'e', 'w', 'a', 'n', 'd', 'o', 'w', 's', 'k', 'i']
# Example 2.
boo = itertools.islice(foo, 0, None, 2)
print(list(boo))
# Output
['L', 'w', 'n', 'o', 's', 'i']
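Where islice really earns its place is on iterators that cannot be sliced with square brackets, such as generators. The squares() generator below is a hypothetical example written for this sketch.

```python
import itertools

def squares():
    """Hypothetical infinite generator of square numbers: 0, 1, 4, 9, ..."""
    n = 0
    while True:
        yield n * n
        n += 1

# squares()[2:6] would raise a TypeError, but islice can window the generator.
window = list(itertools.islice(squares(), 2, 6))
print(window)
```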
- itertools.pairwise(iterable): Returns successive overlapping pairs taken from the input iterable (available from Python 3.10). Note that the number of 2-tuples in the output iterator will be one fewer than the number of inputs, as the example below shows. The iterator will be empty if the input iterable has fewer than two values.
import itertools
foo = [1,2,3,4,5,6,7,8,9] # Nine items
boo = itertools.pairwise(foo)
print(list(boo))
# Output
# Eight 2-tuples
[(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9)]
foo = [1] # The iterable has fewer than two values
boo = itertools.pairwise(foo)
print(list(boo))
# Output
[]
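A typical use of pairwise is comparing each value with the one before it. The sketch below computes the step between consecutive readings (made-up data). Since pairwise only exists from Python 3.10, it includes a small fallback built from tee and zip, which is an assumption about how you might support older versions rather than part of the original examples.

```python
import itertools

try:
    pairwise = itertools.pairwise  # available from Python 3.10
except AttributeError:
    def pairwise(iterable):
        """Fallback sketch for older Pythons, built from tee and zip."""
        a, b = itertools.tee(iterable)
        next(b, None)  # advance the second copy by one item
        return zip(a, b)

readings = [3, 7, 12, 20]  # hypothetical data

# Difference between each reading and the one before it.
steps = [later - earlier for earlier, later in pairwise(readings)]
print(steps)
```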
- itertools.starmap(function, iterable): Makes an iterator that computes the function using arguments obtained from the iterable. Used instead of map() when argument parameters are already grouped in tuples from a single iterable.
The difference between map() and starmap() is that map() passes each item (a tuple) in the list to the function as a single argument, whereas starmap() unpacks each item (a tuple) and uses the unpacked values as the arguments for the function.
Below is an example of the difference between map and itertools.starmap:
import itertools
foo = [(2,4), (3,5), (4,6), (5,7), (6,8)]
# Example 1. Using itertools.starmap
boo = itertools.starmap(lambda x, y: x**y, foo)
print(list(boo))
# Output
[16, 243, 4096, 78125, 1679616]
# Example 2. Using map
boo = map(lambda x,y: x**y, foo)
print(list(boo)) # At this point, it is unable to unpack the individual tuples
# Output
TypeError: <lambda>() missing 1 required positional argument: 'y'
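Put differently, starmap(func, pairs) behaves like calling func(*pair) for each pair. The sketch below shows three equivalent spellings using the built-in pow. map() works too, but only when the arguments arrive as separate iterables.

```python
import itertools

pairs = [(2, 4), (3, 5), (4, 6)]

# starmap unpacks each tuple into positional arguments.
with_starmap = list(itertools.starmap(pow, pairs))

# The same thing written by hand with * unpacking.
with_unpacking = [pow(*pair) for pair in pairs]

# map() expects the arguments as separate, parallel iterables.
with_map = list(map(pow, [2, 3, 4], [4, 5, 6]))

print(with_starmap, with_unpacking, with_map)
```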
- itertools.takewhile(predicate, iterable): Very similar to itertools.dropwhile, yet opposite. This itertool returns elements for as long as the predicate evaluates to True and stops at the first element for which it evaluates to False. For this itertool to return a value, the first element in the iterable must be such that the predicate evaluates to True.
However, if the predicate evaluates to False for the first item in the iterable, an empty iterator will be returned. Only the leading run of consecutive elements for which the predicate evaluates to True is returned. Any later elements for which the predicate is True are ignored once an element for which it is False has been reached.
from itertools import takewhile
# Example 1. - Non-consecutive elements for which the predicate evaluates to True.
foo = [5,6,5,7,8,2,9,2,3,4,1,2,4,5]
boo = takewhile(lambda x: x==5, foo)
print(list(boo))
# Output
[5] # The other 5s are not returned because they come after the first element that fails the predicate.
# Example 2. - Consecutive and non-consecutive elements for which the predicate evaluates to True.
foo = [1,4,5,5,4,6,7,8,9]
boo = takewhile(lambda x: x<=4, foo)
print(list(boo))
# Output
[1, 4]
# Example 3. - The first element in the iterable makes the predicate evaluate to False.
foo = [2,4,4,5,6,7]
boo = takewhile(lambda x: x==4, foo)
print(list(boo))
# Output
[]
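Since takewhile and dropwhile are opposites, using the same predicate with both splits an iterable into a head and a tail at the first failing element. A minimal sketch, with at_most_four as an illustrative predicate name:

```python
import itertools

def at_most_four(x):
    """Illustrative predicate: True while x is 4 or less."""
    return x <= 4

foo = [1, 4, 5, 5, 4, 6, 7, 8, 9]

# takewhile keeps the leading run; dropwhile keeps everything after it.
head = list(itertools.takewhile(at_most_four, foo))
tail = list(itertools.dropwhile(at_most_four, foo))
print(head, tail)
```

Joined back together, head + tail gives the original list.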
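- itertools.tee(iterable, n=2): Returns n independent iterators (n defaults to 2 if not specified) from a single iterable. If the input is itself an iterator, it should not be used anywhere else once tee has been called, otherwise the returned iterators may miss values.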
from itertools import tee
# Example 1. - n is not specified
foo = [2,4,6,8]
boo = tee(foo)
for i in boo:
    print(list(i))
# Output
[2, 4, 6, 8]
[2, 4, 6, 8]
# Example 2. - n is specified
boo = tee(foo, 4)
for i in boo:
    print(list(i))
# Output
[2, 4, 6, 8]
[2, 4, 6, 8]
[2, 4, 6, 8]
[2, 4, 6, 8]
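To see what "independent" means in practice, the sketch below (with illustrative names) advances one of the returned iterators part of the way and shows that the other still starts from the beginning.

```python
import itertools

source = iter([2, 4, 6, 8])
left, right = itertools.tee(source)

# Advancing one copy does not move the other.
first_two_left = [next(left), next(left)]
all_right = list(right)
rest_left = list(left)

print(first_two_left, all_right, rest_left)
```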
- itertools.zip_longest(*iterables, fillvalue=None): Very similar to zip, but this itertool continues to pack (group) elements from the iterables together regardless of any difference in their lengths. Once a shorter iterable is exhausted, its missing values are replaced with fillvalue (defaults to None) until the longest iterable is exhausted.
import itertools
foo = [1,3,5]
boo = [2,4]
# Example 1. - fillvalue not provided
x = itertools.zip_longest(foo, boo)
print(list(x))
# Output
[(1, 2), (3, 4), (5, None)]
# Example 2. - fillvalue provided
x = itertools.zip_longest(foo, boo, fillvalue=6)
print(list(x))
# Output
[(1, 2), (3, 4), (5, 6)]
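For comparison, here is a small sketch of the built-in zip next to zip_longest on the same inputs (the names and scores are made up). zip silently drops the unmatched item, while zip_longest keeps it and fills the gap.

```python
import itertools

names = ["Ada", "Grace", "Linus"]  # hypothetical data
scores = [90, 85]

# zip() stops at the shortest input, so "Linus" is dropped.
with_zip = list(zip(names, scores))

# zip_longest() keeps going and fills the missing score.
with_zip_longest = list(itertools.zip_longest(names, scores, fillvalue=0))

print(with_zip)
print(with_zip_longest)
```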
Combinatoric Iterators
- itertools.product(*iterables, repeat=1): This itertool behaves much like nested for loops. It returns the cartesian product of the provided iterables, with the whole sequence of iterables repeated the number of times specified by the optional keyword argument repeat. For example, itertools.product(arr1, arr2, repeat=2) means the same as itertools.product(arr1, arr2, arr1, arr2).
import itertools
foo = [1,2,3]
boo = [4,5,6]
x = itertools.product(foo,boo, repeat=2)
print(list(x))
# Output
[(1, 4, 1, 4), (1, 4, 1, 5), (1, 4, 1, 6), (1, 4, 2, 4), (1, 4, 2, 5), (1, 4, 2, 6), (1, 4, 3, 4), (1, 4, 3, 5), (1, 4, 3, 6), (1, 5, 1, 4), (1, 5, 1, 5), (1, 5, 1, 6), (1, 5, 2, 4), (1, 5, 2, 5), (1, 5, 2, 6), (1, 5, 3, 4), (1, 5, 3, 5), (1, 5, 3, 6), (1, 6, 1, 4), (1, 6, 1, 5), (1, 6, 1, 6), (1, 6, 2, 4), (1, 6, 2, 5), (1, 6, 2, 6), (1, 6, 3, 4), (1, 6, 3, 5), (1, 6, 3, 6), (2, 4, 1, 4), (2, 4, 1, 5), (2, 4, 1, 6), (2, 4, 2, 4), (2, 4, 2, 5), (2, 4, 2, 6), (2, 4, 3, 4), (2, 4, 3, 5), (2, 4, 3, 6), (2, 5, 1, 4), (2, 5, 1, 5), (2, 5, 1, 6), (2, 5, 2, 4), (2, 5, 2, 5), (2, 5, 2, 6), (2, 5, 3, 4), (2, 5, 3, 5), (2, 5, 3, 6), (2, 6, 1, 4), (2, 6, 1, 5), (2, 6, 1, 6), (2, 6, 2, 4), (2, 6, 2, 5), (2, 6, 2, 6), (2, 6, 3, 4), (2, 6, 3, 5), (2, 6, 3, 6), (3, 4, 1, 4), (3, 4, 1, 5), (3, 4, 1, 6), (3, 4, 2, 4), (3, 4, 2, 5), (3, 4, 2, 6), (3, 4, 3, 4), (3, 4, 3, 5), (3, 4, 3, 6), (3, 5, 1, 4), (3, 5, 1, 5), (3, 5, 1, 6), (3, 5, 2, 4), (3, 5, 2, 5), (3, 5, 2, 6), (3, 5, 3, 4), (3, 5, 3, 5), (3, 5, 3, 6), (3, 6, 1, 4), (3, 6, 1, 5), (3, 6, 1, 6), (3, 6, 2, 4), (3, 6, 2, 5), (3, 6, 2, 6), (3, 6, 3, 4), (3, 6, 3, 5), (3, 6, 3, 6)]
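The nested-loop comparison can be checked directly. This sketch builds the same pairs with two for loops and with product, then confirms the repeat=2 equivalence described above.

```python
import itertools

foo = [1, 2, 3]
boo = [4, 5, 6]

# Two nested loops written as a list comprehension...
nested = [(a, b) for a in foo for b in boo]

# ...produce the same tuples, in the same order, as product().
via_product = list(itertools.product(foo, boo))
print(nested == via_product)

# repeat=2 is shorthand for listing the iterables twice.
with_repeat = list(itertools.product(foo, boo, repeat=2))
spelled_out = list(itertools.product(foo, boo, foo, boo))
print(len(with_repeat), with_repeat == spelled_out)
```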
- itertools.permutations(iterable, r=None): Returns successive r length permutations of elements in the iterable. If r is not specified or is None, then r defaults to the length of the iterable and all possible full-length permutations are generated.
The permutation tuples are emitted in lexicographic order according to the order of the input iterable. So, if the input iterable is sorted, the output tuples will be produced in sorted order.
Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeated values within a permutation.
- itertools.combinations(iterable, r): Returns r length subsequences of elements from the input iterable.
The combination tuples are emitted in lexicographic ordering according to the order of the input iterable. So, if the input iterable is sorted, the output tuples will be produced in sorted order. Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeated values in each combination.
A visual difference between the two itertools is that, in permutation, the iterable "ABCD" ( itertools.permutations("ABCD", 2) ) would evaluate to AB AC AD BA BC BD CA CB CD DA DB DC. Whereas, in combination, the same iterable "ABCD" ( itertools.combinations("ABCD", 2) ) would evaluate to AB AC AD BC BD CD (here, AB and BA are considered the same combination and therefore not repeated).
itertools.permutations is the arrangement of items of an iterable in groups of r length. Multiple groups may contain the same items as long as the items' positions in the groups differ (AB != BA) or the items are repeated in the iterable.
itertools.combinations is the selection of items of an iterable into groups of r length. No two groups contain the same items in a different order (AB == BA), unless the items are repeated in the iterable.
# itertools.permutations
import itertools
foo = [1,2,3,4]
# Example 1. - r is not provided, so it defaults to the length of the iterable.
boo = itertools.permutations(foo)
print(list(boo))
# Output
[(1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 2, 4), (1, 3, 4, 2), (1, 4, 2, 3), (1, 4, 3, 2), (2, 1, 3, 4), (2, 1, 4, 3), (2, 3, 1, 4), (2, 3, 4, 1), (2, 4, 1, 3), (2, 4, 3, 1), (3, 1, 2, 4), (3, 1, 4, 2), (3, 2, 1, 4), (3, 2, 4, 1), (3, 4, 1, 2), (3, 4, 2, 1), (4, 1, 2, 3), (4, 1, 3, 2), (4, 2, 1, 3), (4, 2, 3, 1), (4, 3, 1, 2), (4, 3, 2, 1)]
# Example 2. - r is provided
boo = itertools.permutations(foo, 2)
print(list(boo))
# Output
[(1, 2), (1, 3), (1, 4), (2, 1), (2, 3), (2, 4), (3, 1), (3, 2), (3, 4), (4, 1), (4, 2), (4, 3)]
# Example 3.
foo = "ABCD"
boo = itertools.permutations(foo, 2)
print(list(boo))
# Output
[('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'A'), ('B', 'C'), ('B', 'D'), ('C', 'A'), ('C', 'B'), ('C', 'D'), ('D', 'A'), ('D', 'B'), ('D', 'C')]
# itertools.combinations
import itertools
boo = itertools.combinations("ABCD", 2)
print(list(boo))
# Output
[('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
- itertools.combinations_with_replacement(iterable, r): This itertool is very similar to itertools.combinations. However, this itertool allows individual elements to be repeated more than once.
import itertools
boo = itertools.combinations_with_replacement("ABCD", 2)
print(list(boo))
# Output
[('A', 'A'), ('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'B'), ('B', 'C'), ('B', 'D'), ('C', 'C'), ('C', 'D'), ('D', 'D')]
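As a sanity check on the three combinatoric itertools, their output sizes can be compared with the standard counting formulas. The sketch below assumes Python 3.8 or later, since it relies on math.perm and math.comb.

```python
import itertools
import math

letters = "ABCD"
n = len(letters)
r = 2

perms = list(itertools.permutations(letters, r))
combs = list(itertools.combinations(letters, r))
combs_wr = list(itertools.combinations_with_replacement(letters, r))

# Order matters for permutations: n! / (n - r)!
print(len(perms), math.perm(n, r))

# Order does not matter for combinations: n! / (r! * (n - r)!)
print(len(combs), math.comb(n, r))

# Allowing repeats: (n + r - 1)! / (r! * (n - 1)!)
print(len(combs_wr), math.comb(n + r - 1, r))
```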
Hey, thank you for reading through! Writing is one way to consolidate what you know. I’d love to receive feedback or suggestions on how this article can be improved. I also challenge you to start writing or contributing to topics that you are passionate about. If this article was helpful, kindly leave a clap, follow and subscribe to get informed of future articles.
Yours’
Sam, The Empathetic Dev 💜