Itertools

by Alex

Python's built-in functions and standard library let you create and manipulate sequences of objects. With a simple loop you can fill a list with content, and with list comprehensions you can set more complex conditions for generating it. The itertools module extends this functionality.

What is itertools?

This module is a collection of useful iterators that make working with loops and sequence generators more efficient. The gains come from lower memory use (items are produced one at a time instead of being stored all at once), fast built-in implementations, and shorter, simpler code. The ready-made functions in this library accept various parameters that control the generated sequence, so that the calling code receives exactly the set of objects it needs. This article discusses the itertools module in Python 3, although it is also available in Python 2.

To use its features, you must import the module or the specific functions you need. For example, to call the product function, place the following instruction at the beginning of the file: from itertools import product. You can then refer to the function by its name. If you want to use several functions, list their names separated by commas. You can also import the whole module by writing import itertools at the beginning of the program. In that case you refer to the same product function as follows: itertools.product( [function arguments] ).
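
The two import styles described above can be compared side by side. This is only a small sketch; the variable names direct and qualified are chosen for illustration.

```python
# Style 1: import only the names you need, then call them directly.
from itertools import permutations, product

# Style 2: import the whole module and qualify every call.
import itertools

direct = list(product('AB', repeat=2))
qualified = list(itertools.product('AB', repeat=2))

print(direct)               # [('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]
print(direct == qualified)  # True: both names refer to the same function
print(len(list(permutations('AB'))))  # 2
```

Both styles call exactly the same function; the choice only affects how the name is written in your code.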

Infinite iteration

Three of the itertools functions produce iterators that never stop on their own:

  • count;
  • cycle;
  • repeat.

These functions can be used to generate objects or perform certain actions an unlimited number of times. This means that the programmer has to stop the resulting loop explicitly.

count

This function creates an evenly spaced sequence of numbers from up to two parameters. The first argument is the starting value (0 by default), and the second is the constant step (1 by default). The following example shows how it works in a small loop.

from itertools import count
for i in count(0, 2):
    if i >= 10:
        break 
    else:
        print(i)

0
2
4
6
8

As you can see from the results of the program, the for loop works with the function count, which gets the starting value 0 and the step 2. The variable named i temporarily stores each new number. The body of the loop uses an if statement that stops the generator at a value of 10. If i is greater than or equal to 10 in the current iteration, the loop is interrupted with break. Otherwise the value is printed using the print function.
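
An explicit break is not the only way to stop count. As a small sketch, the endless counter can be paired with a finite sequence through the built-in zip, which stops at the shorter input. The list names is made up for this illustration.

```python
from itertools import count

names = ['D', 'O', 'G']  # illustrative data

# count(1) starts at 1 and uses the default step of 1.
# zip stops as soon as names is exhausted, so no break is needed.
numbered = list(zip(count(1), names))
print(numbered)  # [(1, 'D'), (2, 'O'), (3, 'G')]
```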

cycle

This iterator creates an endless loop that yields the elements of a sequence over and over again. Its argument is any iterable object, such as a string or a list. The code below shows how the cycle function works with the DOG string in a for loop.

from itertools import cycle
count = 1
for i in cycle('DOG'):
    if count > 5:
        break
    print(i)
    count += 1

D
O
G
D
O

Thus, the program prints the characters of the string passed to cycle one after another, starting again from the beginning once the string is exhausted. Since this iterator also has no built-in limit on the number of items, a counter is used to stop it: the variable count increases by 1 on every step, and the loop is interrupted once it exceeds 5.
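
A typical use of cycle is round-robin assignment. The sketch below pairs a finite list with an endlessly repeating one; the task and worker names are hypothetical and serve only as sample data.

```python
from itertools import cycle

tasks = ['t1', 't2', 't3', 't4', 't5']  # hypothetical task names
workers = ['A', 'B']                     # hypothetical worker labels

# zip ends when tasks runs out, so the endless cycle is cut off automatically.
assignment = list(zip(tasks, cycle(workers)))
print(assignment)
# [('t1', 'A'), ('t2', 'B'), ('t3', 'A'), ('t4', 'B'), ('t5', 'A')]
```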

repeat

The last of these iterators repeats the object passed as the first parameter. The optional second argument is the number of identical elements in the sequence; if it is omitted, the object is repeated endlessly. The following example shows how to populate a list called data using a list comprehension. The DOG string is added to the sequence exactly three times.

from itertools import repeat
data = [i for i in repeat('DOG', 3)]
print(data)

['DOG', 'DOG', 'DOG']

The result is shown using the print function, which receives the finished list data. The first parameter of the repeat function can be a string, a number, a list, or any other object.
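
When the second argument is left out, repeat never stops, which is handy for supplying a constant to map. A minimal sketch:

```python
from itertools import repeat

# repeat(2) yields 2 forever; map stops when range(5) is exhausted.
squares = list(map(pow, range(5), repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```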

Combination of values

Four of the functions combine the elements of their input in different ways:

  • combinations;
  • combinations_with_replacement;
  • permutations;
  • product.

combinations

The first combinatorial function takes two arguments. The first is the iterable whose elements are combined, and the second is the number of values in each resulting group. This example demonstrates how the combinations function of the itertools module creates a list.

from itertools import combinations
data = list(combinations('DOG', 2))
print(data)

[('D', 'O'), ('D', 'G'), ('O', 'G')]

As you can see from the code, the function gets the string DOG, which is split into individual characters. The characters are then grouped in pairs so that no pair repeats an earlier one, regardless of order. The print function prints the resulting list, which contains all pairs of the characters D, O, G.
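
The number of groups produced by combinations matches the binomial coefficient "n choose r". The sketch below checks this with math.comb, which requires Python 3.8 or later; the string ABCD is just sample input.

```python
from itertools import combinations
from math import comb  # available in Python 3.8+

letters = 'ABCD'  # sample input
pairs = list(combinations(letters, 2))

print(pairs)
# [('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
print(len(pairs) == comb(len(letters), 2))  # True: 6 pairs from 4 letters
```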

combinations_with_replacement

This variation of the previous iterator also allows an element to be combined with itself. The following sample code shows the use of combinations_with_replacement with the same arguments.

from itertools import combinations_with_replacement
for i in combinations_with_replacement('DOG', 2):
    print(''.join(i))

DD
DO
DG
OO
OG
GG

As a result, the program prints every group of two elements in which order does not matter. Unlike combinations, a group may contain the same element twice (DD, OO, GG).

permutations

The permutations function of the itertools module works similarly to combinations, but the order of the elements matters: DO and OD are treated as different groups. An element of the input cannot be used twice within the same group. Below is code demonstrating the behavior and result of this function in a for loop.

from itertools import permutations
for i in permutations('DOG', 2):
    print(''.join(i))

DO
DG
OD
OG
GD
GO

The program outputs pairs of values, since the function received 2 as the second argument. Note that some groups consist of the same values and differ only in their order, for example DO and OD.
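
If the second argument is omitted, permutations uses the full length of the input, so a string of n distinct characters gives n! orderings. A short sketch:

```python
from itertools import permutations
from math import factorial

# No second argument: every permutation has the full length of 'DOG'.
full = [''.join(p) for p in permutations('DOG')]
print(full)  # ['DOG', 'DGO', 'ODG', 'OGD', 'GDO', 'GOD']
print(len(full) == factorial(3))  # True
```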

product

The last of the combinatorial iterators receives one or more iterables as separate arguments. The product function of the itertools module returns their Cartesian product: every possible tuple that takes one element from each input. The following example shows how it is used.

from itertools import product
data = list(product((0, 1), (2, 3)))
print(data)

[(0, 2), (0, 3), (1, 2), (1, 3)]

This produces a new sequence that contains all possible combinations of one value from the first tuple with one value from the second. As in the other examples, the print function prints its contents.
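
One way to think about product is as a replacement for nested loops. The sketch below compares the two and also shows the repeat keyword argument, which combines a sequence with itself.

```python
from itertools import product

# Equivalent nested loops written as a list comprehension.
nested = [(x, y) for x in (0, 1) for y in (2, 3)]
same = list(product((0, 1), (2, 3))) == nested
print(same)  # True

# repeat=2 is the same as passing '01' twice.
bits = [''.join(p) for p in product('01', repeat=2)]
print(bits)  # ['00', '01', '10', '11']
```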

Sequence filtering

Filtering tools are also used to process the data in a list or any other sequence of values. Some of the functions in itertools drop individual items that do not satisfy the conditions you set. Four such iterators are covered here:

  • filterfalse;
  • dropwhile;
  • takewhile;
  • compress.

filterfalse

To create a new list from an existing sequence of objects, you can use the filterfalse function. The first argument is a test function that returns True or False. The second argument is the sequence you want to filter; only the items for which the test function returns False are kept.

from itertools import filterfalse
data = list(filterfalse(lambda i: i == 0, [1, 2, 3, 0, 4, 5, 1]))
print(data)

[1, 2, 3, 4, 5, 1]

As you can see from the example, the lambda function checks whether each element is equal to zero. The elements for which this check returns False are put into a new list and then printed.
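
filterfalse is the mirror image of the built-in filter: with the same test function, the two split a sequence into two parts. In this sketch the helper is_even is a made-up name used only for illustration.

```python
from itertools import filterfalse

numbers = [1, 2, 3, 0, 4, 5, 1]

def is_even(n):
    """Illustrative test function: True for even numbers."""
    return n % 2 == 0

evens = list(filter(is_even, numbers))       # items where the test is True
odds = list(filterfalse(is_even, numbers))   # items where the test is False
print(evens)  # [2, 0, 4]
print(odds)   # [1, 3, 5, 1]
```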

dropwhile

The next function follows the same scheme but works slightly differently. The dropwhile iterator calls the test function passed as the first parameter for each item and skips items while it returns True. As soon as it returns False, that item and everything after it are put into the new list.

from itertools import dropwhile
data = list(dropwhile(lambda i: i != 0, [1, 2, 3, 0, 4, 5, 1]))
print(data)

[0, 4, 5, 1]

This example uses a lambda function that checks whether an element is not equal to zero. Once 0 is found in the sequence, it and all subsequent values are stored in the new list.

takewhile

The takewhile iterator works in the opposite way: it puts into the list only the elements that come before the first one for which the test function returns False. The following example shows how this function works.

from itertools import takewhile
data = list(takewhile(lambda i: i != 0, [1, 2, 3, 0, 4, 5, 1]))
print(data)

[1, 2, 3]

As you can see, the resulting list contains the values that come before 0.
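
With the same test function, takewhile and dropwhile cut a sequence at the same point, so together they give back the original data. The sketch also shows that the test is no longer applied after the cut. The helper is_small is a made-up name.

```python
from itertools import dropwhile, takewhile

data = [1, 2, 3, 0, 4, 5, 1]

def is_small(n):
    """Illustrative test function: True for numbers below 3."""
    return n < 3

head = list(takewhile(is_small, data))
tail = list(dropwhile(is_small, data))
print(head)  # [1, 2]
print(tail)  # [3, 0, 4, 5, 1]  (0 and 1 stay, although they are below 3)
print(head + tail == data)  # True
```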

compress

Sometimes you need to remove items from a sequence simply by supplying a Boolean value for each of them. The compress function does this; in the following example it receives a string and a list of True and False values, one for each character.

from itertools import compress
data = list(compress('DOG', [True, False, True]))
print(data)

['D', 'G']

The result is a list that contains only the items marked as True. The O was removed because its selector was False.
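
The selectors do not have to be written by hand; they can be computed from a second, parallel sequence. The names and ages below are hypothetical sample data.

```python
from itertools import compress

names = ['TOM', 'JERRY', 'SPIKE']  # hypothetical sample data
ages = [3, 1, 5]                   # hypothetical sample data

# Build one True/False selector per name from the parallel list of ages.
selectors = [age >= 3 for age in ages]
selected = list(compress(names, selectors))
print(selected)  # ['TOM', 'SPIKE']
```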

Other iterators

Some tools in the itertools module do not fit into any of the previous sections, but they are very useful for solving specific problems, especially when paired with other iterators. This section describes the following functions:

  • chain;
  • chain.from_iterable;
  • starmap;
  • accumulate;
  • islice;
  • zip_longest;
  • tee;
  • groupby.

chain

The chain function concatenates sequences, as shown in the following example for data1 and data2. The resulting list contains all elements of the first sequence followed by all elements of the second.

from itertools import chain
data1 = ['D', 'O', 'G']
data2 = [0, 1, 2, 3, 4]
data = list(chain(data1, data2))
print(data)

['D', 'O', 'G', 0, 1, 2, 3, 4]
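
chain is not limited to lists or to two arguments. As a small sketch, it can join any number of iterables of different types:

```python
from itertools import chain

# A string, a range and a tuple are joined into one sequence.
mixed = list(chain('DOG', range(3), ('x', 'y')))
print(mixed)  # ['D', 'O', 'G', 0, 1, 2, 'x', 'y']
```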

chain.from_iterable

This function works in the same way as chain and also concatenates lists. The difference is that it takes a single argument: an iterable, such as a nested list, that contains the lists to be merged.

from itertools import chain
data = [['D', 'O', 'G'], [0, 1, 2, 3, 4]]
data = list(chain.from_iterable(data))
print(data)

['D', 'O', 'G', 0, 1, 2, 3, 4]

starmap

The first argument is a function. The second argument is a list of argument tuples; each tuple is unpacked and passed to the function. As an example, we take the built-in pow function, which raises numbers to powers.

from itertools import starmap
for i in starmap(pow, [(1, 2), (2, 2), (3, 2)]):
    print(i)

1
4
9
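
To make the unpacking more concrete, the sketch below produces the same result in three ways: with starmap, with a list comprehension that unpacks each tuple, and with the built-in map fed by separate sequences.

```python
from itertools import starmap

pairs = [(1, 2), (2, 2), (3, 2)]

with_starmap = list(starmap(pow, pairs))
with_comprehension = [pow(*args) for args in pairs]

# map expects one sequence per parameter, so the pairs are split first.
bases, exponents = zip(*pairs)
with_map = list(map(pow, bases, exponents))

print(with_starmap)  # [1, 4, 9]
print(with_starmap == with_comprehension == with_map)  # True
```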

accumulate

The accumulate function of the itertools module returns running totals: each result is the sum of the current element and all the elements before it. Here is an example:

from itertools import accumulate
data = list(accumulate([1,2,3,4]))
print(data)

[1, 3, 6, 10]

You can see from the output that the first result is equal to the first value. The second is the sum of the previous result and the second value, and so on.
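
Addition is only the default. accumulate also accepts an optional second argument, a function of two arguments that replaces the sum. A brief sketch with a running maximum and a running product:

```python
from itertools import accumulate
from operator import mul

values = [3, 1, 4, 1, 5]  # sample input

running_max = list(accumulate(values, max))
running_product = list(accumulate(values, mul))
print(running_max)      # [3, 3, 4, 4, 5]
print(running_product)  # [3, 3, 12, 12, 60]
```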

islice

The islice iterator limits the number of items taken from another iterator; the desired number is given as a parameter. This example shows how count and islice work together to produce 5 numbers, starting from 0 in steps of 2.

from itertools import islice
from itertools import count
for i in islice(count(0, 2), 5):
    print(i)

0
2
4
6
8
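
Like an ordinary slice, islice also accepts start, stop and step values. A short sketch on a sample string:

```python
from itertools import islice

letters = 'ABCDEFGH'  # sample input

first_three = list(islice(letters, 3))            # stop only
middle = list(islice(letters, 2, 6))              # start, stop
every_third = list(islice(letters, 1, None, 3))   # start, no stop, step

print(first_three)  # ['A', 'B', 'C']
print(middle)       # ['C', 'D', 'E', 'F']
print(every_third)  # ['B', 'E', 'H']
```

Unlike regular slicing, islice does not support negative values for start, stop or step.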

zip_longest

The zip_longest function pairs up the elements of several sequences and, unlike the built-in zip, continues until the longest one is exhausted. The fillvalue parameter designates the object used in place of the missing elements of the shorter sequences.

from itertools import zip_longest
for i in zip_longest('DOG', [0, 1, 2, 3], fillvalue=' '):
    print(i)

('D', 0)
('O', 1)
('G', 2)
(' ', 3)
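
The difference from the built-in zip is easy to see when both are applied to the same inputs. In this sketch fillvalue is left out, so the default None is used:

```python
from itertools import zip_longest

letters = 'DOG'
numbers = [0, 1, 2, 3]

short = list(zip(letters, numbers))          # stops at the shorter input
full = list(zip_longest(letters, numbers))   # pads the shorter input

print(short)  # [('D', 0), ('O', 1), ('G', 2)]
print(full)   # [('D', 0), ('O', 1), ('G', 2), (None, 3)]
```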

tee

The tee function creates several independent iterators from a single iterable. This example creates two iterators, i1 and i2, over the same string.

from itertools import tee
data = 'DOG'
i1, i2 = tee(data)
for i in i1:
    print(i)
for i in i2:
    print(i)

D
O
G
D
O
G
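
Independent copies become useful when one of them is advanced ahead of the other. The following sketch pairs each item with its successor; adjacent_pairs is a hypothetical helper written for this illustration, not part of itertools.

```python
from itertools import tee

def adjacent_pairs(iterable):
    """Hypothetical helper: return (item, next_item) pairs of neighbours."""
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one position
    return zip(first, second)

pairs = list(adjacent_pairs([1, 2, 3, 4]))
print(pairs)  # [(1, 2), (2, 3), (3, 4)]
```

Python 3.10 and later provide itertools.pairwise for this particular task.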

groupby

The last function in this section is called groupby and is used to group consecutive items that share a common key. The code below shows formatted output of the list animals. As you can see in the example, the groupby function takes the list itself as its first argument and a key function, here a lambda, as its second.

from itertools import groupby
animals = [('CAT', 'TOM'), ('MOUSE', 'JARRY')]
for key, group in groupby(animals, lambda kind: kind[0]):
    for kind, name in group:
        print('{name} is a {kind}'.format(name = name, kind = kind))

TOM is a CAT
JARRY is a MOUSE
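
Note that groupby only merges neighbouring items with the same key, so unsorted data is usually sorted by that key first. The sketch below uses hypothetical sample data and a made-up helper kind_of to show the difference.

```python
from itertools import groupby

# Hypothetical sample data: the two cats are not next to each other.
animals = [('CAT', 'TOM'), ('MOUSE', 'JERRY'), ('CAT', 'FELIX')]

def kind_of(animal):
    """Illustrative key function: the first field of the tuple."""
    return animal[0]

# Without sorting, 'CAT' appears as two separate groups.
unsorted_keys = [key for key, _ in groupby(animals, kind_of)]
print(unsorted_keys)  # ['CAT', 'MOUSE', 'CAT']

# Sorting by the same key first brings equal keys together.
grouped = {
    key: [name for _, name in group]
    for key, group in groupby(sorted(animals, key=kind_of), kind_of)
}
print(grouped)  # {'CAT': ['TOM', 'FELIX'], 'MOUSE': ['JERRY']}
```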

Synopsis

So, this article has described the functions of the itertools module one by one. The following table gives a brief summary of all the functions covered and their purpose.

| Name | Purpose |
| --- | --- |
| count | Endless iteration with a given step |
| cycle | Endless repetition of the elements of a sequence |
| repeat | Repetition of an object a specified number of times, or endlessly if no number is given |
| combinations | All possible combinations of elements, without repeating an element within a group |
| combinations_with_replacement | All possible combinations of elements, with repeated elements allowed |
| permutations | All possible orderings of the elements in groups of a given length |
| product | All possible combinations of one value from each input sequence |
| filterfalse | All items for which the function returns False |
| dropwhile | All elements beginning with the first one for which the function returns False |
| takewhile | All elements until the function first returns False |
| compress | Only the elements whose selector is True |
| chain | Concatenation of several sequences, one after another |
| chain.from_iterable | Similar to chain, but the argument is a single iterable that contains the sequences to be merged |
| islice | A slice of an iterator, limited to the specified number of elements |
| zip_longest | Pairing of several sequences, padded to the length of the longest one |
| tee | Several independent iterators created from one iterable |
| groupby | Grouping of consecutive sequence elements by key values |
| accumulate | Each element of the result is the sum of the current and all previous elements of the original sequence |
| starmap | Calls the given function with each tuple of arguments from a list |

Conclusion

The itertools module contains many useful functions. They help you generate lists and other sequences of values under specific conditions. You can use them to iterate over sets of numbers, combine strings, and filter a sequence by some attribute. This article has described the functions of the itertools module that have long been part of the official Python 3 documentation; recent Python versions add a few more, such as pairwise and batched. For clarity, detailed examples of how to use the functions are given.
