Keyword yield in Python

by Alex
Python gives the programmer a large set of tools, one of which is yield. It can take the place of the usual return in a function and saves memory when processing large amounts of data. You never strictly have to use yield: everything you can implement with it can also be done with return. But this keyword lets you not only save memory, but also coordinate several sequences within a single loop.

What is yield and how it works

yield is a keyword that is used in place of return. It lets a function hand back a value without destroying its local variables: the function is paused at the yield, and the next time a value is requested it resumes from the line after that yield rather than from the top. A function that contains yield is called a generator function, and calling it produces a generator. To understand how yield works and why it is used, you need to know what generators, iterators, and iterations are. But before that, let’s look at an example:

def numbers_range(n):
    for i in range(n):
        yield i
a = numbers_range(4)
print(type(a))
for b in a:
    print(b)
# This will print to the console:
# <class 'generator'>
# 0
# 1
# 2
# 3

Calling the function returns a value of type generator. One way to get values out of a generator is to loop over it, which is what we did here. It is also easy to convert it into a list, as we did in the article about Fibonacci numbers. Now let us see how all this works.
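As a brief aside before moving on, converting a generator to a list might look like the sketch below. It reuses numbers_range from the example above; the variable names first_pass and second_pass are just illustrative. It also shows a detail worth keeping in mind: a generator can only be consumed once.

```python
def numbers_range(n):
    # Same generator function as in the example above
    for i in range(n):
        yield i

gen = numbers_range(4)
first_pass = list(gen)    # consumes every value the generator produces
second_pass = list(gen)   # the generator is already exhausted here

print(first_pass)   # [0, 1, 2, 3]
print(second_pass)  # []
```

If the values are needed more than once, either keep the list or call the generator function again to get a fresh generator.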

What are iterations?

In programming, iteration is a process in which a set of instructions is repeated in sequence a certain number of times, or until a condition is met. A loop is a repeating sequence of instructions, and each loop consists of iterations. That is, one execution of the loop body is one iteration. For example, if the loop body has been executed 5 times, 5 iterations have taken place. An iterator is an object that lets you traverse the elements of a sequence one at a time. A programmer can write their own iterator, but this is rarely necessary: a for loop asks the object for an iterator automatically.
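To make the idea of an iterator more concrete, here is a minimal sketch of a hand-written one. The class name Countdown and its attribute are made up for this illustration; the only parts Python itself requires are the __iter__ and __next__ methods.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        # Signal the end of the sequence once we reach zero
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

countdown_values = []
for n in Countdown(3):   # the for loop calls iter() and next() for us
    countdown_values.append(n)

print(countdown_values)  # [3, 2, 1]
```

A generator function gives you the same behavior with far less code, which is one reason hand-written iterators are uncommon.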

What are generators

A generator function looks like a regular function, but calling it returns a generator object instead of running the body. Each call to next() on that object runs the body until the next yield and hands back the yielded value. The difference from a normal function is that a normal function returns a single value with return and is then finished, while a generator produces a new value at each yield and keeps its state in between. A generator is itself an iterator, which is why it can be used in a for loop. You can get by without generators, but in some situations they are the most practical way to optimize a program. Besides yield, there are other ways to create generators, such as generator expressions; they are described in the article about list comprehensions.
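The pause-and-resume behavior can be observed directly. The sketch below uses a made-up generator function, chatty, that records in a list when each part of its body runs; the names log and state_after_* are illustrative only. The last lines show a generator expression as one alternative way to create a generator.

```python
log = []

def chatty():
    # Hypothetical generator function that records when its body runs
    log.append("started")
    yield 1
    log.append("resumed")
    yield 2

gen = chatty()                  # nothing in the body has run yet
state_after_call = list(log)    # []

first = next(gen)               # runs the body up to the first yield
state_after_first = list(log)   # ['started']

second = next(gen)              # resumes right after the first yield
state_after_second = list(log)  # ['started', 'resumed']

# A generator expression creates a generator without def or yield
squares = (x * x for x in range(4))
squares_list = list(squares)    # [0, 1, 4, 9]
```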

The next() function

This function extracts the next item from an iterator. In other words, moving from the current iteration to the next one means calling next(). When the iterator runs out of items, next() returns the default value if one was given; otherwise it raises a StopIteration exception. Every iterator has a __next__ method that does the actual work, and the next() function simply calls it. The syntax is simple: next(iterator[, default]). The interpreter calls it automatically in for loops. Here is an example of using next:

def numbers_range(n):
    for i in range(n):
        yield i
a = numbers_range(4)
print(next(a))
print(next(a))
print(next(a))
print(next(a))
# This will print to the console:
# 0
# 1
# 2
# 3
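
What happens once the generator is exhausted can be sketched as follows. The snippet reuses numbers_range; the variable names fallback and raised are illustrative. It shows both outcomes described above: the default value when one is supplied, and StopIteration when it is not.

```python
def numbers_range(n):
    for i in range(n):
        yield i

a = numbers_range(2)
values = [next(a), next(a)]   # [0, 1]

# The generator is exhausted, so the default is returned instead
fallback = next(a, "done")

try:
    next(a)                   # no default given: StopIteration is raised
    raised = False
except StopIteration:
    raised = True

print(values, fallback, raised)  # [0, 1] done True
```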

The advantages of using yield

yield is not something Python’s syntax forces on you: anything that can be implemented with yield can also be implemented with a plain return.

Programmers prefer to use generators when there is no need to store the entire sequence and intermediate values in memory.

A function that builds a large sequence and hands it back with a normal return forces the interpreter to allocate memory for the whole sequence. Normally such functions have little effect on program performance, but in projects with sequences of millions of elements they consume a lot of memory. With yield you do not have to keep the whole sequence in memory: the generator produces one item each time the next value is requested. This avoids using a large amount of memory.
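A rough way to see the difference is to compare the size of a list with the size of an equivalent generator. This is only a sketch: sys.getsizeof reports the size of the container object itself (for the list, its array of references, not the integers it points to), so the real gap is even larger than the numbers suggest. The variable names are illustrative.

```python
import sys

n = 1_000_000

as_list = [i * 2 for i in range(n)]       # every element exists in memory at once
as_generator = (i * 2 for i in range(n))  # values are produced one at a time

list_size = sys.getsizeof(as_list)            # grows with n
generator_size = sys.getsizeof(as_generator)  # small and independent of n

print(list_size > generator_size)  # True

# The generator can still be consumed like any other iterable
total = sum(as_generator)
```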

Comparing the performance of return and yield

yield is often used when you need to read a large text file. To show the advantage of generators clearly, we need two scripts:

  • The first uses a normal return: it reads all the lines of the file into a list and then prints them to the console.
  • The second uses yield: it reads one line at a time and prints it.
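
The two scripts described above are not shown in this article, so the following is only a sketch of what they might look like. The function names read_lines_return and read_lines_yield are assumptions, as is the small temporary file used here in place of a real large file.

```python
import os
import tempfile

def read_lines_return(path):
    # Reads the whole file into a list before returning it
    with open(path, encoding="utf-8") as f:
        return f.readlines()

def read_lines_yield(path):
    # Yields one line at a time; only the current line is held in memory
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield line

# A tiny temporary file stands in for the large files used in the comparison
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first\nsecond\nthird\n")
    sample_path = tmp.name

eager = read_lines_return(sample_path)
lazy = list(read_lines_yield(sample_path))  # list() used only to compare results
os.remove(sample_path)

print(eager == lazy)  # True
```

Both versions produce the same lines; the difference is that the second never needs the whole file in memory unless the caller chooses to collect it.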

The scripts were then run on several files of different sizes, with the following results:

File size    return                yield
             Memory     Time       Memory     Time
4 KB         5.3 MB     0.023 s    5.42 MB    0.08 s
324 KB       9.98 MB    0.028 s    5.37 MB    0.32 s
26 MB        392 MB     27 s       5.52 MB    29.61 s
263 MB       3.65 GB    273.56 s   5.55 MB    292.99 s

In both cases the running time grows at about the same rate, while the amount of memory consumed differs greatly. The larger the file being parsed, the more noticeable the difference.

yield from

Many people think that yield from was added to Python 3 just to combine two constructs, yield and the for loop, because they are often used together, as in the following example:

# Conventional yield
def numbers_range(n):
    for i in range(n):
        yield i
# yield from
def numbers_range(n):
    yield from range(n)

However, the true purpose of the addition is a bit different. The construct lets you “embed” one generator in another, that is, create subgenerators. yield from lets the programmer manage several generators at once, customize their interaction and, of course, replace the longer for + yield construct. For example:

def subgenerator():
    yield 'World'
def generator():
    yield 'Hello'
    yield from subgenerator() #request a value from the subgenerator
    yield '!'
for i in generator():
    print(i, end = ' ')

# output
# Hello World !

As you can see from the example, yield from allows one generator to get values from another one. This tool greatly simplifies the life of a programmer, especially for asynchronous programming.
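
One thing a plain for + yield loop cannot do is pick up the subgenerator's return value. The sketch below illustrates this; the function names summing_sub and outer are made up for the example.

```python
def summing_sub():
    # Hypothetical subgenerator: yields values and returns their sum
    total = 0
    for value in (1, 2, 3):
        total += value
        yield value
    return total  # becomes the value of the yield from expression

def outer():
    # yield from passes every yielded value through to the caller,
    # then evaluates to whatever the subgenerator returned
    result = yield from summing_sub()
    yield f"sum={result}"

collected = list(outer())
print(collected)  # [1, 2, 3, 'sum=6']
```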

Conclusion

Using generators in the right places can significantly reduce memory consumption; moreover, interaction with generators is more transparent and easier to debug. yield is just one of many useful Python features, and one that can always be replaced by an ordinary return. It was added to the language to optimize program performance, simplify code and debugging, and give programmers the ability to apply unusual solutions in specialized projects.
