Definition
A generator
function is defined like a normal function, but whenever it needs to generate a value, it does so with the yield
keyword rather than return
. If the body of a def
contains yield, the function automatically becomes a generator function
(even if it also contains a return statement). There’s nothing else we need to do to create one.
Just remember that a generator
is a special type of iterator
.
In Python, “functions” with these capabilities are called generators
, and they’re incredibly useful. generators
(and the yield
statement) were initially introduced to give programmers a more straightforward way to write code responsible
for producing a series of values. Previously, creating something like a random number generator required a class or module that
both generated values and kept track of state between calls. With the introduction of generators
, this became much simpler.
To better understand the problem generators
solve, let’s take a look at an example. Throughout the example, keep in
mind the core problem being solved: generating a series of values.
Outside of Python, all but the simplest generators
would be referred to as coroutines
.
Stop criteria
- If a
generator
function calls return - If the generator is exhausted by calling
next()
. After that if we callnext()
again it will result in an error.
Examples
Basic function
fibSeries = [0,1]
def fib1(n):
for i in range(2,n):
fibSeries.append(fibSeries[i-1]+fibSeries[i-2])
return fibSeries[n-1]
The complexity is O(N).
Recursive function
def fib2(n):
if n==1:
return 0
elif n==2:
return 1
else:
return fib(n-1)+fib(n-2)
The issue with this implementation is that each function calls two functions and that become exponetial. Think of fib(20). It will call fib(18) and fib(19). fib(19) will call fib(18) and fib(17). So fib(18) gets called 2 times, fib(17) 3 times, etc. And the count goes up exponetially. The same code as above with a counter. fib(20) causes the function to be called over 13K times. Try fib(40)… the code wouldn’t even complete.
Generators
def temp():
a,b = 0,1
yield a
yield b
while True:
a, b = b, a + b
yield b
def fib3(n):
res = []
for i,f in enumerate(temp()):
if i > 10: break
res.append(f)
return res[-1]
Running time
%timeit fib1(10)
%timeit fib2(10)
%timeit fib3(10)
Output:
100000 loops, best of 3: 2.24 µs per loop
10000 loops, best of 3: 26.3 µs per loop
100000 loops, best of 3: 3.31 µs per loop
So the basic fuction is the best, generator function comes next and recursive function is the worst.
This fibonacci is not a good problem for generator
function. I will provide another fancier example. This example is shameless stolen from this blog.
import random
def get_data():
"""Return 3 random integers between 0 and 9"""
return random.sample(range(10), 3)
def consume():
"""Displays a running average across lists of integers sent to it"""
running_sum = 0
data_items_seen = 0
while True:
data = yield
data_items_seen += len(data)
running_sum += sum(data)
print('The running average is {}'.format(running_sum / float(data_items_seen)))
def produce(consumer):
"""Produces a set of values and forwards them to the pre-defined consumer
function"""
while True:
data = get_data()
print('Produced {}'.format(data))
consumer.send(data)
yield
if __name__ == '__main__':
consumer = consume()
consumer.send(None)
producer = produce(consumer)
for _ in range(10):
print('Producing...')
next(producer)
Tips
-
generators
are used to generate a series of values -
yield
is like thereturn
ofgenerator
functions - The only other thing
yield
does is save the “state” of agenerator
function - A
generator
is just a special type of iterator - Like iterators, we can get the next value from a
generator
usingnext()
- It is common to use
while True:
insidegenerator
function.