Iterators and Generators
Contents
Iterators and Generators¶
So far, you have seen things like
a = [1,2,3]
for x in a:
print(x)
1
2
3
This looks very different from a C-style for loop where we loop over the variable index:
for (size_t i = 0; i < 3; i++) {
printf("%d\n", i);
}
Or for instance, we can use something called a range
for i in range(3):
print(i)
0
1
2
or other data types
d = {"hello": 1, "goodbye": 2}
for k,v in d.items():
print(k, ' : ', v)
hello : 1
goodbye : 2
The key to using this sort of syntax is the concept of iterator. This is common in object-oriented programming (not just in Python), but you probably haven’t seen iterators before if you’ve only used imperative languages.
An object is iterable if it implements the __iter__
method, which is expected to return an iterator object.
An object is an iterator if it implements the __next__
method, which either
returns the next element of the iterable object
raises the
StopIteration
exception if there are no more elements to iterate over
A Basic Iterator¶
What if we want to replicate range
?
r = range(3)
type(r)
range
we can produce an iterator using the iter
function
ri = iter(r)
type(ri)
range_iterator
we can explicitly run through the iterator using the next
function
next(ri)
0
r = range(1,5,2)
ri = iter(r)
print(next(ri))
print(next(ri))
1
3
class my_range_iterator(object):
def __init__(self, start, stop, stride):
self.state = start
self.stop = stop
self.stride = stride
def __next__(self):
if self.state >= self.stop:
raise StopIteration # signals "the end"
ret = self.state # we'll return current state
self.state += self.stride # increment state
return ret
# an iterable
class my_range(object):
def __init__(self, start, stop, stride=1):
self.start = start
self.stop = stop
self.stride = stride
def __iter__(self):
return my_range_iterator(self.start, self.stop, self.stride)
r = my_range(0,3)
type(r)
__main__.my_range
ri = iter(r)
type(ri)
__main__.my_range_iterator
next(ri)
0
for i in my_range(0,3):
print(i)
0
1
2
for i in range(0,3):
print(i)
0
1
2
You can also create classes that are both iterators and iterables
# an iterable and iterator
class my_range2(object):
def __init__(self, start, stop, stride=1):
self.start = start
self.stop = stop
self.stride = stride
self.state = start
def __iter__(self):
return self
def __next__(self):
if self.state >= self.stop:
raise StopIteration # signals "the end"
ret = self.state # we'll return current state
self.state += self.stride # increment state
return ret
Using Iterators for Computation¶
Let’s now use iterators for something more interesting - computing the Fibonacci numbers.
class FibonacciIterator(object):
def __init__(self):
self.a = 0 # current number
self.b = 1 # next number
def __iter__(self):
return self
def __next__(self):
ret = self.a
self.a, self.b = self.b, self.a + self.b # advance the iteration
return ret
for i in FibonacciIterator():
if i > 1000:
break
print(i)
0
1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
Note that we never raise a StopIteration
exception - the iterator will just keep going if we let it.
Exercise¶
Define FibonacciIterator
so it will iterate over all Fibonacci numbers until they are greater than a parameter n
.
## Your code here
Generators¶
Often, a more elegant way to define an iterator is using a generator
This is a special kind of iterator defined using a function instead of using classes that implement the __iter__
and __next__
methods.
See this post for more discussion.
def my_range3(state, stop, stride=1):
while state < stop:
yield state
state += stride
Note that we use the def
keyword instead of the class
keyword for our declaration. The yield
keyword returns subsequent values of the iteration.
r = my_range3(0,3)
type(r)
generator
ri = iter(r)
type(ri)
generator
next(ri)
0
for i in my_range3(0,3):
print(i)
0
1
2
Our Fibonacci example re-written using a generator:
def FibonacciGenerator():
a = 0
b = 1
while True:
yield a
a, b = b, a + b
for i in FibonacciGenerator():
if i > 1000:
break
print(i)
0
1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
Exercise¶
Define FibonacciGenerator
so it will iterate over all Fibonacci numbers until they are greater than a parameter n
.
## Your code here
def FibonacciGenerator(n):
a = 0
b = 1
while a < n:
yield a
a, b = b, a + b
for i in FibonacciGenerator(1000):
print(i)
0
1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
Iteration tools¶
Some useful tools for iterators that come in handy are:
zip
- iterates over multiple iterators simulataneously
for i, a in zip([0,1,2], ['a', 'b', 'c']):
print(i,a)
0 a
1 b
2 c
reversed
- iterates in reverse order
for i in reversed(range(3)):
print(i)
2
1
0
enumerate
- returns the iteration step count as well as the iterator value
for i, a in enumerate('abc'):
print(i, a)
0 a
1
b
2 c
Exercise¶
Implement your own versions of zip
and enumerate
using generators
## Your code here
def my_zip(a, b):
ai = iter(a)
bi = iter(b)
while True:
try:
yield next(ai), next(bi)
except:
return
for i, a in my_zip(range(3), 'abc'):
print(i,a)
0 a
1 b
2 c
def my_enumerate(a):
ct = 0
for x in a:
yield ct, x
ct = ct + 1
for i, a in my_enumerate('abcd'):
print(i, a)
0 a
1 b
2 c
3 d
The Itertools Package¶
A useful package for dealing with iterators is the itertools package. Here are a few examples - click on the link to see what else the package provides.
product
gives the equivalent of a nested for-loop
from itertools import product
for i, j in product(range(2), range(3)):
print(i, j)
0 0
0 1
0 2
1 0
1 1
1 2
# equivalent to:
for i in range(2):
for j in range(3):
print(i,j)
0 0
0 1
0 2
1 0
1 1
1 2
repeat
just repeats a value
from itertools import repeat
for i in repeat(10, 5):
print(i)
10
10
10
10
10
permutations
from itertools import permutations
for p in permutations(range(3)):
print(p)
(0, 1, 2)
(0, 2, 1)
(1, 0, 2)
(1, 2, 0)
(2, 0, 1)
(2, 1, 0)
Exercise¶
Implement your own version of product
and repeat
using generators.
## Your code here
def my_product(a, b):
for x in a:
for y in b:
yield x, y
for x, y in my_product(range(2), range(3)):
print(x,y)
0 0
0 1
0 2
1 0
1 1
1 2
def my_repeat(n, ct):
for i in range(ct):
yield n
for x in my_repeat(10, 5):
print(x)
10
10
10
10
10
Iterators for Scientific Computing¶
One way you might use an iterator in scientific computing is when implementing an iterative algorithm.
Here is an example of the power method, which finds the largest eigenvalue-eigenvector pair of a matrix.
import numpy as np
def PowerMethodGenerator(A, x):
def RayleighQuotient(A, x):
"""
x^T A x / x^T x
"""
return np.dot(x, A @ x) / np.dot(x,x)
x = x / np.linalg.norm(x)
rq_prev = np.inf
rq = RayleighQuotient(A, x)
while True:
# yield state: RayleighQuotient, x, and difference from previous iteration
yield rq, x, np.abs(rq - rq_prev)
# compute next iteration
x = A @ x
x = x / np.linalg.norm(x)
rq_prev = rq
rq = RayleighQuotient(A, x)
# here's a version that uses the generator in a while-loop
n = 100
A = np.random.randn(n, n)
A = A @ A.T + 5 # constant increases eigenvalue in constant vector direction
x0 = np.random.randn(n)
solver = PowerMethodGenerator(A, x0)
tol = 1e-4
while True:
rq, x, eps = next(solver)
print(rq, eps)
if eps < tol:
break
105.19696872759084 inf
281.63511492463675 176.4381461970459
326.7192591328154 45.08414420817866
352.50704592651266 25.78778679369725
391.7108945621984 39.203848635685745
460.83794809636106 69.12705353416266
540.3664504666252 79.52850237026416
592.0237369196417 51.65728645301647
614.2130807999849 22.189343880343245
622.058528516079 7.8454477160941
624.6490855951763 2.590557079097266
625.4882828792902 0.8391972841138795
625.7592874771177 0.2710045978275275
625.8469290763087 0.08764159919098802
625.8753406319709 0.028411555662160026
625.884573120211 0.009232488240172643
625.8875796627136 0.003006542502589582
625.8885605557036 0.0009808929900145813
625.8888810944816 0.0003205387779416924
625.8889859914814 0.00010489699980098521
625.8890203634668 3.4371985407233296e-05
If we decide that we’re not satisfied with convergence yet, we can resume where we left off
tol = 1e-6
while True:
rq, x, eps = next(solver)
print(rq, eps)
if eps < tol:
break
625.8890316394298 1.1275963061052607e-05
625.8890353425546 3.7031247757113306e-06
625.8890365599013 1.2173467212051037e-06
625.8890369604569 4.005555638286751e-07
You can do the same thing with for-loops
# here's a version that uses the generator in a for-loop
n = 100
A = np.random.randn(n, n)
A = (A @ A.T) + 5 # constant increases eigenvalue in constant vector direction
x0 = np.random.randn(n)
solver = PowerMethodGenerator(A, x0)
tol = 1e-4
for rq, x, eps in solver:
print(rq, eps)
if eps < tol:
break
tol = 1e-6
print('\nresuming iteration after decreasing tolerance\n')
for rq, x, eps in solver:
print(rq, eps)
if eps < tol:
break
146.2065417420594 inf
443.4881333603037 297.2815916182443
600.5673146162095 157.07918125590578
641.394615191388 40.82730057517847
650.083503914267 8.688888722879028
652.0672892607818 1.9837853465147646
652.565176438465 0.49788717768319657
652.6996569481961 0.13448050973113368
652.73788334235 0.03822639415386675
652.7491297140693 0.011246371719380477
652.752516481088 0.003386767018696446
652.7535531163754 0.0010366352873916185
652.753874213889 0.0003210975136198613
652.7539745923442 0.00010037845515853405
652.7540062074261 3.161508186622086e-05
resuming iteration after decreasing tolerance
652.7540162285405 1.0021114462688274e-05
652.754019422857 3.194316491317295e-06
652.7540204462541 1.0233970897388645e-06
652.7540207756615 3.29407384924707e-07