Wednesday, June 12, 2013

use generators, yield, xrange and obj.iter*() whenever possible

Python has a very cool feature. Java also has it, but Java isn't cool. Probably C and its variants have it, and I'm sure that super cool Ruby has it too. What about MATLAB, anyone? Bueller? ... Bueller?

iterators

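An iterator produces its items one at a time, on demand, so it uses less memory than a fully built list and is often faster when you only need to loop through the items once. In this post, "iterable" is shorthand for a container such as a list that holds all of its items in memory; strictly speaking, every iterator is also an iterable.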
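As a rough illustration of the memory claim, here is a minimal sketch that compares the size of a list with the size of a generator producing the same values. The variable names and the value of n are made up for this example, and it sticks to names that exist in both Python 2.6+ and Python 3. Note that sys.getsizeof() reports only the size of the container object itself, not the items it refers to.

```python
import sys

n = 100000  # arbitrary size chosen for illustration

# A list holds references to all n items at once.
squares_list = [i * i for i in range(n)]

# A generator keeps only its current state and produces items on demand.
squares_gen = (i * i for i in range(n))

list_size = sys.getsizeof(squares_list)  # grows with n
gen_size = sys.getsizeof(squares_gen)    # small, does not depend on n

print(list_size > gen_size)

# Both give the same values when consumed.
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
print(total_from_list == total_from_gen)
```

Both print statements should show True. The speed difference is harder to show in a few lines and depends on the workload, so this sketch leaves it out.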

Generators

Instead of building and returning a list from a function, turn the function into a generator by using yield. Calling it then returns an iterator.

iterable

def listy(x=5):
    return range(x)

iterator

def genny(x=5):
    idx = 0
    while idx < x:
        yield idx
        idx += 1
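
To see the laziness in action, here is a minimal sketch using a hypothetical helper called count_up (not from the original post). It assumes only the built-in next(), which is available in Python 2.6+ and Python 3. Calling the function does not run its body; the body runs only as items are requested.

```python
def count_up(limit=5):
    """Yield 0, 1, ..., limit - 1 one at a time (hypothetical helper)."""
    idx = 0
    while idx < limit:
        yield idx
        idx += 1

gen = count_up(3)   # creates the generator; no body code has run yet
first = next(gen)   # runs the body up to the first yield
rest = list(gen)    # consumes whatever is left

print(first)  # 0
print(rest)   # [1, 2]
```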

xrange

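The example above mirrors the difference between range and xrange. In Python 2, range builds the whole list up front, while xrange hands out the numbers one at a time as you loop.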

iterable

>>> listy = range(5)
>>> print listy
[0, 1, 2, 3, 4]

iterator

>>> for idx in xrange(5):
...     print idx,
0 1 2 3 4

Generator Expression

Just like you can make a list on the fly with a list comprehension, you can make a generator on the fly with a generator expression!

iterable

>>> listy = [idx for idx in range(5)]
>>> print listy
[0, 1, 2, 3, 4]

iterator

>>> animals = {'dog': 'spot', 'cat': 'felix'}
>>> genny = ('%s is a %s' % (name, animal) for animal, name in animals.iteritems())
>>> for animal_name in genny:
...     print animal_name,
spot is a dog felix is a cat

obj.iter*()

The Python tutorial has a great section on looping techniques, where you will see examples of xrange(), enumerate(), and iteritems(). You can see in the generator expression above that a dictionary has a method called iteritems() that returns an iterator instead of building a list the way items() does. Python 2 dictionaries also have iterkeys() and itervalues(), which do the same for keys and values.

iterable

>>> animals = {'dog': 'spot', 'cat': 'felix'}
>>> listy = animals.items()
>>> print listy
[('dog', 'spot'), ('cat', 'felix')]

iterator

>>> animals = {'dog': 'spot', 'cat': 'felix'}
>>> genny = animals.iteritems()
>>> for animal in genny:
...     print animal,
('dog', 'spot') ('cat', 'felix')
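
enumerate() is mentioned above but not shown, so here is a minimal sketch of it. The list of names reuses the pet names from the examples; the variable names are made up for illustration. enumerate() returns an iterator of (index, item) pairs, so like the other iterators in this post it hands out one pair at a time.

```python
names = ['spot', 'felix']

pairs = enumerate(names)   # an iterator, nothing has been produced yet
first = next(pairs)        # the first (index, item) pair
remaining = list(pairs)    # whatever has not been consumed so far

print(first)      # (0, 'spot')
print(remaining)  # [(1, 'felix')]
```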

Last Word

Just like in Java, an iterator is a one-use container. Each time its next() method is called it advances to the next item until it reaches the end, and then it raises a StopIteration exception. When the for loop catches the exception, it exits gracefully. To iterate again you would have to create a new generator, but if you find that you need to go through the items multiple times, then you are better off using a list instead.

In summary:
  • iterator: one-time use, uses less memory and is often faster, good when you only need to iterate through the items once
  • iterable (such as a list): uses more memory and takes longer to build, good if you need any list methods or if you need to iterate through the same items many times
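
The one-time-use behavior can be sketched as follows. The variable names are made up for illustration, and the while loop is only an approximation of what a for loop does behind the scenes. It uses the built-in next() and iter(), which exist in Python 2.6+ and Python 3.

```python
# A generator can be consumed only once.
numbers = (n for n in range(3))
first_pass = list(numbers)    # gets every item
second_pass = list(numbers)   # the generator is already exhausted

print(first_pass)   # [0, 1, 2]
print(second_pass)  # []

# Roughly what a for loop does: call next() until StopIteration is raised.
it = iter([10, 20])
seen = []
while True:
    try:
        seen.append(next(it))
    except StopIteration:
        break  # the loop ends quietly, with no error shown to the user

print(seen)  # [10, 20]
```

If you need the items again, either build a new generator or store them in a list the first time around.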
