Understanding the Python list for loops

I'm reading the Python wikibook and feel confused about this part:

List comprehension supports more than one for statement. It will evaluate the items in all of the objects sequentially and will loop over the shorter objects if one object is longer than the rest.

>>> item = [x+y for x in 'cat' for y in 'pot']
>>> print(item)
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt']

I understand the usage of nested for loops but I don't get

...and will loop over the shorter objects if one object is longer than the rest

What does this mean? (shorter, longer...)


Just try it:

>>> [x+y for x in 'cat' for y in 'potty']
['cp', 'co', 'ct', 'ct', 'cy', 'ap', 'ao', 'at', 'at', 'ay', 'tp', 'to', 'tt', 'tt', 'ty']
>>> [x+y for x in 'catty' for y in 'pot']
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt', 'tp', 'to', 'tt', 'yp', 'yo', 'yt']

The first for clause in the list comprehension above (i.e., the for x in 'cat' part) is the same as the outer for x in 'cat': loop in this example:

>>> li=[]
>>> for x in 'cat':
...    for y in 'pot':
...       li.append(x+y)
# li=['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt']

So making one string shorter or longer has the same effect as making the 'x' or 'y' loop shorter or longer in two nested loops:

>>> li=[]
>>> for x in 'catty':
...    for y in 'pot':
...       li.append(x+y)
...
>>> li==[x+y for x in 'catty' for y in 'pot']
True
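
As a quick check of the point above, here is a minimal sketch (the helper name pairs is made up for illustration). Whichever string is longer, every item of the outer string is combined with every item of the inner one, so the result should always have len(a) * len(b) items:

```python
def pairs(a, b):
    # Equivalent to two nested for loops with `a` as the outer loop
    # and `b` as the inner loop.
    return [x + y for x in a for y in b]

short_outer = pairs('cat', 'potty')   # shorter string in the outer loop
long_outer = pairs('catty', 'pot')    # longer string in the outer loop

# Neither string is "looped over again" to match the other; the length
# of the result is simply the product of the two lengths.
print(len(short_outer) == len('cat') * len('potty'))
print(len(long_outer) == len('catty') * len('pot'))
```

Both checks should print True, which is why the wording in the wikibook is misleading: nothing special happens when the lengths differ.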

Edit

There seems to be confusion (in the comments) about nested loops versus zip.

Nested Loops:

As shown above, this:

[x+y for x in '12345' for y in 'abc']

is the same as two nested 'for' loops with 'x' the outer loop.

Nested loops run the entire inner y loop once for each item of x in the outer loop.

So:

>>> [x+y for x in '12345' for y in 'ab']
    ['1a', '1b',   # '1' in the x loop
     '2a', '2b',   # '2' in the x loop, b in the y loop
     '3a', '3b',   # '3' in the x loop, back to 'a' in the y loop
     '4a', '4b',   # so on
     '5a', '5b']
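
To see that it is the order of the for clauses (not the length of the strings) that decides which loop is the outer one, here is a small sketch with the two clauses swapped. The variable names are only for illustration:

```python
# Leftmost `for` is the outer loop: x changes slowly, y changes quickly.
outer_x = [x + y for x in '123' for y in 'ab']

# Same expression, clauses swapped: now y is the outer loop.
outer_y = [x + y for y in 'ab' for x in '123']

print(outer_x)  # ['1a', '1b', '2a', '2b', '3a', '3b']
print(outer_y)  # ['1a', '2a', '3a', '1b', '2b', '3b']

# Both contain the same pairs, just in a different order.
print(sorted(outer_x) == sorted(outer_y))  # True
```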

You can get the same result with product from itertools:

>>> from itertools import product
>>> [x+y for x,y in product('12345','ab')]
['1a', '1b', '2a', '2b', '3a', '3b', '4a', '4b', '5a', '5b']

zip is different: it pairs the items position by position instead of combining every x with every y, and it stops once the shorter sequence is exhausted:

>>> [x+y for x,y in zip('12345','ab')]
['1a', '2b']
>>> [x+y for x,y in zip('ab', '12345')]
['a1', 'b2']

itertools also has zip_longest, which keeps zipping until the longest sequence is exhausted and pads the shorter one with a fill value, but the result is different again:

>>> import itertools
>>> [x+y for x,y in itertools.zip_longest('12345','ab',fillvalue='*')]
['1a', '2b', '3*', '4*', '5*']
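
To put the variants side by side, here is a short sketch that compares how many items each approach produces for the same two strings (Python 3, where zip_longest lives in itertools; the variable names are only for illustration):

```python
from itertools import product, zip_longest

a, b = '12345', 'ab'

nested = [x + y for x in a for y in b]                      # every x with every y
via_product = [x + y for x, y in product(a, b)]             # same as nested
zipped = [x + y for x, y in zip(a, b)]                      # pairs by position
padded = [x + y for x, y in zip_longest(a, b, fillvalue='*')]

print(nested == via_product)  # True
print(len(nested))            # 10, i.e. len(a) * len(b)
print(len(zipped))            # 2,  i.e. min(len(a), len(b))
print(len(padded))            # 5,  i.e. max(len(a), len(b))
```

Only the zip variants care which sequence is shorter or longer; the nested for clauses never do.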