Python – How to make a flat list out of a list of lists

flattenlistmultidimensional-arraypython

Is there a shortcut to make a simple list out of a list of lists in Python?

I can do it in a for loop, but is there some cool "one-liner"?

I tried it with functools.reduce():

from functools import reduce
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
reduce(lambda x, y: x.extend(y), l)

But I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
AttributeError: 'NoneType' object has no attribute 'extend'

Best Answer

Given a list of lists t,

flat_list = [item for sublist in t for item in sublist]

which means:

flat_list = []
for sublist in t:
    for item in sublist:
        flat_list.append(item)

is faster than the shortcuts posted so far. (t is the list to flatten.)

Here is the corresponding function:

def flatten(t):
    return [item for sublist in t for item in sublist]

As evidence, you can use the timeit module in the standard library:

$ python -mtimeit -s't=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in t for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s't=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(t, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s't=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,t)'
1000 loops, best of 3: 1.1 msec per loop

Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(T**2) when there are T sublists -- as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have T sublists of k items each: the first k items are copied back and forth T-1 times, the second k items T-2 times, and so on; total number of copies is k times the sum of x for x from 1 to T excluded, i.e., k * (T**2)/2.

The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.