I have a Python list of lists:
l = [[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]]
What I want is to repeat the first element of each list based on the length of the list:
result = [1, 1, 1, 4, 5, 5, 7, 7, 7, 7]
I can get close using a list comprehension, but my list is very long and the method is slow:
result = [[x[0]]*len(x) for x in l]
[[1, 1, 1], [4], [5, 5], [7, 7, 7, 7]]
However, this still returns a list of lists rather than a flat list, so I am trying to find the fastest way to build the flat list described above.
Update: I want the fastest-performing method, since the list is long.
Using itertools.repeat with chain is the most efficient in Python 2:
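# Imports assumed by this session: from random import choice; import itertools; from itertools import chain, repeat
In [13]: l = [choice(l) for _ in xrange(1000000)]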
In [14]: timeit list(itertools.chain(*[[i[0]]*len(i) for i in l]))
1 loops, best of 3: 416 ms per loop
In [15]: timeit [i[0] for i in l for _ in xrange(len(i))]
1 loops, best of 3: 245 ms per loop
In [16]: timeit list(itertools.chain.from_iterable(repeat(i[0],len(i)) for i in l))
1 loops, best of 3: 223 ms per loop
In [17]: timeit [i for x in l for i in [x[0]]*len(x)]
1 loops, best of 3: 332 ms per loop
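For reference, the repeat-plus-chain variant timed above can be written as a small standalone function. This is only a minimal sketch in Python 3 syntax; the name `repeat_first` is made up for illustration, and it assumes every inner list is non-empty (an empty one would raise IndexError on `sub[0]`).

```python
from itertools import chain, repeat


def repeat_first(lists):
    """Flatten `lists`, repeating each sublist's first element len(sublist) times."""
    # repeat(x, n) lazily yields x n times; chain.from_iterable stitches
    # those iterators together without building intermediate lists.
    return list(chain.from_iterable(repeat(sub[0], len(sub)) for sub in lists))


l = [[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]]
result = repeat_first(l)
print(result)  # [1, 1, 1, 4, 5, 5, 7, 7, 7, 7]
```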
Interestingly, in Python 3, passing a list rather than a generator expression to chain.from_iterable is faster:
In [8]: timeit list(chain.from_iterable(repeat(i[0], len(i)) for i in l))
1 loops, best of 3: 372 ms per loop
In [9]: timeit [i[0] for i in l for _ in range(len(i))]
1 loops, best of 3: 433 ms per loop
In [10]: timeit list(chain.from_iterable([repeat(i[0],len(i)) for i in l]))
1 loops, best of 3: 296 ms per loop
In [11]: timeit list(chain(*[[i[0]]*len(i) for i in l]))
1 loops, best of 3: 460 ms per loop
In [12]: timeit [i for x in l for i in [x[0]]*len(x)]
1 loops, best of 3: 348 ms per loop
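Timings like these depend heavily on the interpreter version and the machine, so it is worth re-running them locally. The following is a rough sketch using the standard `timeit` module outside IPython; the helper names (`make_data`, `candidates`, `bench`) and the default input size are my own choices, not part of the original session.

```python
import timeit
from itertools import chain, repeat
from random import choice

sample = [[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]]


def make_data(n):
    # Build a long list of lists by sampling the small example, as in the session above.
    return [choice(sample) for _ in range(n)]


# Three of the variants timed above, keyed by a descriptive label.
candidates = {
    "chain + repeat (generator)": lambda l: list(
        chain.from_iterable(repeat(i[0], len(i)) for i in l)
    ),
    "chain + repeat (list)": lambda l: list(
        chain.from_iterable([repeat(i[0], len(i)) for i in l])
    ),
    "nested comprehension": lambda l: [i[0] for i in l for _ in range(len(i))],
}


def bench(n=100_000, repeats=3):
    """Return the best-of-`repeats` wall time in seconds for each candidate."""
    data = make_data(n)
    timings = {}
    for name, func in candidates.items():
        timings[name] = min(timeit.repeat(lambda: func(data), number=1, repeat=repeats))
    return timings


if __name__ == "__main__":
    for name, seconds in bench().items():
        print(f"{name}: {seconds * 1000:.1f} ms")
```

Taking the minimum over several runs mirrors the "best of 3" figure that IPython's timeit reports.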
If you want a compromise between time and space, iterate over the chain object one element at a time instead of building the full list:
In [18]: %%timeit
for ele in chain.from_iterable([repeat(i[0],len(i)) for i in l]):
    pass
....:
1 loops, best of 3: 306 ms per loop
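As an illustration of consuming the chain object lazily, here is a small sketch that processes each element as it is produced, so the full flat list never exists in memory. The summing step is just a stand-in for whatever per-element work is needed, and a generator expression is used here (my choice) so that nothing is materialised up front.

```python
from itertools import chain, repeat

l = [[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]]

# Nothing is computed yet: `lazy` is an iterator over the repeated first elements.
lazy = chain.from_iterable(repeat(sub[0], len(sub)) for sub in l)

total = 0
for ele in lazy:
    total += ele  # placeholder for real per-element processing

print(total)  # 1*3 + 4*1 + 5*2 + 7*4 = 45
```

Note that the chain object is single-use: once the loop finishes it is exhausted and has to be rebuilt to iterate again.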