I want to find the fastest way to compute the average of Python lists. I have millions of lists stored in a dictionary, so I am looking for the most efficient way in terms of performance.
Referring to this question,
If l is a list of float numbers, I have
- numpy.mean(l)
- sum(l) / float(len(l))
- reduce(lambda x, y: x + y, l) / len(l)

Which way would be the fastest?
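For reference, all three expressions compute the same value; a minimal sanity check (the example list is arbitrary):

```python
from functools import reduce
import numpy

l = [1.0, 2.5, 4.0, 7.5]  # small example list

m1 = numpy.mean(l)                          # NumPy's mean
m2 = sum(l) / float(len(l))                 # built-in sum divided by length
m3 = reduce(lambda x, y: x + y, l) / len(l) # fold-based sum divided by length

assert m1 == m2 == m3 == 3.75
```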
As @DeepSpace has suggested, you should try to answer this question yourself. You might also consider transforming your list into an array before using numpy.mean. Use %timeit with ipython as follows:
In [1]: import random
In [2]: import numpy
In [3]: from functools import reduce
In [4]: l = random.sample(range(0, 100), 50) # generates a random list of 50 elements
numpy.mean without converting to an np.array

In [5]: %timeit numpy.mean(l)
32.5 µs ± 2.82 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
numpy.mean after converting to an np.array

In [5]: a = numpy.array(l)
In [6]: %timeit numpy.mean(a)
17.6 µs ± 205 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
sum(l) / float(len(l))

In [5]: %timeit sum(l) / float(len(l)) # the float cast is not required in Python 3
774 ns ± 20.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
sum(l) / len(l)

In [5]: %timeit sum(l) / len(l)
623 ns ± 27.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
reduce

In [6]: %timeit reduce(lambda x, y: x + y, l) / len(l)
5.92 µs ± 514 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
From slowest to fastest:
1. numpy.mean(l) without converting to an array
2. numpy.mean(a) after converting the list to an np.array
3. reduce(lambda x, y: x + y, l) / len(l)
4. sum(l) / float(len(l)) # works in both Python 2 and 3
5. sum(l) / len(l) # Python 3 only; no float cast needed
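Note that this ranking holds for a 50-element list, where NumPy's per-call overhead dominates. For much larger lists converted to an array once up front, numpy.mean typically pulls ahead. A sketch using timeit outside IPython (the list size and repeat count are arbitrary choices):

```python
import timeit
import numpy

l = list(range(1_000_000))       # a large list
a = numpy.array(l, dtype=float)  # converted once, up front

# Time each approach over 10 repetitions
t_sum = timeit.timeit(lambda: sum(l) / len(l), number=10)
t_np = timeit.timeit(lambda: numpy.mean(a), number=10)

print(f"sum(l) / len(l): {t_sum:.4f} s")
print(f"numpy.mean(a):   {t_np:.4f} s")
```

Since the question involves millions of lists, it is worth benchmarking on lists of your actual size before choosing an approach.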