Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python list elements vs converting to tuple for string formatting

I came across a problem on codewars and am not sure what the difference is between these two possible solutions, one converting a list into a tuple and one specifying elements of the input list.

Problem: convert a list of names (strings) to a statement similar to what Facebook uses to display likes: "Alex likes this", "Alex and John like this", "Alex, John and 2 others like this", etc.

Using a if-elif-etc statement, this is pretty trivial:

    if len(names) == 0:
        output_string = "no one likes this"
    elif len(names) == 1:
        output_string = str(names[0]) + " likes this"

But in the longer lists of names, you have a choice:

    elif len(names) == 2:
        output_string = "%s and %s like this" % (names[0], names[1])

OR

    elif len(names) == 3:
        output_string = "%s, %s and %s like this" % tuple(names)

My hypothesis is that it's more computationally efficient to use names[0] etc, because you don't create a new object in memory for the tuple - is that right?

like image 847
trish_s Avatar asked Jun 27 '26 04:06

trish_s


1 Answers

CPython optimization rules are usually based around how much work you push to the C layer (vs. the bytecode interpreter) and how complex the bytecode instructions are; for low levels of absolute work, the fixed overhead of the interpreter tends to swamp the real work, so intuition derived from experience in lower-level languages just doesn't apply.

It's pretty easy to test though, especially with ipython's %timeit magic (timings done on Python 3.8.5 on Alpine Linux running under WSLv2):

In [2]: %%timeit l = [1, 2, 3]
   ...: tuple(l)
97.6 ns ± 0.303 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [3]: %%timeit l = [1, 2, 3]
   ...: (l[0], l[1], l[2])
104 ns ± 0.561 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [4]: %%timeit l = [1, 2, 3]
   ...: (*l,)
78.1 ns ± 0.628 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [5]: %%timeit l = [1, 2]
   ...: tuple(l)
96 ns ± 0.895 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [6]: %%timeit l = [1, 2]
   ...: (l[0], l[1])
70.1 ns ± 0.571 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [7]: %%timeit l = [1, 2]
   ...: (*l,)
73.4 ns ± 0.736 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

So in fact, the code example you gave made the correct decision for each size (assuming performance is all that counts); at two elements, indexing is faster than the alternatives, at three, converting to tuple in bulk saves enough over repeated indexing to win.

Just for fun, I included an equivalent solution to tuple(l) up there that using the additional unpacking generalizations to build the tuple using dedicated bytecodes, which shows how something as small as replacing a generalized constructor call with dedicated optimized bytecode can make a surprisingly amount of difference in the fixed overhead.

What's extra fun about this example: The faster (*l,) solution actually involves two temporaries; BUILD_TUPLE_UNPACK (the byte code that implements it) shares a code path with BUILD_LIST_UNPACK. Both of them actually build a list, and BUILD_TUPLE_UNPACK just converts it to tuple at the end. So (*l,) is hiding yet another copy to a temporary data structure, but because the specialized bytecode is so much more efficient than built-in lookup plus general purpose constructor code paths, it still wins.

like image 90
ShadowRanger Avatar answered Jun 28 '26 19:06

ShadowRanger



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!