I've been working on a performance-critical application that frequently requires making copies of a 2D list of integers and modifying the copy (I'm implementing the minimax algorithm).
I've noticed there is a huge difference in performance between a copy, and a deepcopy on lists with the same number of elements, and I'd like to understand if my thinking is correct.
To reproduce my problem, run the following code:
import copy
import numpy as np
np.random.seed(0)
lst1 = np.random.randint(100, size=1000 * 1000).tolist()
lst2 = np.random.randint(100, size=(1000, 1000)).tolist()
Now, timing the statements below, you should see timings similar to mine.
%timeit copy.copy(lst1)
5 ms ± 49.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit lst1.copy()
5.47 ms ± 551 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit copy.deepcopy(lst2)
1.61 s ± 112 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Both lst1 and lst2 contain a million integers, yet shallow-copying the flat list is reliably about 300x faster than deep-copying the nested list with the same number of elements. I thought this might be because making deep copies of nested lists requires some slow recursive implementation, so I tried
%timeit copy.deepcopy(lst1)
1.43 s ± 90.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
And the timing still shows a massive slowdown. I've checked the docs, but they don't offer much of an explanation. From the timings, though, I suspect that deepcopy is copying each int as well, creating new integer objects, which seems like a wasteful thing to do.
Am I right in my thinking here? What is deepcopy doing here that list.copy and shallow copy don't?
I've seen "deepcopy() is extremely slow", but it seems that question is asking for an alternative rather than an explanation (it wasn't clear to me).
deepcopy isn't copying the ints. The copy module treats ints (like other atomic, immutable types) as atomic, so its copier for them just returns the same object untouched.
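You can check this yourself with an identity test (large values are used here just so small-int caching can't be mistaken for the cause):

import copy

nested = [[10**6 + i for i in range(3)] for _ in range(2)]
dup = copy.deepcopy(nested)

print(dup is nested)              # False -- the outer list is a new object
print(dup[0] is nested[0])        # False -- the inner lists are new objects
print(dup[0][0] is nested[0][0])  # True  -- the int objects themselves are shared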
deepcopy is slow because it needs to handle the full complexity of a deep copy, even if that turns out to be unnecessary. That includes dispatching to the appropriate copier for every object it finds, even if the copier turns out to basically just be lambda x: x. That includes maintaining a memo dict and keeping track of every object copied, to handle duplicate references to the same objects, even if there are none. That includes special copy handling for data structures like lists and dicts, so it doesn't go into an infinite recursion when trying to copy a data structure with recursive references.
All of that has to be done no matter whether it pays off. All of it is expensive.
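To make that overhead concrete, here is a heavily simplified sketch of the shape of the work a deepcopy-like routine does per object. This is not the real implementation in copy.py (which also handles __deepcopy__ hooks, reductors, and many more types); the point is that a dispatch decision, a memo lookup, and a recursive call happen for every single element, including each of the million ints:

def deepcopy_sketch(obj, memo=None):
    # Illustrative sketch only -- the real copy.deepcopy does much more.
    if memo is None:
        memo = {}

    # Memo check: have we already copied this exact object?
    key = id(obj)
    if key in memo:
        return memo[key]

    # Dispatch on type: atomic immutables are returned as-is...
    if isinstance(obj, (bool, int, float, str, bytes, type(None))):
        result = obj
    # ...containers get a fresh object plus a recursive walk.
    elif isinstance(obj, list):
        result = []
        memo[key] = result  # register before recursing, so cycles terminate
        for item in obj:
            result.append(deepcopy_sketch(item, memo))
    elif isinstance(obj, dict):
        result = {}
        memo[key] = result
        for k, v in obj.items():
            result[deepcopy_sketch(k, memo)] = deepcopy_sketch(v, memo)
    else:
        raise TypeError(f"sketch does not handle {type(obj).__name__}")

    memo[key] = result
    return result

Even in this stripped-down form, every leaf int costs a function call, an id() lookup, and a type dispatch, which is exactly the bookkeeping a plain list.copy never pays for.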
Also, deepcopy is pure Python, which doesn't help. Compare it with pickle.loads(pickle.dumps(whatever)), which performs a very similar job: pickle wins handily thanks to its C implementation. (On Python 2, replace pickle with cPickle.) pickle still loses hard to an implementation that takes advantage of the known structure of the input, though:
In [15]: x = [[0]*1000 for i in range(1000)]
In [16]: %timeit copy.deepcopy(x)
1.05 s ± 5.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [17]: %timeit pickle.loads(pickle.dumps(x))
78 ms ± 4.03 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [18]: %timeit [l[:] for l in x]
4.56 ms ± 108 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
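Note that for the question's case (a two-level list of plain ints), the structure-aware copy above is effectively a deep copy, because the leaves are immutable; mutating the copy never touches the original. A quick sanity check:

board = [[0] * 3 for _ in range(3)]
child = [row[:] for row in board]   # copies the outer list and each row
child[0][0] = 1                     # modify the copy only
print(board[0][0], child[0][0])     # prints "0 1" -- the original is untouched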