
Python - Big-O of del my_list[:]?

Tags:

python

What is the big O of del my_list[:]? This command deletes all elements in the list. My understanding is that it is O(n), where n is the length of the list.

Therefore the big O of this code would be O(n^2), correct?

Note this is not for school, but rather for my understanding while I practice for interviews.

from copy import deepcopy

class Solution:
    # @param A : list of integers
    # @return an integer
    def removeDuplicates(self, A):
        copy_array = deepcopy(A)
        del A[:]

        for j in copy_array:
            if j not in A:
                A.append(j)

        return len(A)
asked Feb 02 '26 by data_pi


2 Answers

The del doesn't impact the big-O here. The loop runs n times, and the j in A test is itself O(n), so the nested loop is O(n**2). The del is O(n), but it's not part of the loop, and since it's a lower order of work, it's dominated and ignored.
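For intuition (a rough benchmark sketch, not part of the original answer; dedup_quadratic is a hypothetical stand-in with the same structure as the question's loop), timing a few input sizes shows the quadratic growth:

import timeit

def dedup_quadratic(A):
    # Same shape as the question's loop: "j not in out" is an O(len(out))
    # scan, executed once per element, giving O(n**2) overall.
    out = []
    for j in A:
        if j not in out:
            out.append(j)
    return out

for n in (1_000, 2_000, 4_000):
    data = list(range(n))  # worst case: no duplicates, every scan runs long
    t = timeit.timeit(lambda: dedup_quadratic(data), number=5)
    print(n, round(t, 3))  # doubling n roughly quadruples the time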

Side note: an O(n) solution for this would be to use collections.OrderedDict to dedupe while preserving order, making the body of the method just:

A[:] = collections.OrderedDict.fromkeys(A)
return len(A)
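Filled out as a complete, runnable method (a minimal sketch; the Solution/removeDuplicates scaffold comes from the question, with the needed import added at the top):

import collections

class Solution:
    # @param A : list of integers
    # @return an integer
    def removeDuplicates(self, A):
        # fromkeys builds one dict entry per distinct element (first
        # occurrence wins), so order is preserved; the whole pass is O(n).
        A[:] = collections.OrderedDict.fromkeys(A)
        return len(A)

A = [1, 2, 2, 3, 1]
print(Solution().removeDuplicates(A))  # 3
print(A)                               # [1, 2, 3]

On Python 3.7+ the built-in dict also preserves insertion order, so dict.fromkeys(A) works the same way.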
answered Feb 05 '26 by ShadowRanger


As noted in the comment and other answers, in your case it doesn't matter whether the deletion is O(n): it is a once-only operation, and your loop is already O(n^2).

Still, your question about del A[:] is interesting and worth addressing:

According to the Python wiki's page on time complexity, deleting a slice from a list is indeed O(n).

However, since lists are represented internally as arrays, and del A[:] essentially discards the entire array, and the old array can be garbage-collected at some later point, it seems possible that an implementation could make the del statement itself O(1) and delay the actual cleanup of the old array. The algorithmic cost would still be O(n), but deferring some of the work could move it out of a computational "hot spot" in your program.
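As a concrete aside (a small illustration, not from the original answer): del A[:] empties the existing list object in place (equivalent to A.clear()), which is why the discussion is about what happens to the old internal array rather than about rebinding a name:

A = [1, 2, 3]
B = A          # second reference to the same list object
del A[:]       # clears the shared list in place
print(A, B)    # [] [] -- B sees the change

A = [1, 2, 3]
B = A
A = []         # rebinds the name only; the old list survives via B
print(A, B)    # [] [1, 2, 3]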

Per the first comment below, though, the main implementation of Python, CPython, uses reference-counting; this means that del must actually decrement the reference count for each item contained in the list, which is O(n).
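This is easy to observe in CPython (a sketch, not from the original answer; sys.getrefcount reports one extra temporary reference from the call itself, which cancels out in the subtraction):

import sys

x = object()
A = [x] * 3                  # the list holds three references to x
before = sys.getrefcount(x)
del A[:]                     # CPython must decrement each one: O(n)
after = sys.getrefcount(x)
print(before - after)        # 3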

PyPy, however, has configurable garbage collection, and all of the supported collectors seem to be based on a generational-copying scheme, which does permit the optimization. Moreover, in a copying scheme, the memory containing the old objects can typically just be ignored rather than properly de-allocated, so the deferred cost might even be free (in the sense that the del statement makes the next generational copy cheaper, since the original array no longer needs to be copied). The data allocated for the "ignored" objects may still be cleared (and indeed the PyPy link indicates that its generational GC does this), but since the entire old memory space is cleared, it may not matter how much of that space is actually populated.

NOTE, however, that objects that require special cleanup operations, i.e. objects that implement __del__, are a special case and cannot simply be "ignored" during the generational copy, so de-allocating an array of objects with __del__ must always be O(n). The linked page has a few details on how these objects are handled.
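In CPython this is directly visible (a sketch, not from the original answer; CPython's reference counting runs __del__ as soon as the list drops the last reference to each element):

class Noisy:
    def __init__(self, n):
        self.n = n
    def __del__(self):
        print("cleaning up", self.n)

A = [Noisy(i) for i in range(3)]
del A[:]   # each element's refcount hits zero; __del__ runs per element,
           # printing "cleaning up 0" through "cleaning up 2"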

answered Feb 05 '26 by Kyle Strand