Task:
Develop a clean_list (list_to_clean) function,
which takes 1 argument - a list of any values (strings, integers, and floats) of any length,
and returns a list that has the same values but does not have duplicate items. This means that if there is a value in the original list in several instances, the first "instance" of the value remains in place, and the second, third, and so on are deleted.
Example:
Function call: clean_list ([32, 32.1, 32.0, -32, 32, '32'])
Returns: [32, 32.1, 32.0, -32, '32']
My code:
def clean_list(list_to_clean):
    no_dubl_lst = [value for _, value in set((type(x), x) for x in list_to_clean)]
    return no_dubl_lst

print(clean_list([32, 32.1, 32.0, -32, 32, '32']))
Result:
[32.1, 32, -32, 32.0, '32']
But how can I restore the original order?
There are two concerns here, so I'll address both in this answer.
Removing duplicates from a list suggests building an intermediate set, since that is the fastest method. A set treats an element as already present if it is equal to an element it already contains.
In your case, not just the value but also the type has to be equal.
So why not construct an intermediate set of tuples (value, type)?
unique_list = [v for v,t in {(v,type(v)) for v in orig_list}]
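To see why the type has to be part of the key, here is a minimal sketch using the list from the question. The names values, plain and typed are only for illustration.

```python
values = [32, 32.1, 32.0, -32, 32, '32']

# 32 and 32.0 compare equal (and hash equal), so a plain set keeps only one of them.
plain = set(values)
print(len(plain))  # 4

# Pairing each value with its type keeps 32 (int) and 32.0 (float) apart.
typed = {(v, type(v)) for v in values}
print(len(typed))  # 5
```

Note that a set has no defined order, so this alone does not give you the original order back.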
To keep the original order, use an "ordered set" container, as described in "Does Python have an ordered set?". For example:
since 3.7 (and CPython 3.6 where this was an implementation detail), regular dicts preserve insertion order:
unique_list = [v for v,t in dict.fromkeys((v,type(v)) for v in orig_list)]
for all versions, use collections.OrderedDict (it is still present in 3.6+ because it has additional methods):
import collections
unique_list = [v for v,t in collections.OrderedDict.fromkeys((v,type(v)) for v in orig_list)]
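Putting the two points together, the function from the question could look roughly like this. It is a sketch that assumes Python 3.7+ (for ordered dicts) and that every item in the list is hashable.

```python
def clean_list(list_to_clean):
    # Key each item by (value, type) so that 32 and 32.0 stay distinct.
    # dict.fromkeys keeps the first occurrence of each key, in insertion order.
    keyed = dict.fromkeys((v, type(v)) for v in list_to_clean)
    return [v for v, _ in keyed]


print(clean_list([32, 32.1, 32.0, -32, 32, '32']))
# [32, 32.1, 32.0, -32, '32']
```

On older versions, swapping dict for collections.OrderedDict should give the same result.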
For reference, here are timeit results on my machine (3.7.4, win64), compared with the other answers as of this writing:
In [24]: l=[random.choice((int,float,lambda v:str(int(v))))(random.random()*1000) for _ in range(100000)]
In [26]: timeit dict_fromkeys(l) #mine
38.6 ms ± 179 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [34]: timeit ordereddict_fromkeys(l) #mine with OrderedDict
53.3 ms ± 233 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [25]: timeit build_with_filter(l) #Ch3steR's O(n)
48.7 ms ± 214 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [28]: timeit dict_with_none(l) #Patrick Artner's
46.8 ms ± 377 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [30]: timeit listcompr_side_effect(l) #CDJB's
55.5 ms ± 801 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
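The function bodies are not shown in the session above. If you want to reproduce the two measurements marked "mine", a rough sketch with the standard timeit module could look like this. The bodies are my assumption (they simply wrap the one-liners from this answer), and the list is smaller than in the session to keep the run short, so the absolute numbers will differ.

```python
import collections
import random
import timeit


def dict_fromkeys(items):
    # Assumed body: the plain-dict one-liner from the answer.
    return [v for v, t in dict.fromkeys((v, type(v)) for v in items)]


def ordereddict_fromkeys(items):
    # Assumed body: the OrderedDict one-liner from the answer.
    return [v for v, t in collections.OrderedDict.fromkeys((v, type(v)) for v in items)]


# Same generator expression as in the session, but with 10000 items instead of 100000.
l = [random.choice((int, float, lambda v: str(int(v))))(random.random() * 1000)
     for _ in range(10000)]

runs = 10
for func in (dict_fromkeys, ordereddict_fromkeys):
    seconds = timeit.timeit(lambda: func(l), number=runs)
    print(f"{func.__name__}: {seconds / runs * 1000:.2f} ms per call")
```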