
Finding the mode of a series consisting of list elements in Pandas

I am working with a pd.Series where each entry is a list. I would like to find the mode of the series, that is, the most common list in this series. I have tried using both pandas.Series.value_counts and pandas.Series.mode. However, both of these approaches lead to the following exception being raised:

TypeError: unhashable type: 'list'

Here is a simple example of such a series:

pd.Series([[1,2,3], [4,5,6], [1,2,3]])

I am looking for a function that will return [1,2,3].

splinter asked Dec 14 '25 02:12


2 Answers

You need to convert each list to a tuple (tuples are hashable), then use mode:

pd.Series([[1,2,3], [4,5,6], [1,2,3]]).apply(tuple).mode().apply(list)
Out[192]: 
0    [1, 2, 3]
dtype: object

A slight improvement, which returns a plain list instead of a Series:

list(pd.Series([[1,2,3], [4,5,6], [1,2,3]]).apply(tuple).mode().iloc[0])
Out[210]: [1, 2, 3]

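Since calling apply twice is ugly, you can instead compare the string form of each list and pick the first row that matches the mode: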

s=pd.Series([[1,2,3], [4,5,6], [1,2,3]])
s[s.astype(str)==s.astype(str).mode()[0]].iloc[0]
Out[205]: [1, 2, 3]
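One thing to keep in mind when choosing between the two approaches: string keys and tuple keys do not always group rows the same way. The sketch below uses plain Python only, and assumes that astype(str) produces the same text as str() for each list; the variable names are made up for illustration.

```python
rows = [[1, 2], [1.0, 2.0], [3, 4]]

# String keys: "[1, 2]" and "[1.0, 2.0]" are different strings,
# so the first two rows are counted separately.
str_keys = [str(row) for row in rows]

# Tuple keys: (1, 2) == (1.0, 2.0) and both hash to the same value,
# so the first two rows collapse into a single key.
tuple_keys = {tuple(row) for row in rows}

print(len(set(str_keys)))  # 3
print(len(tuple_keys))     # 2
```

If the lists only ever hold values of one type, the two approaches should agree.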
BENY answered Dec 16 '25 16:12


Lists are not hashable, so you will need to transform your Series of lists to a Series of tuples.

Once you do that, you can use a Counter to quickly and efficiently generate a multi-set of tuples, and then use Counter.most_common to extract the most common element (AKA, the mode).

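from collections import Counter

import pandas as pd

s = pd.Series([[1,2,3], [4,5,6], [1,2,3]])

# Count each list as a hashable tuple, then convert the most common one back to a list
c = Counter(tuple(lst) for lst in s)
list(c.most_common(1)[0][0])
[1, 2, 3]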
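Counter.most_common(1) returns only one entry even when several lists share the top count. If ties matter, something like the following might work. It is a minimal sketch, not part of pandas or collections: list_modes is a hypothetical helper name, and it accepts any iterable of lists, so a pd.Series of lists could be passed in the same way as the plain list used here.

```python
from collections import Counter


def list_modes(rows):
    """Return every most-frequent list in `rows` (hypothetical helper).

    `rows` can be any iterable of lists, for example a pd.Series of lists.
    """
    # Lists are unhashable, so count them as tuples.
    counts = Counter(tuple(row) for row in rows)
    if not counts:
        return []
    top = max(counts.values())
    # Counter keeps first-seen order, so tied lists come back in input order.
    return [list(key) for key, n in counts.items() if n == top]


data = [[1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6], [7]]
print(list_modes(data))  # [[1, 2, 3], [4, 5, 6]]
```

For the series in the question, this returns [[1, 2, 3]], a one-element list holding the single mode.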
cs95 answered Dec 16 '25 18:12


