
What is the most efficient way to get count of distinct values in a pandas dataframe?

I have a dataframe as shown below.

    0   1   2
0   A   B   C
1   B   C   B
2   B   D   E
3   C   E   E
4   B   F   A

I need the count of unique values across the entire dataframe, not per column. In the dataframe above, the unique values are A, B, C, D, E, and F, so the result I need is 6.

I'm currently doing this with the pandas squeeze, ravel, and nunique functions, which flatten the entire dataframe into a single Series:

pd.Series(df.squeeze().values.ravel()).nunique(dropna=True)

Please let me know if there is a better way to achieve this.

ds_Abc asked Oct 22 '25 22:10



1 Answer

Use numpy.unique and take the length of the array of unique values it returns:

import numpy as np

out = len(np.unique(df))
print(out)  # 6
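For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample dataframe from the question and compares the numpy approach with two pandas-only alternatives. The variable names and the two alternatives (`df.stack().nunique()` and `pd.unique` on the flattened array) are my own additions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample dataframe from the question (default integer column labels 0, 1, 2).
df = pd.DataFrame([
    ["A", "B", "C"],
    ["B", "C", "B"],
    ["B", "D", "E"],
    ["C", "E", "E"],
    ["B", "F", "A"],
])

# Approach from the answer: np.unique flattens the values and returns the sorted uniques.
count_numpy = len(np.unique(df.to_numpy()))

# Alternative (assumption, not from the answer): stack all columns into one Series.
# nunique() ignores missing values by default.
count_stack = df.stack().nunique()

# Alternative (assumption, not from the answer): pd.unique is hash-based and does not sort.
count_pd_unique = len(pd.unique(df.to_numpy().ravel()))

print(count_numpy, count_stack, count_pd_unique)
```

One thing to watch for: np.unique sorts the values, so a frame that mixes strings with NaN will likely raise a TypeError during the comparison. In that case the stack-based version may be the safer choice, since nunique skips missing values by default.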
jezrael answered Oct 25 '25 11:10



