First, I have a DataFrame; when I group it by a column, will that remove duplicate values? Second, how do I know which groups have duplicate values? (I tried to find out how to tell which columns of a df have duplicate values, but couldn't find anything; existing answers only cover whether each individual element is duplicated or not.)
For example, I have a df like this:

   A  B  C
1  1  2  3
2  1  4  3
3  2  2  2
4  2  3  4
5  2  2  3
after groupby('A'):

A  B  C
1  2  3
   4  3
2  2  2
   3  4
   2  3
I want to know how many A groups have B duplicated, and how many A groups have C duplicated.
result:

   B  C
   1  1

or maybe better, calculate the percentage:

B : 50%
C : 50%
thanks
You could use a lambda function inside GroupBy.agg to check whether the number of unique values differs from the number of values in the group. Series.nunique gives the number of unique values and Series.size gives the number of values in a group.
df.groupby('A').agg(lambda x: x.size != x.nunique())
#        B      C
# A
# 1  False   True
# 2   True  False
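The boolean result can be turned directly into the percentages the question asks for: the column-wise mean of a boolean frame is the fraction of groups that contain a duplicate. A self-contained sketch (note that with the sample data, C is duplicated in only one of the two groups, so both columns come out at 50%):

```python
import pandas as pd

# sample data from the question
df = pd.DataFrame({'A': [1, 1, 2, 2, 2],
                   'B': [2, 4, 2, 3, 2],
                   'C': [3, 3, 2, 4, 3]},
                  index=[1, 2, 3, 4, 5])

# True where a group contains duplicated values in that column
has_dup = df.groupby('A').agg(lambda x: x.size != x.nunique())

# mean of booleans = fraction of groups with duplicates
pct = has_dup.mean().mul(100)
print(pct)
# B    50.0
# C    50.0
```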
Let us try Series.duplicated instead:

out = df.groupby('A').agg(lambda x: x.duplicated().any())
#        B      C
# A
# 1  False   True
# 2   True  False
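If the lambda is slow over many groups, a vectorized variant (a sketch comparing per-group unique counts against group sizes, using GroupBy.nunique and GroupBy.size) produces the same boolean table:

```python
import pandas as pd

# sample data from the question
df = pd.DataFrame({'A': [1, 1, 2, 2, 2],
                   'B': [2, 4, 2, 3, 2],
                   'C': [3, 3, 2, 4, 3]})

g = df.groupby('A')
# a column has duplicates in a group when its unique count
# is smaller than the group's row count
out = g.nunique().ne(g.size(), axis=0)
print(out)
```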