
Ignore case while using duplicated

I am using the duplicated function in R to remove the duplicate rows in my data frame.

 df:

 Name Rank
  A    1
  a    1
  B    2


df[!duplicated(df),]

 Name Rank
  A    1
  a    1
  B    2

The second row is the same as the first, but it doesn't get removed because duplicated treats "A" and "a" as different values. What is the workaround for this? Thanks.

Jain asked Sep 13 '25 04:09


1 Answer

# If it's okay to change the case
df.lower      <- df
df.lower$Name <- tolower(df$Name)

df.lower[!duplicated(df.lower$Name),]

# If you don't want to change the case
df[!duplicated(df.lower$Name),]

Or simply:

df[!duplicated(tolower(df$Name)),]
  Name Rank
1    A    1
3    B    2
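
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the data frame from the question and applies the one-liner. The column types (character Name, numeric Rank) are an assumption, since the question only shows the printed output.

```r
# Rebuild the example data frame from the question
# (assumed types: character Name, numeric Rank)
df <- data.frame(Name = c("A", "a", "B"),
                 Rank = c(1, 1, 2),
                 stringsAsFactors = FALSE)

# duplicated() compares values exactly, so "A" and "a" count as distinct
# and all three rows survive
exact <- df[!duplicated(df), ]

# Compare on a lower-cased copy of Name, but subset the original df,
# so the surviving rows keep their original case
deduped <- df[!duplicated(tolower(df$Name)), ]
print(deduped)
```

Note that duplicated keeps the first occurrence, so "A" is retained here only because it appears before "a".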

That's for deduping based on Name. For the entire row you could do:

df.lower[!duplicated(df.lower),] # changes the case

or

df[!duplicated(cbind(tolower(df$Name),df$Rank)),] # does not change case
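
If the data frame has several text columns, the same idea can be sketched more generally by building a lower-cased comparison key and subsetting the original with it. The names key and is_text and the extra "b" row are hypothetical, added only for illustration. Unlike cbind, which coerces every column to character, this keeps each column's type.

```r
# Example data with an extra row to show that Rank still matters
df <- data.frame(Name = c("A", "a", "B", "b"),
                 Rank = c(1, 1, 2, 3),
                 stringsAsFactors = FALSE)

# Build a comparison key: lower-case the character/factor columns,
# leave the other columns untouched
key <- df
is_text <- vapply(key, function(col) is.character(col) || is.factor(col),
                  logical(1))
if (any(is_text)) {
  key[is_text] <- lapply(key[is_text], function(col) tolower(as.character(col)))
}

# Rows are dropped only when the whole lower-cased row repeats;
# df itself is not modified, so the original case is preserved
result <- df[!duplicated(key), ]
print(result)
```

Here "a 1" is dropped as a duplicate of "A 1", while "b 3" is kept because its Rank differs from "B 2".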
Hack-R answered Sep 15 '25 18:09