Removing the first duplicate row and keeping the rest?

Question

I want to remove rows based on the column 'User', but only the first row of each user that appears more than once. Users that appear only once should be kept.

DF:

Result: (A1 and B1 removed)

User  No  
A     2
A     3
A     4
C     1
B     2
D     1

I haven't been able to get this to work with the duplicated function.

Any help would be appreciated! Thanks!

MrFlick · Accepted Answer

If I understand correctly, this should work:

library(dplyr)
# dd is the input data frame: keep every repeat of a User, plus Users that occur only once
dd %>% group_by(User) %>% filter(duplicated(User) | n() == 1)
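
To make the filter condition concrete, here is a minimal sketch of the same idea on sample data. The contents of `dd` are an assumption, reconstructed from the outputs shown elsewhere in this thread (the original input table is not shown), and the trailing `ungroup()` is an optional extra so the result is no longer grouped.

```r
library(dplyr)

# Assumed input: reconstructed from the outputs in this thread, not the asker's actual data
dd <- data.frame(
  User = c("A", "B", "A", "A", "A", "C", "B", "D"),
  No   = c(1, 1, 2, 3, 4, 1, 2, 1),
  stringsAsFactors = FALSE
)

res <- dd %>%
  group_by(User) %>%
  # duplicated(User) is FALSE for the first row of each group and TRUE for the rest;
  # n() == 1 keeps groups that consist of a single row
  filter(duplicated(User) | n() == 1) %>%
  ungroup()

res
```

Without the `n() == 1` part, users C and D would be dropped as well, since their only row is also their first row.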

akrun · Answer

Here is an option using data.table. We convert the 'data.frame' to a 'data.table' with setDT(DF). Grouped by the 'User' column, tail(.SD, -1) selects all rows except the first, where .SD is the Subset of Data.table for the current group. On its own, this would also remove the row when a 'User' group has only a single row. We avoid that with an if/else condition: if the number of rows is greater than 1 (.N > 1), we drop the first row; otherwise we return the group unchanged (.SD).

library(data.table)
setDT(DF)[, if(.N>1) tail(.SD,-1) else .SD , by = User]
#   User No
#1:    A  2
#2:    A  3
#3:    A  4
#4:    B  2
#5:    C  1
#6:    D  1

Or, similar to @MrFlick's dplyr code, we can use a logical condition built from duplicated and .N (the number of rows in the group). We first create a column 'N' that is TRUE for 'User' groups with a single observation (.N == 1). In the next step, we keep the rows where either 'N' is TRUE or 'User' is duplicated. duplicated returns TRUE for repeated values and FALSE for the first occurrence. Finally, the helper column 'N' is removed (N := NULL).

setDT(DF)[DF[, N:=.N==1, by = User][, N|duplicated(User)]][,N:=NULL][]
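
The one-liner above is fairly dense, so here is a sketch that splits it into separate steps. The sample data is an assumption reconstructed from the outputs in this thread, and the names `DT`, `keep` and `res` are only illustrative.

```r
library(data.table)

# Assumed input: reconstructed from the outputs in this thread
DT <- data.table(
  User = c("A", "B", "A", "A", "A", "C", "B", "D"),
  No   = c(1, 1, 2, 3, 4, 1, 2, 1)
)

# Step 1: flag groups with a single row (adds column N to DT by reference)
DT[, N := .N == 1, by = User]

# Step 2: logical vector that is TRUE for single-row groups and for repeated Users
keep <- DT[, N | duplicated(User)]

# Step 3: subset, then drop the helper column from the subset
res <- DT[keep][, N := NULL][]

res
```

Unlike the grouped tail(.SD, -1) version, this keeps the rows in their original order instead of collecting them by group. Note that `DT` itself still carries the helper column 'N' after these steps.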

A base R option is to use ave to build a logical index ('indx2') that is TRUE when the 'User' group has length 1. Combined with duplicated as described above, it subsets the dataset.

indx2 <- with(DF, ave(seq_along(User), User, FUN=length)==1)
DF[duplicated(DF$User)|indx2,]
#   User No
#3    A  2
#4    A  3
#5    A  4
#6    C  1
#7    B  2
#8    D  1
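
If this is needed for more than one column or dataset, the base R approach can be wrapped in a small function. This is only a sketch: the function name `drop_first_of_duplicated` is made up for illustration, and the sample data is an assumption reconstructed from the outputs in this thread.

```r
# Hypothetical helper: drop the first row of every value of `col` that occurs
# more than once, and keep values that occur exactly once
drop_first_of_duplicated <- function(df, col) {
  key <- df[[col]]
  # size of the group each row belongs to
  group_size <- ave(seq_along(key), key, FUN = length)
  df[duplicated(key) | group_size == 1, , drop = FALSE]
}

# Assumed input: reconstructed from the outputs in this thread
DF <- data.frame(
  User = c("A", "B", "A", "A", "A", "C", "B", "D"),
  No   = c(1, 1, 2, 3, 4, 1, 2, 1),
  stringsAsFactors = FALSE
)

res <- drop_first_of_duplicated(DF, "User")
res
```

The original row names (3 to 8 here) are kept, which makes it easy to see which rows were dropped.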

Tags: r