df
[,1] [,2]
[1,] "a" "b"
[2,] "a" "c"
[3,] "a" "d"
[4,] "b" "a"
[5,] "b" "c"
[6,] "b" "d"
[7,] "c" "a"
[8,] "c" "b"
[9,] "c" "d"
Assume I have a data.frame like df above, and I want to remove duplicates across the columns, i.e. rows that contain the same two values in either order should be treated as duplicates.
df1
[,1] [,2]
[1,] "a" "b"
[2,] "a" "c"
[3,] "a" "d"
[5,] "b" "c"
[6,] "b" "d"
[9,] "c" "d"
This (df1) is what I want to end up with.
We can sort the elements in each row with apply, transpose the output, apply duplicated to get a logical vector marking the repeated rows, and use its negation to subset the rows:
df[!duplicated(t(apply(df[, 1:2], 1, sort))),]
# [,1] [,2]
#[1,] "a" "b"
#[2,] "a" "c"
#[3,] "a" "d"
#[4,] "b" "c"
#[5,] "b" "d"
#[6,] "c" "d"
Another option is pmin/pmax, which put each pair into a fixed order column-wise so that duplicated can be applied directly:
df[!duplicated(cbind(pmin(df[,1], df[,2]), pmax(df[,1], df[,2]))),]
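pmin and pmax work element-wise and also compare character vectors, so the cbind(pmin(...), pmax(...)) call builds the same alphabetically ordered pairs as the sorted matrix above, and the subsetting returns the same six rows. A minimal illustration of the element-wise comparison:
pmin(c("b", "c"), c("a", "d"))
#[1] "a" "c"
pmax(c("b", "c"), c("a", "d"))
#[1] "b" "d"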
df <- structure(c("a", "a", "a", "b", "b", "b", "c", "c", "c", "b",
"c", "d", "a", "c", "d", "a", "b", "d"), .Dim = c(9L, 2L))