
Identifying duplicates in a list of character vectors in R

Tags: r, duplicates

I have a list of character vectors like this:

my_list <- list(c('a','b','c','d','e'),c('e','f','g'),c('h','i','j'))
names(my_list) <- c("group1","group2","group3")

I want a simple way to test my_list for letters that are duplicated across any of the 3 groups/vectors in the list. For instance, "e" appears in both group1 and group2, so that would be a duplicate. Ideally the test just returns a logical: TRUE if at least one letter appears in 2 or more groups, and FALSE if the letters in each group are unique to that group (which obviously isn't the case in my example).

Thanks so much!

beanboy asked Nov 01 '25 21:11


1 Answer

A logical result can be obtained with:

any(duplicated(unlist(my_list)))
[1] TRUE

As @sindri_baldur correctly pointed out in the comments, this also returns TRUE when a value is repeated within a single group. To count only duplicates across groups, apply unique to each group first:

any(duplicated(unlist(lapply(my_list, unique))))
[1] TRUE
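
The difference between the two checks only shows up when a value repeats inside one group and nowhere else. A minimal sketch, where within_only is a made-up list used purely for illustration:

```r
# Hypothetical input: "a" repeats inside group1, but no value is shared between groups
within_only <- list(group1 = c("a", "a", "b"), group2 = c("c", "d"))

# Without unique(), the repeat inside group1 is reported as a duplicate
flat_check <- any(duplicated(unlist(within_only)))

# With unique() applied per group, only values shared across groups count
across_check <- any(duplicated(unlist(lapply(within_only, unique))))

print(flat_check)    # TRUE
print(across_check)  # FALSE
```

For the question as asked (duplicates across groups only), the second form is the one that matches.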

Another base R alternative is anyDuplicated, which returns the index of the first duplicate, or 0 if there is none:

anyDuplicated(unlist(lapply(my_list, unique))) > 0
[1] TRUE
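
If it also helps to see which values are shared, the same idea can be extended slightly. This is only a sketch: shared_values is a hypothetical helper name, not a base R function and not part of the approach above.

```r
# Hypothetical helper (an illustration only): lists values found in 2 or more groups
shared_values <- function(groups) {
  # Drop repeats inside each group so only cross-group repeats remain
  flat <- unlist(lapply(groups, unique), use.names = FALSE)
  unique(flat[duplicated(flat)])
}

my_list <- list(group1 = c("a", "b", "c", "d", "e"),
                group2 = c("e", "f", "g"),
                group3 = c("h", "i", "j"))

shared <- shared_values(my_list)   # values that occur in more than one group
has_shared <- length(shared) > 0   # logical answer to the original question

print(shared)
print(has_shared)
```

Using length(shared) > 0 gives the same logical as the one-liners above, while keeping the offending values available for inspection.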
Andre Wildberg answered Nov 03 '25 11:11