
Remove duplicate columns in a list of data frames in R

I have a list of many data frames. Each data frame contains duplicate columns, and I would like to keep only the unique columns in each one. I have tried several approaches, including the code below, but I keep getting errors. My current code is shown first, followed by the structure of the first data frame in my list. I appreciate any help.

x  <- lapply(dataFiles, function(x){
  for(i in 1:length(colnames(dataFiles)))
  dataFiles[[!duplicated(dataFiles[[i]])]]
}
)



str(dataFiles[[1]])
'data.frame':   20381 obs. of  10 variables:
 $ FILEID    : chr  "ACSSF" "ACSSF" "ACSSF" "ACSSF" ...
 $ FILETYPE  : num  2.01e+08 2.01e+08 2.01e+08 2.01e+08 2.01e+08 ...
 $ STUSAB    : chr  "ny" "ny" "ny" "ny" ...
 $ CHARITER  : int  0 0 0 0 0 0 0 0 0 0 ...
 $ SEQUENCE  : int  1 1 1 1 1 1 1 1 1 1 ...
 $ LOGRECNO  : int  3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 ...
 $ B00001_001: int  212 215 278 246 235 NA 225 522 213 262 ...
 $ B00002_001: int  108 124 126 105 122 NA 108 105 104 140 ...
 $ LOGRECNO  : int  3391 3392 3393 3394 3395 3396 3397 3398 3399 3400 ...
 $ GEOID     : chr  "14000US36001000100" "14000US36001000200" "14000US36001000300" "14000US36001000401" ...
user3067851 asked Dec 10 '25 00:12



1 Answer

Here is a simple example:

tmp <- data.frame(seq(10), seq(10), rnorm(10))
colnames(tmp) <- c("A","A","B")

l <- list(tmp, tmp)

# keep only the first column for each column name
lapply(l, function(x) x[,!duplicated(colnames(x))])

Or, as noted by @agstudy, you could use unique, which selects the first column matching each distinct name:

lapply(l, function(x) x[,unique(colnames(x))])
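
To connect this back to the question, here is a minimal sketch on a small stand-in for dataFiles. The helper make_df and its three-row data are hypothetical and only mimic the repeated LOGRECNO column from the str() output. It assumes "duplicate" means a repeated column name, as in the answer above, and adds drop = FALSE so the result stays a data frame even when only one column is left.

```r
# Hypothetical stand-in for dataFiles: each data frame repeats LOGRECNO.
make_df <- function() {
  data.frame(
    LOGRECNO    = 3391:3393,
    B00001_001  = c(212L, 215L, 278L),
    LOGRECNO    = 3391:3393,
    check.names = FALSE  # keep the duplicated name instead of renaming it
  )
}
dataFiles <- list(make_df(), make_df())

# Keep the first occurrence of each column name in every data frame.
# drop = FALSE prevents a single surviving column from collapsing to a vector.
deduped <- lapply(dataFiles, function(df) {
  df[, !duplicated(colnames(df)), drop = FALSE]
})

names(deduped[[1]])
```

Note that this compares column names only. Two columns holding the same values under different names would both be kept.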
cdeterman answered Dec 12 '25 16:12



