I produced a large data frame (1700+obs,159 variables) with a function that collects info from a website. Usually, the function finds numeric values for some columns, and thus they're numeric. Sometimes, however, it finds some text, and converts the whole column to text.
I have one df whose column classes are correct, and I would like to "paste" those classes to a new, incorrect df.
Say, for example:
dfCorrect<-data.frame(x=c(1,2,3,4),y=as.factor(c("a","b","c","d")),z=c("bar","foo","dat","dot"),stringsAsFactors = F)
str(dfCorrect)
'data.frame':   4 obs. of  3 variables:
 $ x: num  1 2 3 4
 $ y: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
 $ z: chr  "bar" "foo" "dat" "dot"
## now I have my "wrong" data frame:
dfWrong<-as.data.frame(sapply(dfCorrect,paste,sep=""))
str(dfWrong)
'data.frame':   4 obs. of  3 variables:
 $ x: Factor w/ 4 levels "1","2","3","4": 1 2 3 4
 $ y: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
 $ z: Factor w/ 4 levels "bar","dat","dot",..: 1 4 2 3
I wanted to copy the classes of each column of dfCorrect into dfWrong, but haven't found how to do it properly.
I've tested:
dfWrong1<-dfWrong
dfWrong1[0,]<-dfCorrect[0,]
str(dfWrong1) ## bad result
'data.frame':   4 obs. of  3 variables:
 $ x: Factor w/ 4 levels "1","2","3","4": 1 2 3 4
 $ y: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
 $ z: Factor w/ 4 levels "bar","dat","dot",..: 1 4 2 3
dfWrong1<-dfWrong
str(dfWrong1)<-str(dfCorrect)
'data.frame':   4 obs. of  3 variables:
 $ x: num  1 2 3 4
 $ y: Factor w/ 4 levels "a","b","c","d": 1 2 3 4
 $ z: chr  "bar" "foo" "dat" "dot"
Error in str(dfWrong1) <- str(dfCorrect) : 
  could not find function "str<-"
With this small matrix I could go by hand, but what about larger ones? Is there a way to "copy" the classes from one df to another without having to know the individual classes (and indexes) of each column?
Expected final result (after properly "pasting" classes):
all.equal(sapply(dfCorrect,class),sapply(dfWrong,class))
[1] TRUE
You could try this:
dfWrong[] <- mapply(FUN = as,dfWrong,sapply(dfCorrect,class),SIMPLIFY = FALSE)
...although my first instinct is to agree with Oliver that if it were me I'd try to ensure the correct class at the point you're reading the data.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With