
lapply and for loop to run a function through a list of data.frames in R

I have a list of data frames and I'd like to run cor.test on each of them. Each data frame has 8 columns, and I would like to run cor.test for each of the first 7 columns against the 8th column.

I first set up the lists for storing the results:

estimates = list()
pvalues = list()

Then here's the loop, combined with lapply:

for (i in 1:7){
  corr <- lapply(datalist, function(x) {cor.test(x[,i], x[,8], alternative="two.sided", method="spearman", exact=FALSE, continuity=TRUE)}) 
  estimates= corr$estimate
  pvalues= corr$p.value
}

It ran without any errors, but estimates is NULL.

Which part of this went wrong? I have run a for loop over cor.test before, and I have run it with lapply, but I have never put the two together. Is there a way to fix this, or an alternative? Thank you.

Molly_K asked Nov 21 '25 10:11


2 Answers

We can use sapply inside lapply. Here is an example on mtcars, where cor.test is performed for every other column against the 8th column (vs).

lst <- list(mtcars, mtcars) 

lapply(lst, function(x) t(sapply(x[-8], function(y) {
  # y is one of the remaining columns; x[[8]] is the comparison column
  val <- cor.test(y, x[[8]], alternative = "two.sided",
                  method = "spearman", exact = FALSE, continuity = TRUE)
  c(val$estimate, pval = val$p.value)
})))

#[[1]]
#            rho         pval
#mpg   0.7065968 6.176953e-06
#cyl  -0.8137890 1.520674e-08
#disp -0.7236643 2.906504e-06
#hp   -0.7515934 7.247490e-07
#drat  0.4474575 1.021422e-02
#wt   -0.5870162 4.163577e-04
#qsec  0.7915715 6.843882e-08
#am    0.1683451 3.566025e-01
#gear  0.2826617 1.168159e-01
#carb -0.6336948 9.977275e-05

#[[2]]
#            rho         pval
#mpg   0.7065968 6.176953e-06
#cyl  -0.8137890 1.520674e-08
#.....

This returns a list of two-column matrices, one per data frame, holding the estimate (rho) and the p-value for each column.
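As for why the original loop gives NULL: lapply returns a list of cor.test results, so `$estimate` is looked up on the outer list, which has no element of that name. The loop also overwrites estimates and pvalues on every iteration. Below is a minimal sketch of one way to keep the for loop; `datalist` here is a hypothetical stand-in built from mtcars, since the original data is not shown.

```r
# Hypothetical stand-in for the asker's `datalist`: two copies of the
# first 8 columns of mtcars, so column 8 (vs) is the comparison column.
datalist <- list(a = mtcars[, 1:8], b = mtcars[, 1:8])

# lapply() returns a list of "htest" objects, one per data frame.
corr <- lapply(datalist, function(x) {
  cor.test(x[, 1], x[, 8], alternative = "two.sided",
           method = "spearman", exact = FALSE, continuity = TRUE)
})

# `$` on the outer list looks for an element named "estimate".
# There is none, so this is NULL and no error is raised.
is.null(corr$estimate)

# Pull the field out of each element instead.
sapply(corr, function(ct) ct$estimate)

# Keep the for loop, but store each column's results by index so that
# later iterations do not overwrite earlier ones.
estimates <- vector("list", 7)
pvalues <- vector("list", 7)
for (i in 1:7) {
  corr <- lapply(datalist, function(x) {
    cor.test(x[, i], x[, 8], alternative = "two.sided",
             method = "spearman", exact = FALSE, continuity = TRUE)
  })
  estimates[[i]] <- sapply(corr, function(ct) unname(ct$estimate))
  pvalues[[i]] <- sapply(corr, function(ct) ct$p.value)
}
names(estimates) <- names(pvalues) <- names(datalist[[1]])[1:7]
```

After this, estimates$mpg holds one rho per data frame in datalist, and pvalues$mpg the matching p-values.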

Ronak Shah answered Nov 24 '25 00:11


Disclaimer: This answer uses the development version of manymodelr, a package I wrote.

EDIT: You can map it over your list of data frames with Map or lapply, for instance:

lst <- list(mtcars, mtcars) # Line copied and pasted from @Ronak Shah's answer
Map(function(x) manymodelr::get_var_corr(x, "mpg", get_all = TRUE,
                                         alternative = "two.sided",
                                         method = "spearman",
                                         continuity = TRUE, exact = FALSE), lst)

For a single data.frame object, we can use get_var_corr:

manymodelr::get_var_corr(mtcars, "mpg", get_all = TRUE,
                         alternative = "two.sided",
                         method = "spearman",
                         continuity = TRUE, exact = FALSE)
   #    Comparison_Var Other_Var      p.value Correlation
   # 1             mpg       cyl 4.962301e-13  -0.9108013
   # 2             mpg      disp 6.731078e-13  -0.9088824
   # 3             mpg        hp 5.330559e-12  -0.8946646
   # 4             mpg      drat 5.369227e-05   0.6514555
   # 5             mpg        wt 1.553261e-11  -0.8864220
   # 6             mpg      qsec 7.042244e-03   0.4669358
   # 7             mpg        vs 6.176953e-06   0.7065968
   # 8             mpg        am 8.139885e-04   0.5620057
   # 9             mpg      gear 1.325942e-03   0.5427816
   # 10            mpg      carb 4.385340e-05  -0.6574976

NelsonGon answered Nov 23 '25 23:11