
lapply and apply for each component and element of a list R

I have this list:

lst <- list(a = c(2.5, 9.8, 5.0, 6.7, 6.5, 5.2, 34.4, 4.2, 39.5, 1.3, 0.0, 0.0,
                  4.1, 0.0, 0.0, 25.5, 196.5, 0.0, 104.2, 0.0, 0.0, 0.0, 0.0, 0.0),
            b = c(147.4, 122.9, 110.2, 142.3))

I would like to calculate a z-score for every value in each element of the list (a and b), defined as (x[i] - mean(x)) / sd(x), where x is all the values of one list element taken together and x[i] is a single value within it. I tried with lapply:

lapply(lst,function (x) as.data.frame(apply(x,2, function(y)- lapply(lst,mean)/lapply(lst,sd))))

but it throws an error. Maybe a for loop would work:

lst.new <- vector("list",1)

for (i in 1:length(lst)){
  for (j in 1:dim(data.frame(lst[i]))[1]){
    res[j] <- (as.numeric(unlist(lst[i]))[j]-mean(as.numeric(unlist(lst[i]))))/
      sd(as.numeric(unlist(lst[i])))
    lst.new[[i]] <- res
  }
}

but the result is strange (sure I'm wrong in the lst.new output):

[[1]]
 [1] -0.3635464 -0.1982809 -0.3069486 -0.2684621 -0.2729899 -0.3024208  0.3586413 -0.3250599  0.4741007 -0.3907133
[11] -0.4201442 -0.4201442 -0.3273238 -0.4201442 -0.4201442  0.1571532  4.0284412 -0.4201442  1.9388512 -0.4201442
[21] -0.4201442 -0.4201442 -0.4201442 -0.4201442

[[2]]
 [1]  0.9671130 -0.4517055 -1.1871746  0.6717671 -0.2729899 -0.3024208  0.3586413 -0.3250599  0.4741007 -0.3907133
[11] -0.4201442 -0.4201442 -0.3273238 -0.4201442 -0.4201442  0.1571532  4.0284412 -0.4201442  1.9388512 -0.4201442
[21] -0.4201442 -0.4201442 -0.4201442 -0.4201442

The expected result can be a list, or a data frame-like layout whose columns have different lengths:

 a        b
 -0.36    0.967113
 -0.19    -0.45
 [...]    [...]

and so on.

P.S.:
 -0.36 == (2.5 - mean(unlist(lst[1]))) / sd(unlist(lst[1]))
 0.967113 == (147.4 - mean(unlist(lst[2]))) / sd(unlist(lst[2]))

I would prefer a solution that uses lapply (or another function from the apply family).

skylobo asked Dec 08 '25 11:12


1 Answer

Just for completeness' sake: even without the scale function that @akrun pointed out, your code should have been:

lapply(lst, function(x) (x - mean(x)) / sd(x))

Nesting apply and lapply inside the outer lapply is what breaks your version: mean and sd need to be computed once per vector, and the subtraction and division then apply to every value of that vector.

Let's work through it step by step. lapply takes lst and breaks it down into its elements. Each element in turn is passed as the argument to your anonymous function, so the function receives a vector of numbers. Then R's vectorization does the rest: mean(x) and sd(x) are each a single number computed from the whole vector, and for every value of the vector we subtract that mean and divide the difference by that sd. The parentheses around x - mean(x) matter, because division binds tighter than subtraction.
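As a rough check of the steps above, here is a minimal sketch that runs the one-line lapply on the list from the question and compares it with scale. The names z and z_scale are only illustrative, and wrapping scale in as.vector is an assumption about the output you want (scale returns a one-column matrix with extra attributes).

```r
lst <- list(
  a = c(2.5, 9.8, 5.0, 6.7, 6.5, 5.2, 34.4, 4.2, 39.5, 1.3, 0.0, 0.0,
        4.1, 0.0, 0.0, 25.5, 196.5, 0.0, 104.2, 0.0, 0.0, 0.0, 0.0, 0.0),
  b = c(147.4, 122.9, 110.2, 142.3)
)

# One call of the anonymous function per list element; inside it,
# mean(x) and sd(x) are single numbers recycled over the whole vector.
z <- lapply(lst, function(x) (x - mean(x)) / sd(x))

# Same idea with scale(); as.vector() drops the matrix shape and attributes.
z_scale <- lapply(lst, function(x) as.vector(scale(x)))

# The two approaches should agree up to floating point error.
all.equal(z, z_scale)

# First values of each element, to compare with the P.S. in the question.
round(z$a[1], 2)
round(z$b[1], 6)
```

The result stays a list, so a and b keep their different lengths (24 and 4 values) without any padding.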

Compare that with what happens in your code:

lapply(lst,function (x) as.data.frame(apply(x,2, function(y)- lapply(lst,mean)/lapply(lst,sd))))

So the first lapply breaks lst and sends the vectors one at a time to your function.

The function then tries to split that vector by columns (apply with margin argument 2), which is where it throws the error, because a plain vector has no dimensions. Even if that step succeeded and handed you single values, the two inner lapplys do not use that value at all: they run over the whole of lst again and each return a list, and one list cannot be divided by another.
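To see where the original attempt fails, the two problematic pieces can be tried in isolation. This is only a sketch: the list is a shortened stand-in for the one in the question, and apply_err and div_err are illustrative names that capture the error messages instead of stopping the script.

```r
# Shortened stand-in for the list in the question (assumption, for brevity).
lst <- list(a = c(2.5, 9.8, 5.0), b = c(147.4, 122.9, 110.2, 142.3))

# A plain numeric vector has no dim attribute, so there are no columns to split.
is.null(dim(lst$a))

# apply() with margin 2 on such a vector fails; keep the message as a string.
apply_err <- tryCatch(
  apply(lst$a, 2, mean),
  error = function(e) conditionMessage(e)
)

# lapply() returns lists, and dividing one list by another also fails.
div_err <- tryCatch(
  lapply(lst, mean) / lapply(lst, sd),
  error = function(e) conditionMessage(e)
)

apply_err
div_err
```

Both calls end in an error, which is why the working version calls mean and sd directly on the vector that the outer lapply hands over.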

iod answered Dec 10 '25 03:12