
How can I calculate the row variance for large matrices?

Tags:

r

matrix

I have a pretty large sparse matrix in R:

> dim(matrix)
[1] 60675 36807

I wanted to calculate the row variances of this matrix like this:

apply(matrix,1,var)
Error in asMethod(object) : 
    Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 102

I suppose my matrix is too large for this to work. I then tried to spread the work over multiple cores:

mclapply(Matrix::t(matrix), var, mc.cores=16)
Error in asMethod(object) : 
    Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 102    

But as you can see, I get the same error again. Do you have any suggestions on how to handle this problem? Maybe subset the matrix and then calculate the variance?

Asked by Alex, Dec 13 '25 10:12



2 Answers

The sparseMatrixStats package on Bioconductor implements the matrixStats API for sparse Matrix objects, so rowVars() can be called directly on a dgCMatrix:

# BiocManager::install("sparseMatrixStats")
library(sparseMatrixStats)

mat <- matrix(0, nrow=10, ncol=6)
mat[sample(seq_len(60), 4)] <- 1:4
sparse_mat <- as(mat, "dgCMatrix")
class(sparse_mat)
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"

rowVars(sparse_mat)
#> [1] 0.0000000 0.1666667 1.5000000 0.0000000 0.0000000 0.0000000 0.0000000
#> [8] 0.0000000 2.8000000 0.0000000
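
If installing a Bioconductor package is not an option, the same quantity can be sketched with the Matrix package alone, using the identity var = (sum(x^2) - n * mean^2) / (n - 1), where n is the number of columns. The helper name sparse_row_vars below is hypothetical and is not part of Matrix or sparseMatrixStats. It relies on x^2 keeping zeros as zeros, so nothing is densified. Note that this one-pass formula can lose precision when a row's mean is large relative to its spread.

```r
library(Matrix)

# Hypothetical helper (not part of any package): row variances of a sparse
# matrix via var = (sum(x^2) - n * mean^2) / (n - 1), n = number of columns.
sparse_row_vars <- function(x) {
  n <- ncol(x)
  row_mean  <- Matrix::rowMeans(x)   # plain numeric vector
  row_sumsq <- Matrix::rowSums(x^2)  # x^2 maps 0 to 0, so it stays sparse
  (row_sumsq - n * row_mean^2) / (n - 1)
}

# Small sanity check against the dense computation
set.seed(1)
m <- rsparsematrix(20, 15, density = 0.2)
stopifnot(isTRUE(all.equal(sparse_row_vars(m), apply(as.matrix(m), 1, var))))
```

On the matrix from the question this would be called as sparse_row_vars(matrix), assuming the object is a dgCMatrix or another sparse class from Matrix.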
Answered by HenrikB, Dec 15 '25 22:12



Example:

library(Matrix)
# Sparse all-zero matrix with the dimensions from the question
d <- Matrix(0, nrow=60675, ncol=36807)
d[5,5] <- 1

Unfortunately this doesn't work:

library(matrixStats)
v <- rowVars(d)

Error in rowVars(d) : Argument 'x' must be a matrix or a vector.

But we can do it by brute force:

v <- numeric(nrow(d))
for (i in seq_len(nrow(d))) {
  if (i %% 250 == 0) cat(".")  # progress dot every 250 rows
  v[i] <- var(d[i,])
}

This is slow (about 60 seconds on an old-ish macOS desktop), but it works. Parallelizing helps:

library(parallel)
f <- function(i) var(d[i,])
v <- unlist(mclapply(seq(nrow(d)), f, mc.cores = 4))

(22 seconds elapsed).
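
The "subset the matrix" idea from the question can be sketched in the same spirit. A dense copy of a 60675 x 36807 matrix would hold about 2.23e9 entries, which is more than 2^31 - 1, and that is probably what triggers the Cholmod error when apply() tries to densify everything at once. Densifying a block of rows at a time avoids this. The helper name chunked_row_vars and the default chunk_size are assumptions for illustration. With 36807 columns, 500 rows of doubles take roughly 150 MB.

```r
library(Matrix)

# Hypothetical helper: convert only `chunk_size` rows to a dense matrix at a
# time, so the full matrix is never densified.
chunked_row_vars <- function(x, chunk_size = 500L) {
  v <- numeric(nrow(x))
  starts <- seq(1L, nrow(x), by = chunk_size)
  for (s in starts) {
    idx <- s:min(s + chunk_size - 1L, nrow(x))
    block <- as.matrix(x[idx, , drop = FALSE])  # dense, but only a few rows
    v[idx] <- apply(block, 1, var)
  }
  v
}

# Small sanity check: chunked result equals the all-at-once dense result
set.seed(1)
m <- rsparsematrix(23, 10, density = 0.3)
stopifnot(isTRUE(all.equal(chunked_row_vars(m, chunk_size = 7L),
                           apply(as.matrix(m), 1, var))))
```

Each block is independent, so the loop body could also be handed to mclapply over the chunk start indices instead of over single rows.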

Answered by Ben Bolker, Dec 15 '25 22:12
