
How to get columns from a big sparse CSC matrix

I have a sparse matrix X

<1000000x153047 sparse matrix of type '<class 'numpy.float64'>'
with 5082518 stored elements in Compressed Sparse Column format>

and I have an array

columns_to_use 

It consists of 10000 column ids of matrix X. I want to keep only these columns and drop all the others. I tried this code:

X_new = X[:, columns_to_use]

It works well with a small X (10 000 rows), but with 100 000 rows or more I get a memory error. How can I select specific columns without running out of memory?

asked Dec 04 '25 09:12 by malugina


1 Answer

I found this solution:

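from scipy.sparse import hstack

# Slice one column at a time; each slice is an (n_rows x 1) sparse matrix
cols = []
for i in columns_to_use:
    cols.append(X[:, i])
# Join the column slices side by side into the new matrix
X_new = hstack(cols)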

It runs fast enough, without any errors, and it's simple.
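For anyone who wants to try this before running it on the full matrix, here is a minimal self-contained sketch of the same loop. The helper name `select_columns`, the toy matrix size, and the sample column ids are made up for illustration; the sketch also assumes X is a SciPy sparse matrix (as in the `csc_matrix` repr in the question) and passes `format="csc"` so the result stays in CSC format.

```python
import numpy as np
from scipy import sparse


def select_columns(X, columns_to_use):
    """Hypothetical helper: keep only the given columns of X, in the given order."""
    X = X.tocsc()  # column slicing is cheapest in CSC format
    # Each X[:, j] is an (n_rows x 1) sparse slice, so nothing dense is built.
    cols = [X[:, int(j)] for j in columns_to_use]
    # Stack the single-column slices side by side and keep the result in CSC.
    return sparse.hstack(cols, format="csc")


# Toy data standing in for the real 1000000 x 153047 matrix.
X = sparse.random(1000, 500, density=0.01, format="csc", random_state=0)
columns_to_use = np.array([3, 17, 42, 499])

X_new = select_columns(X, columns_to_use)
print(X_new.shape)  # (1000, 4)
```

On a small matrix like this one, the result can be compared against `X[:, columns_to_use]` to confirm that both approaches select the same data.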

answered Dec 07 '25 02:12 by malugina