I have a sparse matrix X
<1000000x153047 sparse matrix of type '<class 'numpy.float64'>'
with 5082518 stored elements in Compressed Sparse Column format>
and I have an array
columns_to_use
It consists of 10000 column indices of matrix X. I want to keep only these columns and drop all the others. I tried this code:
X_new = X[:, columns_to_use]
It works fine with a small X (10 000 rows), but with 100 000 rows or more I get a memory error. How can I select specific columns without a memory error?
I ended up with this solution:
from scipy.sparse import hstack

# Slice one column at a time, then stack the pieces side by side.
cols = []
for i in columns_to_use:
    cols.append(X[:, i])
X_new = hstack(cols)
It runs fast enough and without any errors, and it is easy.
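For comparison, here is a minimal sketch (not a benchmarked recommendation) that puts the loop above next to one possible alternative: multiplying X by a sparse 0/1 selection matrix. The helper names select_columns_hstack and select_columns_matmul and the small random test matrix are made up for illustration. Whether the matrix product needs less memory than X[:, columns_to_use] on the full-size matrix is an assumption you would have to check on your own data.

```python
import numpy as np
from scipy import sparse


def select_columns_hstack(X, columns_to_use):
    """Slice one column at a time and stack the pieces (the loop from the answer)."""
    cols = [X[:, i] for i in columns_to_use]
    return sparse.hstack(cols, format="csc")


def select_columns_matmul(X, columns_to_use):
    """Hypothetical alternative: multiply by a sparse 0/1 selection matrix."""
    columns_to_use = np.asarray(columns_to_use, dtype=np.intp)
    k = columns_to_use.size
    # S has shape (n_cols, k) with a single 1 at (columns_to_use[j], j),
    # so column j of X @ S equals column columns_to_use[j] of X.
    S = sparse.csc_matrix(
        (np.ones(k), (columns_to_use, np.arange(k))),
        shape=(X.shape[1], k),
    )
    return (X @ S).tocsc()


# Small made-up matrix standing in for the real 1000000 x 153047 one.
rng = np.random.default_rng(0)
X_small = sparse.random(200, 50, density=0.05, format="csc", random_state=0)
columns_small = rng.choice(50, size=10, replace=False)

A = select_columns_hstack(X_small, columns_small)
B = select_columns_matmul(X_small, columns_small)
```

Both helpers return a CSC matrix with the columns in the order given by columns_to_use, so either result can replace X_new in the code above.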