I am trying to find the indices of the nonzero entries in each row of a sparse matrix (scipy.sparse.csc_matrix). So far, I am looping over each row of the matrix and applying numpy.nonzero() to it to get the nonzero column indices. But this method takes over an hour to find the nonzero column entries per row. Is there a faster way to do this? Thanks!
In MATLAB, N = nnz(X) returns the number of nonzero elements in matrix X.
In SciPy, getnnz returns the number of stored entries of a sparse matrix, which equals the number of nonzero terms when no explicit zeros are stored.
In MATLAB, nnz returns the number of nonzero elements in a sparse matrix, nonzeros returns a column vector containing all the nonzero elements of a sparse matrix, and nzmax returns the amount of storage space allocated for the nonzero entries of a sparse matrix.
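For the SciPy matrix in the question, the same kind of counts can be read off without any Python loop. The following is a minimal sketch; the small matrix `X` and the variable names are made up for illustration and are not from the original post.

```python
import numpy as np
from scipy import sparse

# Illustrative 3x4 matrix (hypothetical data, not from the question).
dense = np.array([[0, 0, 3, 0],
                  [1, 0, 2, 0],
                  [0, 0, 0, 0]])
X = sparse.csc_matrix(dense)

total_nonzeros = X.nnz              # stored entries in the whole matrix
per_row_counts = X.getnnz(axis=1)   # stored entries in each row
per_col_counts = X.getnnz(axis=0)   # stored entries in each column

print(total_nonzeros, per_row_counts, per_col_counts)
```

Note that these are counts of stored entries, so explicitly stored zeros (if any) are included.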
Use the .nonzero() method. 
indices = sp_matrix.nonzero()
If you'd like the indices as (row, column) tuples, you can use zip (on Python 3, wrap it in list(), since zip returns an iterator).
indices = list(zip(*sp_matrix.nonzero()))
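As a small illustration of what .nonzero() returns, here is a sketch on a made-up 3x3 matrix (the data and the name `pairs` are assumptions for the example). The method returns two parallel arrays, one of row indices and one of column indices; the pairs are sorted here so the result does not depend on the storage order of the sparse format.

```python
import numpy as np
from scipy import sparse

# Illustrative matrix (hypothetical data).
sp_matrix = sparse.csc_matrix(np.array([[0, 0, 3],
                                        [1, 0, 2],
                                        [0, 4, 0]]))

# Two parallel arrays: row indices and column indices of the nonzeros.
rows, cols = sp_matrix.nonzero()

# Pair them up; sort so the order is row-major regardless of format.
pairs = sorted(zip(rows.tolist(), cols.tolist()))
print(pairs)  # [(0, 2), (1, 0), (1, 2), (2, 1)]
```

This gives one flat list of coordinates; it does not yet group the column indices by row.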
This is relatively straightforward for a CSR matrix, so you can always convert to CSR and split its indices array at the row pointers:
>>> import numpy as np
>>> import scipy.sparse as sps
>>> a = sps.rand(5, 5, .2, format='csc')
>>> a.toarray()
array([[ 0.        ,  0.        ,  0.68642384,  0.        ,  0.        ],
       [ 0.46120599,  0.        ,  0.83253467,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.07074811],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.21190832,  0.        ,  0.        ,  0.        ]])
>>> b = a.tocsr()
>>> np.split(b.indices, b.indptr[1:-1])
[array([2]), array([0, 2]), array([4]), array([], dtype=float64), array([1])]
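The same idea as a self-contained script, using a fixed matrix so the result is reproducible. This is a sketch under the assumption that the matrix fits comfortably in memory after conversion to CSR; the values are rounded stand-ins with the same sparsity pattern as the random matrix above, and the names `cols_per_row` and `as_lists` are illustrative.

```python
import numpy as np
from scipy import sparse

# Same sparsity pattern as the random example above, with made-up values.
dense = np.array([[0.0, 0.0, 0.7, 0.0, 0.0],
                  [0.5, 0.0, 0.8, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 0.0, 0.1],
                  [0.0, 0.0, 0.0, 0.0, 0.0],
                  [0.0, 0.2, 0.0, 0.0, 0.0]])
a = sparse.csc_matrix(dense)

# CSR stores column indices row by row; indptr marks where each row starts.
b = a.tocsr()
b.sort_indices()  # ensure column indices are ascending within each row

# Split the flat column-index array at the interior row boundaries.
cols_per_row = np.split(b.indices, b.indptr[1:-1])
as_lists = [c.tolist() for c in cols_per_row]
print(as_lists)  # [[2], [0, 2], [4], [], [1]]
```

The conversion and the split each run once over the stored entries, which avoids slicing the matrix row by row. If the matrix may contain explicitly stored zeros, calling b.eliminate_zeros() before the split removes them from the result.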