How to define an (n, 0) sparse matrix in SciPy, or how to assemble a sparse matrix by column?
I have a loop that, in each iteration, gives me one column c of a sparse matrix N.
To assemble N column by column, I thought of using
N = scipy.sparse.hstack([N, c])
For this to work, N should start out with zero columns, i.e. with shape (n, 0). But
N = scipy.sparse.csc_matrix((4,0))
throws ValueError: invalid shape.
What do you suggest, and what is the correct way to do this?
Solution
You can’t. Compared to NumPy arrays, sparse matrices are limited; specifically, they do not allow 0 as the size of any axis. All sparse matrix constructors check for this, so if you do manage to build such a matrix, you’re exploiting a SciPy bug, and your script may break when you upgrade SciPy.
That being said, I don’t see why you need an n × 0 sparse matrix, since n × 0 NumPy arrays are allowed and take up little storage space.
It turns out that sparse.hstack can’t handle NumPy arrays with a zero-length dimension, so please disregard my previous comment. However, I think what you should do is collect all the columns in a list and then hstack them in one call. This is better than your loop because appending to a list takes amortized constant time, while each hstack call takes time linear in the size of the matrix built so far. The loop you propose therefore takes quadratic time overall, even though the assembly can be done in linear time.
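A minimal sketch of the list-then-hstack approach, in case it helps. The make_column helper and the sizes n = 4 and k = 3 are made up for illustration; they stand in for whatever your loop actually produces in each iteration.

```python
import numpy as np
import scipy.sparse as sp


def make_column(j, n=4):
    # Hypothetical stand-in for one loop iteration: returns an (n, 1)
    # sparse column with a single nonzero entry.
    data = np.array([float(j + 1)])
    rows = np.array([j % n])
    cols = np.array([0])
    return sp.csc_matrix((data, (rows, cols)), shape=(n, 1))


# Collect the columns in a plain Python list; each append is amortized O(1).
columns = []
for j in range(3):
    columns.append(make_column(j))

# Stack all columns in a single call instead of calling hstack per iteration.
N = sp.hstack(columns, format="csc")
```

With this pattern there is no need for an empty (n, 0) starting matrix at all, since N only comes into existence once the first real column is available.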