Python – How to define (n, 0) sparse matrix in scipy or how to assemble sparse matrix by column?


I have a loop that, in each iteration, gives me one column c of a sparse matrix N.

To assemble/grow/accumulate N column by column, I thought of using

N = scipy.sparse.hstack([N, c]) 

For this, it would be convenient to initialize the matrix with rows of length 0, i.e. with shape (n, 0). However,

N = scipy.sparse.csc_matrix((4,0))

raises ValueError: invalid shape.

Any suggestions on how to do this correctly?

Solution

You can’t. Compared to NumPy arrays, sparse matrices are limited; in particular, they do not allow a size of 0 along any axis. All sparse matrix constructors check for this, so if you do manage to build such a matrix, you are exploiting a SciPy bug, and your script may break when you upgrade SciPy.

That being said, I don’t see why you need an n × 0 sparse matrix, since n × 0 NumPy arrays are allowed and take up little storage space.

It turns out that sparse.hstack can’t handle NumPy arrays with a zero-length axis, so please disregard my previous comment. However, I think what you should do is collect all the columns in a list and then hstack them in one call. This is better than your loop, because appending to a list takes amortized constant time, while each hstack call takes time linear in the size of the matrix built so far. The algorithm you propose therefore takes quadratic time, when it could be linear.
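A minimal sketch of the collect-then-stack approach described above. The helper make_column is hypothetical and only stands in for whatever your loop computes; the sizes and values are made up for illustration.

```python
import scipy.sparse as sp


def make_column(n, j):
    # Hypothetical stand-in for the loop body: returns an (n, 1) sparse
    # column with a single nonzero entry j + 1 in row j % n.
    return sp.csc_matrix(([float(j + 1)], ([j % n], [0])), shape=(n, 1))


n = 4
columns = []
for j in range(3):
    # Appending to a Python list is amortized O(1); nothing is copied here.
    columns.append(make_column(n, j))

# A single hstack call builds the (n, 3) matrix in one pass.
# Note: this assumes at least one column was collected.
N = sp.hstack(columns, format="csc")
```

Because the matrix is only built once, no empty (n, 0) starting matrix is needed. If the loop might produce no columns at all, that case has to be handled separately before calling hstack.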
