I have a matrix A in CSC-format, of which I index just a single column

```
b = A[:,col]
```

resulting in an (n x 1) matrix. What I want to do is:

```
v = M * b
```

where M is an (n x n) matrix in CSR format. The result v is an (n x 1) CSR matrix. I need to iterate over the values in v (skipping the zeros) and retrieve the index of one element that meets a particular criterion. (Note: the sparse formats were not chosen to fit that particular operation, but general matrix-matrix products should be fastest as CSR * CSC, right?)

The problem is that iterating over the entries of the CSR-formatted vector (v[i,0] for 0 <= i < n) is terribly slow, and I actually waste quite some memory since v is not sparse anymore.

Could anyone tell me how to perform these operations so that I can iterate quickly over the result vector while keeping the copy-related memory overhead small?

```
IN: M (CSR-Matrix), A (CSC-Matrix), col_index
v = M * A[:,col_index]
for entries in v:
    do stuff
```

Is it also possible to somehow speed up "advanced" indexing over columns in a CSC matrix? At some other point in the code, I have to extract a submatrix of A containing a given subset of its columns. This cannot be reformulated as a slice, so I use an index array, and A[:,idxlist] shows up as quite slow in the line profiler.

Looking forward to your suggestions

The scipy sparse module is getting better every release, but it is quite obviously a work in progress, so there is a lot of optimization you can do by accessing the innards of the objects directly. For example, in your case:

```
>>> import numpy as np
>>> import scipy.sparse as sps
>>> a = sps.rand(5, 20, density=0.2, format='csr')
>>> b = sps.rand(20, 1, density=0.2, format='csc')
>>> c = a * b
>>> c.toarray()
array([[ 0.30331594],
       [ 0.        ],
       [ 0.12198742],
       [ 0.34350077],
       [ 0.        ]])
```

You can get the non-zero entries of `c` as `c.data`:

```
>>> c.data
array([ 0.30331594, 0.12198742, 0.34350077])
```

Getting the corresponding row numbers is a little trickier. Probably the easiest way is to convert your output to CSC format, since then you have them directly as `c.indices`, and `c.data` will still be the same as before:

```
>>> c.tocsc().indices
array([0, 2, 3])
>>> c.tocsc().data
array([ 0.30331594, 0.12198742, 0.34350077])
```

But you can extract them without doing the conversion if you don't fancy it:

```
>>> np.where(c.indptr[:-1] != c.indptr[1:])[0]
array([0, 2, 3], dtype=int64)
```
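To see why this works: in CSR, `indptr[i]` and `indptr[i+1]` delimit the entries of row `i` inside `data`, so for a single-column matrix a row holds a stored entry exactly when the two differ. A minimal sketch on a hand-built column (the values are made up for illustration, they are not the ones from the session above):

```python
import numpy as np
import scipy.sparse as sps

# Hypothetical 5 x 1 column with stored entries in rows 0, 2 and 3.
dense = np.array([[0.3], [0.0], [0.1], [0.4], [0.0]])
c = sps.csr_matrix(dense)

# np.diff(c.indptr) is the number of stored entries in each row.
counts = np.diff(c.indptr)
row_idx = np.flatnonzero(counts)

print(c.indptr)   # row boundaries inside c.data
print(row_idx)    # rows that hold a stored entry
```

Since the column has at most one entry per row, `c.data[k]` belongs to row `row_idx[k]`.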

So if you wanted to find, e.g. the largest value and its row number, you could do:

```
>>> row_idx = np.where(c.indptr[:-1] != c.indptr[1:])[0]
>>> idx = np.argmax(c.data)
>>> c.data[idx], row_idx[idx]
(0.34350077450601624, 3)
```
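Applied to the setup in the question, the same idea might look like the sketch below. The names `M`, `A` and `col_index` are taken from the question; the random test matrices, the size `n` and the "largest absolute value" criterion are placeholders I made up, so treat this as an illustration rather than a tuned solution:

```python
import numpy as np
import scipy.sparse as sps

n = 50  # assumed size, for illustration only
M = sps.random(n, n, density=0.1, format='csr', random_state=0)
A = sps.random(n, n, density=0.1, format='csc', random_state=1)
col_index = 3

# Matrix product of M with a single column of A. Converting the
# (n x 1) result to CSC exposes the row numbers as v.indices.
v = (M @ A[:, col_index]).tocsc()

# Walk only the stored entries: v.indices holds rows, v.data the values.
for row, value in zip(v.indices, v.data):
    pass  # do stuff

# Or skip the Python loop for criteria NumPy can express directly.
if v.nnz:
    k = np.argmax(np.abs(v.data))
    best_row, best_value = v.indices[k], v.data[k]
```

Only the stored entries are touched, so nothing of size n is allocated beyond the product itself.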

In a Code Review question I am exploring ways of speeding up iteration over the rows of a sparse matrix: http://codereview.stackexchange.com/questions/32664/numpy-scipy-optimization/33566#33566

`csr` `getrow` is surprisingly slow. At least for that small test case it is faster to convert the sparse matrix to a dense array, and use regular numpy indexing (use `np.nonzero` to get sparse entries). It is equally fast to convert the matrix to `lil`, and do regular Python iteration on `zip(X.data, X.rows)`.
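Both alternatives can be sketched as follows. `X` here is a small random stand-in, not the Code Review data, so this shows the mechanics only and says nothing about timings:

```python
import numpy as np
import scipy.sparse as sps

X = sps.random(6, 8, density=0.3, format='csr', random_state=0)

# Dense route: convert once, then use plain NumPy indexing per row.
Xd = X.toarray()
dense_rows = []
for row in Xd:
    cols = np.nonzero(row)[0]
    dense_rows.append((cols.tolist(), row[cols].tolist()))

# lil route: Xl.rows[i] lists the column indices of row i and
# Xl.data[i] lists the matching values, both as plain Python lists.
Xl = X.tolil()
lil_rows = [(list(cols), list(vals)) for vals, cols in zip(Xl.data, Xl.rows)]
```

Each element of `dense_rows` and `lil_rows` is a `(column indices, values)` pair for one row. The dense route only makes sense while the full array fits in memory.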

My impression is that `scipy.sparse` is best for linear algebra problems, and slow for indexing and iteration.
