I have a large scipy sparse symmetric matrix which I need to condense by taking the sum of blocks to make a new smaller matrix.

For example, for a 4x4 sparse matrix A I would like to make a 2x2 matrix B in which B[i,j] = sum(A[2*i:2*i+2, 2*j:2*j+2]).

Currently, I just go block by block to recreate the condensed matrix but this is slow. Any ideas on how to optimize this?

**Update:** Here is some example code that works fine, but is slow for a 50,000x50,000 sparse matrix that I want to condense into a 10,000x10,000 one:

```
>>> import numpy as np
>>> import scipy.sparse
>>> A = (np.random.rand(4, 4) < 0.3) * np.random.rand(4, 4)
>>> A = scipy.sparse.lil_matrix(A + A.T)  # make the matrix symmetric
>>> B = scipy.sparse.lil_matrix((2, 2))
>>> for i in range(B.shape[0]):
...     for j in range(B.shape[1]):
...         B[i, j] = A[2*i:2*i+2, 2*j:2*j+2].sum()  # sum one 2x2 block
```

For a 4x4 example you can do the following:

```
In [43]: a = np.arange(16.).reshape((4, 4))
In [44]: a
Out[44]:
array([[  0.,   1.,   2.,   3.],
       [  4.,   5.,   6.,   7.],
       [  8.,   9.,  10.,  11.],
       [ 12.,  13.,  14.,  15.]])
In [45]: u = np.array([a[:2, :2], a[:2, 2:], a[2:, :2], a[2:, 2:]])
In [46]: u
Out[46]:
array([[[  0.,   1.],
        [  4.,   5.]],

       [[  2.,   3.],
        [  6.,   7.]],

       [[  8.,   9.],
        [ 12.,  13.]],

       [[ 10.,  11.],
        [ 14.,  15.]]])
In [47]: u.sum(1).sum(1).reshape(2, 2)
Out[47]:
array([[ 10.,  18.],
       [ 42.,  50.]])
```

Using something like itertools, it should be possible to automate and generalise the expression for `u`.
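For a dense array whose side lengths are multiples of the block size, one way to generalise the block extraction without writing out `u` by hand is a reshape followed by a sum over the within-block axes. This is only a sketch: `block_sum_dense` is a made-up helper name, and it assumes the matrix fits in memory as a dense array, which is unlikely for the 50,000x50,000 case in the question.

```python
import numpy as np


def block_sum_dense(a, d):
    """Sum non-overlapping d x d blocks of a dense 2-D array (hypothetical helper)."""
    rows, cols = a.shape
    if rows % d or cols % d:
        raise ValueError("array dimensions must be multiples of d")
    # Axes after the reshape:
    # (block row, row within block, block column, column within block).
    return a.reshape(rows // d, d, cols // d, d).sum(axis=(1, 3))


a = np.arange(16.).reshape((4, 4))
result = block_sum_dense(a, 2)
print(result)
```

For the 4x4 example this should give the same 2x2 result as the `u.sum(1).sum(1)` expression.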

Given a square matrix of size *N* and a split size of *d* (so the matrix will be partitioned into *N/d* by *N/d* sub-matrices of size *d*), could you use `numpy.split` a couple of times to build a collection of those sub-matrices, sum each of them, and put them back together?

This should be treated more as pseudocode than an efficient implementation, but it expresses my idea:

```
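import numpy as np

def chunk(matrix, size):
    """Split a matrix into a nested list of size x size blocks."""
    row_wise = []
    # np.split takes the number of sections, not the length of each section
    for hchunk in np.split(matrix, matrix.shape[0] // size):
        row_wise.append(np.split(hchunk, matrix.shape[1] // size, 1))
    return row_wise

def sum_chunks(chunks):
    """Sum each block and assemble the results into a 2-D array."""
    sum_rows = []
    for row in chunks:
        sum_rows.append([np.sum(col) for col in row])
    return np.array(sum_rows)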
```

Or more compactly as

```
def sum_in_place(matrix, size):
    n_rows, n_cols = matrix.shape[0] // size, matrix.shape[1] // size
    return np.array([[np.sum(vchunk) for vchunk in np.split(hchunk, n_cols, 1)]
                     for hchunk in np.split(matrix, n_rows)])
```

This gives you something like the following:

```
In [16]: a
Out[16]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
In [17]: chunk.sum_in_place(a, 2)
Out[17]:
array([[10, 18],
       [42, 50]])
```
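If the Python-level loops over chunks become the bottleneck, a possible dense alternative is `np.add.reduceat`, which sums runs of rows (or columns) starting at given indices. The sketch below is not the answer's method, and `block_sum_reduceat` is a made-up name. Unlike `np.split` with an integer section count, it does not require the block size to divide the matrix size; the trailing blocks are simply smaller.

```python
import numpy as np


def block_sum_reduceat(a, d):
    """Sum d x d blocks of a dense 2-D array (hypothetical helper).

    Trailing blocks are smaller when d does not divide the shape.
    """
    row_starts = np.arange(0, a.shape[0], d)
    col_starts = np.arange(0, a.shape[1], d)
    # Collapse each group of d rows, then each group of d columns.
    partial = np.add.reduceat(a, row_starts, axis=0)
    return np.add.reduceat(partial, col_starts, axis=1)


a = np.arange(16).reshape((4, 4))
result = block_sum_reduceat(a, 2)
print(result)
```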

First of all, a `lil` matrix for the one you are summing up is probably a really bad choice. I would try `COO` or maybe `CSR`/`CSC` (I don't know which will be better, but `lil` is probably inherently slower for many of these operations; even the slicing might be much slower, though I did not test this), unless you know that, for example, `dia` fits perfectly.

Based on `COO` I could imagine doing some trickery, since `COO` has `row` and `col` arrays that give the exact positions:

```
import numpy as np
import scipy.sparse

matrix = A.tocoo()
n_blocks = matrix.shape[0] // 5  # blocks per side when using 5x5 blocks
new_row = matrix.row // 5
new_col = matrix.col // 5
bins = n_blocks * new_col + new_row
# Now do a little dance because this is sparse,
# and most of the possible bins should not be in new_row/new_col;
# also need to group the bins:
unique, bins = np.unique(bins, return_inverse=True)
sums = np.bincount(bins, weights=matrix.data)
new_col = unique // n_blocks
new_row = unique - new_col * n_blocks
# Pass the shape explicitly so empty trailing blocks are not dropped
result = scipy.sparse.coo_matrix((sums, (new_row, new_col)),
                                 shape=(n_blocks, n_blocks))
```

(I can't guarantee that I didn't confuse row and column somewhere, and this only works for square matrices...)
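A shorter variant of the same idea, as a sketch rather than a tested solution for the 50,000x50,000 case: build a `COO` matrix directly from the integer-divided indices and let the conversion to `CSR` sum the duplicate entries, which SciPy documents as the default behaviour for COO-to-CSR/CSC conversion. The name `block_sum_sparse` is made up, and the ceiling division is my own addition so that non-square or non-divisible shapes get a smaller trailing block.

```python
import numpy as np
import scipy.sparse as sp


def block_sum_sparse(A, d):
    """Condense sparse A by summing its d x d blocks (hypothetical helper)."""
    coo = A.tocoo()
    n_rows = -(-coo.shape[0] // d)  # ceiling division
    n_cols = -(-coo.shape[1] // d)
    # Every stored entry is mapped to the block it falls in, so many
    # entries end up sharing the same (row, col) pair.
    condensed = sp.coo_matrix(
        (coo.data, (coo.row // d, coo.col // d)), shape=(n_rows, n_cols)
    )
    # Duplicate (row, col) entries are summed when converting to CSR.
    return condensed.tocsr()


A = sp.csr_matrix(np.arange(16.).reshape((4, 4)))
B = block_sum_sparse(A, 2)
print(B.toarray())
```

The work is proportional to the number of stored entries rather than to the number of blocks, which is the main reason to prefer this over slicing block by block.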
