I have produced a sparse matrix in Python and need to store it on my hard disk. How can I do that? If I should create a database instead, how would I go about it? This is my code:

```
import nltk
import cPickle
import numpy
from scipy.sparse import lil_matrix
from nltk.corpus import wordnet as wn
from nltk.corpus import brown

f = open('spmatrix.pkl','wb')

def markov(L):
    count=0
    c=len(text1)
    for i in range(0,c-2):
        h=L.index(text1[i])
        k=L.index(text1[i+1])
        mat[h,k]=mat[h,k]+1  # matrix
    cPickle.dump(mat,f,-1)

text = [w for g in brown.categories() for w in brown.words(categories=g)]
text1=text[1:500]
arr=set(text1)
arr=list(arr)
mat=lil_matrix((len(arr),len(arr)))
markov(arr)
f.close()
```

I need to store this `mat` in a file and then access values of the matrix by their coordinates.

The sparse matrix prints like this:

```
(173, 168) 2.0
(173, 169) 1.0
(173, 172) 1.0
(173, 237) 4.0
(174, 231) 1.0
(175, 141) 1.0
(176, 195) 1.0
```

But when I store it in a file and read it back, I get this:

```
(0, 68) 1.0
(0, 77) 1.0
(0, 95) 1.0
(0, 100) 1.0
(0, 103) 1.0
(0, 110) 1.0
(0, 112) 2.0
(0, 132) 1.0
(0, 133) 2.0
(0, 139) 1.0
(0, 146) 2.0
(0, 156) 1.0
(0, 157) 1.0
(0, 185) 1.0
```

PyTables is a Python interface to the HDF5 data model. It is a popular choice and integrates well with NumPy and SciPy. PyTables lets you access slices of arrays stored on disk without loading the entire array back into memory.

I don't have any specific experience with sparse matrices per se and a quick Google search neither confirmed nor denied that sparse matrices are supported.

In addition to HDF5 support, Python also has NetCDF support, which is well suited to storing matrix-form data and accessing it quickly, both sparse and dense. It is included in Python(x,y) for Windows, which a lot of scientific Python users end up with.

More numpy based examples can be found in this cookbook.

For very big sparse matrices on clusters, you might use PyTrilinos. It has an HDF5 interface that can dump a sparse matrix to disk, and it also works if the matrix is distributed across different nodes.

http://trilinos.sandia.gov/packages/pytrilinos/development/EpetraExt.html#input-output-classes

Depending on the size of the sparse matrix, I tend to just use `cPickle` to pickle the array:

```
import cPickle
f = open('spmatrix.pkl','wb')
cPickle.dump(your_matrix,f,-1)
f.close()
```

If I'm dealing with really large datasets, I tend to use `netcdf4-python`.

**Edit:**

To access the file again:

```
f = open('spmatrix.pkl','rb') # open the file in read binary mode
# load the data in the .pkl file into a new variable spmat
spmat = cPickle.load(f)
f.close()
```
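On Python 3, `cPickle` no longer exists; the standard `pickle` module takes its place. The following is only a minimal sketch of the same round trip under that assumption, using a small made-up `lil_matrix` and a hypothetical file name (`spmatrix_demo.pkl`), to show that entries can be read by their coordinates after loading:

```python
import os
import pickle
import tempfile

from scipy.sparse import lil_matrix

# Small stand-in for the matrix built in the question (values are made up).
mat = lil_matrix((4, 4))
mat[1, 2] = 2.0
mat[3, 0] = 1.0

# Hypothetical file name, placed in a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), "spmatrix_demo.pkl")

# Dump once; the with-block closes the file even if an error occurs.
with open(path, "wb") as f:
    pickle.dump(mat, f, protocol=pickle.HIGHEST_PROTOCOL)

with open(path, "rb") as f:
    loaded = pickle.load(f)

# Individual entries are addressed by (row, column).
value_at_1_2 = loaded[1, 2]
value_at_0_0 = loaded[0, 0]  # never set, so it reads back as zero
print(value_at_1_2, value_at_0_0)
```

Passing `pickle.HIGHEST_PROTOCOL` is equivalent to the `-1` used above.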

**Note**: This answer is in response to the revised question that now provides code.

You should not call `cPickle.dump()` in your function. Create the sparse matrix and then dump its contents to the file.

Try:

```
import cPickle
from scipy.sparse import lil_matrix
from nltk.corpus import brown

def markov(L):
    count=0
    c=len(text1)
    for i in range(0,c-2):
        h=L.index(text1[i])
        k=L.index(text1[i+1])
        mat[h,k]=mat[h,k]+1 #matrix

text = [w for g in brown.categories() for w in brown.words(categories=g)]
text1=text[1:500]
arr=set(text1)
arr=list(arr)
mat=lil_matrix((len(arr),len(arr)))
markov(arr)

# dump once, after the matrix has been filled
f = open('spmatrix.pkl','wb')
cPickle.dump(mat,f,-1)
f.close()
```
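One reason to dump only once, after the matrix is complete: every call to `dump` on the same open file appends a separate pickle, and a single `load` returns only the first of them. Here is a minimal sketch of that behaviour, assuming Python 3's `pickle` (the counterpart of `cPickle`), an in-memory buffer in place of the file, and a plain dict standing in for the matrix:

```python
import io
import pickle

buf = io.BytesIO()          # in-memory stand-in for the open .pkl file
counts = {}                 # plain dict standing in for the sparse matrix

# Dumping after every update writes three separate pickles back to back.
for key in [(0, 1), (0, 2), (1, 2)]:
    counts[key] = counts.get(key, 0) + 1
    pickle.dump(counts, buf, -1)

buf.seek(0)
first = pickle.load(buf)    # only the first snapshot: one entry
second = pickle.load(buf)   # a second load is needed for the next snapshot
third = pickle.load(buf)    # the last snapshot holds the complete data
print(len(first), len(second), len(third))
```

Reading the file back with one `load` would therefore give an incomplete matrix if `dump` had been called more than once.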

For me, using the `-1` option in the `cPickle.dump` function caused the pickled file to not be loadable afterwards.

The object I dumped through `cPickle` was an instance of `scipy.sparse.dok_matrix`.

Using only two arguments did the trick for me; the documentation for `pickle.dump()` states that the default value of the `protocol` parameter is `0`.

Working on Windows 7, Python 2.7.2 (64 bits), and `cPickle` v 1.71.

Example:

```
>>> import cPickle
>>> print cPickle.__version__
1.71
>>> from scipy import sparse
>>> H = sparse.dok_matrix((135, 654), dtype='int32')
>>> H[33, 44] = 8
>>> H[123, 321] = -99
>>> print str(H)
(123, 321) -99
(33, 44) 8
>>> fname = 'dok_matrix.pkl'
>>> f = open(fname, mode="wb")
>>> cPickle.dump(H, f)
>>> f.close()
>>> f = open(fname, mode="rb")
>>> M = cPickle.load(f)
>>> f.close()
>>> print str(M)
(123, 321) -99
(33, 44) 8
>>> M == H
True
>>>
```
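If pickle protocols are a concern, newer SciPy releases (0.19 and later, which postdate this answer) also offer `scipy.sparse.save_npz` and `scipy.sparse.load_npz`. This is a minimal sketch, assuming such a SciPy version, Python 3, and a hypothetical file name; `save_npz` does not handle the DOK format, so the matrix is converted to CSR first:

```python
import os
import tempfile

from scipy import sparse

H = sparse.dok_matrix((135, 654), dtype="int32")
H[33, 44] = 8
H[123, 321] = -99

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "dok_matrix_demo.npz")

# Convert to CSR for saving; DOK is not one of the formats save_npz handles.
sparse.save_npz(path, H.tocsr())

M = sparse.load_npz(path)          # comes back as a CSR matrix
same = (M != H.tocsr()).nnz == 0   # True when no entries differ
print(M[33, 44], M[123, 321], same)
```

The loaded matrix can be converted back with `M.todok()` if the DOK format is needed.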
