Creating a matrix from CSV file

Question

I've been working on Python for around 2 months now so I have a OK understanding of it.

My goal is to create a matrix using CSV data, then populating that matrix from the data in the 3rd column of that CSV file.

I came up with this code thus far:

import csv

import csv
def readcsv(csvfile_name):
      with open(csvfile_name) as csvfile:
      file=csv.reader(csvfile, delimiter=",")

#remove rubbish data in first few rows

      skiprows = int(input('Number of rows to skip? '))
           for i in range(skiprows):
                _ = next(file)

#change strings into integers/floats

            for z in file:
                 z[:2]=map(int, z[:2])
                 z[2:]=map(float, z[2:])
                 print(z[:2])
        return

After removing the rubbish data with the above code, the data in the CSV file looks like this:

   Input:
   1  1  51 9 3 
   1  2  39 4 4
   1  3  40 3 9
   1  4  60 2 . 
   1  5  80 2 .
   2  1  40 6 .
   2  2  28 4 .
   2  3  40 2 .
   2  4  39 3 . 
   3  1  10 . .
   3  2  20 . .
   3  3  30 . .
   3  4  40 . .
   .  .   . . .

The output should look like this:

      1   2   3   4  .  .
   1  51  39  40  60
   2  40  28  40  39
   3  10  20  30  40
   .
   .

There are about a few thousand rows and columns in this CSV file, however I'm only interested is the first 3 columns of the CSV file. So the first and second columns are basically like co-ordinates for the matrix, and then populating the matrix with data in the 3rd column.

After lots of trial and error, I realised that numpy was the way to go with matrices. This is what I tried thus far with example data:

  left_column =   [1, 2, 1, 2, 1, 2, 1, 2]
  middle_column = [1, 1, 3, 3, 2, 2, 4, 4]
  right_column =  [1., 5., 3., 7., 2., 6., 4., 8.]

  import numpy as np
  m = np.zeros((max(left_column), max(middle_column)), dtype=np.float)
  for x, y, z in zip(left_column, middle_column, right_column):
      x -= 1 # Because the indicies are 1-based
      y -= 1 # Need to be 0-based
      m[x, y] = z
  print(m)

  #: array([[ 1., 2., 3., 4.],
  #:        [ 5., 6., 7., 8.]])

However, it is unrealistic for me to specify all of my data in my script to generate the matrix. I tried using the generators to pull the data out of my CSV file but it didn't work well for me.

I learnt as much numpy as I could, however it appears like it requires my data to already be in matrix form, which it isn't.

Scott Griffiths · Accepted Answer

You should seriously consider using pandas. It is really ideal for this sort of work. I can't give you an actual solution because I don't have your data, but I would try something like the following:

import pandas as pd
df = pd.read_csv('test.csv', usecols=[0,1,2], names=['A', 'B', 'C'])
pd.pivot_table(df, index='A', columns='B', values='C')

The second line imports the data into a pandas DataFrame object (change the names into something more useful for your application). The pivot table creates the matrix you are looking for, and gracefully handles any missing data.

Saullo G. P. Castro · Answer

You can use scipy.sparse.coo_matrix to load this data very conveniently.

Working with your input:

 Input:
   1  1  51 9 3 
   1  2  39 4 4
   1  3  40 3 9
   1  4  60 2 . 
   1  5  80 2 .
   2  1  40 6 .
   2  2  28 4 .
   2  3  40 2 .
   2  4  39 3 . 
   3  1  10 . .
   3  2  20 . .
   3  3  30 . .
   3  4  40 . .
   .  .   . . .

You could do:

l, c, v = np.loadtxt('test.txt', skiprows=1).T
m = coo_matrix((v, (l-1, c-1)), shape=(l.max(), c.max()))

Then you can convert the coo_matrix to a np.ndarray:

In [9]: m.toarray()
Out[9]:
array([[ 51.,  39.,  40.,  60.,  80.],
       [ 40.,  28.,  40.,  39.,   0.],
       [ 10.,  20.,  30.,  40.,   0.]])

Creating a matrix from CSV file

Tags:

python

parsing

csv

matrix

numpy

dizzyLife

2 Answers

Scott Griffiths

Saullo G. P. Castro

Recent Activity

Donate For Us

Creating a matrix from CSV file

Tags:

python

parsing

csv

matrix

numpy

dizzyLife

2 Answers

Scott Griffiths

Saullo G. P. Castro

Related questions

Recent Activity

Donate For Us