I'm new to Python and I have a problem. I have some measured data saved in a txt file. The data is separated by tabs and has this structure:
0 0 -11.007001 -14.222319 2.336769
I always have 32 data points per simulation (0, 1, 2, ..., 31) and I have 300 simulations (0, 1, 2, ..., 299), so the data is sorted first by simulation number and then by data point number.
The first column is the simulation number, the second column is the data point number, and the other 3 columns are the x, y, z coordinates.
I would like to create a 3D array: the first dimension should be the simulation number, the second the data point number, and the third the three coordinates.
I have already started, and here is what I have so far:
## read file
coords = [x.split('\t') for x in
          open(f,'r').read().replace('\r','')[:-1].split('\n')]
## extract the information you want
simnum = [int(x[0]) for x in coords]
npts = [int(x[1]) for x in coords]
# inner list comprehension instead of map(), which is lazy on Python 3
xyz = array([[float(v) for v in x[2:]] for x in coords])
But I don't know how to combine these 2 lists and this one array.
In the end I would like to have something like this:
array = [simnum][num_dat_point][xyz]
Thanks for your help.
I hope you understand my problem. It's my first post in a Python forum, so if I did anything wrong, I'm sorry.
Thanks again.
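You can combine them with the zip function. Each row of xyz already holds one (x, y, z) triple, so zip over the array itself and unpack the row:
for sim, datapoint, (x, y, z) in zip(simnum, npts, xyz):
    pass  # do your thing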
Or you could avoid list comprehensions altogether and just iterate over the lines of the file:
for line in open(fname):
    lst = line.split('\t')
    sim, datapoint = int(lst[0]), int(lst[1])
    x, y, z = [float(i) for i in lst[2:]]
    # do your thing
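To split the file into rows, you could (and should) iterate over the file object directly instead of reading the whole file and splitting on newlines yourself. The last field of each row keeps its trailing newline, which float() ignores:
coords = [x.split('\t') for x in open(fname)]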
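If the goal is the 3D array from the question, the line-by-line loop can write each triple straight into a preallocated NumPy array, using the simulation number and data point number as indices. This is only a minimal sketch: the sample text, the names n_sims, n_pts and coords3d, and the use of io.StringIO in place of the real file are assumptions for illustration.

```python
import io
import numpy as np

# Hypothetical sample: 2 simulations x 2 data points, tab-separated.
sample = (
    "0\t0\t-11.007001\t-14.222319\t2.336769\n"
    "0\t1\t1.0\t2.0\t3.0\n"
    "1\t0\t4.0\t5.0\t6.0\n"
    "1\t1\t7.0\t8.0\t9.0\n"
)

n_sims, n_pts = 2, 2  # would be 300 and 32 for the data in the question

# coords3d[sim, point] holds the (x, y, z) triple of that data point.
coords3d = np.empty((n_sims, n_pts, 3))

for line in io.StringIO(sample):  # use open(fname) for a real file
    fields = line.split('\t')
    sim, datapoint = int(fields[0]), int(fields[1])
    coords3d[sim, datapoint] = [float(v) for v in fields[2:]]

print(coords3d.shape)  # (2, 2, 3)
```

Because every row carries its own indices, this version does not depend on the rows being sorted, but it does assume that every (simulation, data point) pair appears in the file; any missing pair would leave uninitialized values in the array.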
This seems like a good opportunity to use itertools.groupby.
import itertools
import csv
file = open("data.txt")
reader = csv.reader(file, delimiter='\t')
result = []
for simnumberStr, rows in itertools.groupby(reader, key=lambda t: t[0]):
    simData = []
    for row in rows:
        simData.append([float(v) for v in row[2:]])
    result.append(simData)
file.close()
This will create a 3-dimensional list named 'result'. The first index is the simulation number, and the second index is the data index within that simulation. The value is a list of floats containing the x, y, and z coordinates.
Note that this assumes the data is already sorted on simulation number and data number.
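Under the same sorting assumption, and given that every simulation has the same number of data points, the whole file could also be loaded with NumPy and reshaped in one step. The sketch below is an alternative to the groupby approach, not part of it; the sample text and the names n_sims, n_pts, table and coords3d are made up for illustration.

```python
import io
import numpy as np

# Hypothetical sample: 2 simulations x 2 data points, tab-separated and
# already sorted by simulation number, then by data point number.
sample = (
    "0\t0\t-11.007001\t-14.222319\t2.336769\n"
    "0\t1\t1.0\t2.0\t3.0\n"
    "1\t0\t4.0\t5.0\t6.0\n"
    "1\t1\t7.0\t8.0\t9.0\n"
)

n_sims, n_pts = 2, 2  # would be 300 and 32 for the data in the question

# loadtxt accepts a filename or a file-like object; each row becomes 5 floats.
table = np.loadtxt(io.StringIO(sample), delimiter='\t')

# Drop the two index columns, then fold the rows into (sim, point, xyz).
coords3d = table[:, 2:].reshape(n_sims, n_pts, 3)

print(coords3d.shape)  # (2, 2, 3)
```

The reshape relies entirely on row order, so unsorted or incomplete data would end up in the wrong slots without any error being raised.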