I have a file containing two datasets, which I'd like to read into Python as two columns.
The data is in the form:
xxx yyy    xxx yyy   xxx yyy
and so on, so I understand that I need to somehow split it up. I'm new to Python (and relatively new to programming), so I've struggled a bit so far. At the moment I've tried to use:
def read(file):
    column1=[]
    column2=[]
    readfile = open(file, 'r')
    a = (readfile.read())
    readfile.close()
How would I go about splitting the contents of the file into column1 and column2?
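One way to finish the function above using only the standard library is sketched below. It assumes the file holds nothing but whitespace-separated values that alternate x, y, x, y; the name read_columns is made up for illustration, and the values are left as strings (wrap them in float() if they are numbers).

```python
def read_columns(path):
    """Return (column1, column2) from a file of alternating x y values."""
    # The with-block closes the file automatically, even on errors.
    with open(path, "r") as handle:
        # split() with no argument splits on any run of spaces, tabs or newlines.
        tokens = handle.read().split()

    # Assumption: every x has a matching y, so the count must be even.
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of values")

    column1 = tokens[0::2]  # items 0, 2, 4, ... are the x values
    column2 = tokens[1::2]  # items 1, 3, 5, ... are the y values
    return column1, column2
```

Calling column1, column2 = read_columns("data.txt") would then give two lists of equal length, in the order the pairs appear in the file.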
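This is quite simple with the Python module pandas: read the file into a wide table with read_csv, then stack the x columns and the y columns with concat. Suppose you have a data file like this: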
>cat data.txt
xxx  yyy  xxx  yyy  xxx yyy
xxx yyy    xxx yyy   xxx yyy
xxx yyy  xxx yyy   xxx yyy
xxx yyy    xxx yyy  xxx yyy
xxx yyy    xxx  yyy   xxx yyy
>from pandas import DataFrame
>from pandas import read_csv
>from pandas import concat
>dfin = read_csv("data.txt", header=None, delimiter=r"\s+").add_prefix('X')
> dfin
    X0   X1   X2   X3   X4   X5
0  xxx  yyy  xxx  yyy  xxx  yyy
1  xxx  yyy  xxx  yyy  xxx  yyy
2  xxx  yyy  xxx  yyy  xxx  yyy
3  xxx  yyy  xxx  yyy  xxx  yyy
4  xxx  yyy  xxx  yyy  xxx  yyy
>dfout = DataFrame()
>dfout['X0'] = concat([dfin['X0'], dfin['X2'], dfin['X4']], axis=0, ignore_index=True)
>dfout['X1'] = concat([dfin['X1'], dfin['X3'], dfin['X5']], axis=0, ignore_index=True)
> dfout
     X0   X1
0   xxx  yyy
1   xxx  yyy
2   xxx  yyy
3   xxx  yyy
4   xxx  yyy
5   xxx  yyy
6   xxx  yyy
7   xxx  yyy
8   xxx  yyy
9   xxx  yyy
10  xxx  yyy
11  xxx  yyy
12  xxx  yyy
13  xxx  yyy
14  xxx  yyy
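If the number of pairs per line is not always three, listing the columns by hand gets tedious. A possible alternative, shown here only as a sketch, is to reshape the underlying array into rows of two. The sample text below is made up to stand in for data.txt, and every line is assumed to hold the same even number of values. Note that this keeps the pairs in file order (left to right, line by line), whereas the concat approach above stacks all first pairs, then all second pairs, and so on.

```python
import io

import pandas as pd

# Hypothetical sample standing in for data.txt: three (x, y) pairs per line.
sample = io.StringIO("1 2    3 4   5 6\n7 8  9 10   11 12\n")

# Split on any run of whitespace; columns get the default labels 0..5.
wide = pd.read_csv(sample, header=None, sep=r"\s+")

# Turn each row of 2*k values into k rows of 2 values (x, y).
pairs = wide.to_numpy().reshape(-1, 2)

dfout = pd.DataFrame(pairs, columns=["X0", "X1"])
print(dfout)
```

Here dfout["X0"] holds the x values and dfout["X1"] the y values, one pair per row.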
Hope it helps. Best.