How to create a DataFrame from custom values

I am reading a text file in which each line holds multiple values. I parse each line with a function, parse, that returns the values I need.

def parse(line):
    ......
    ......
    return line[0],line[2],line[5]

I want to create a DataFrame with each line as a row and the three returned values as columns.

df = pd.DataFrame()

with open('data.txt') as f:
    for line in f:
        df.append(list(parse(line)))

When I run the above code, I get all the values in a single column. Is it possible to get them in a proper tabular format?


1 Answer

You shouldn't call .append on a DataFrame in a loop: it returns a new DataFrame each time, which is very inefficient. Build the rows first and construct the DataFrame once:

colnames = ['col1','col2','col3'] # or whatever you want
with open('data.txt') as f:
    df = pd.DataFrame([parse(l) for l in f], columns=colnames)
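
For reference, here is a minimal self-contained sketch of the same approach. The parse body and the sample data are assumptions made up for illustration (the question elides the real parsing logic), and io.StringIO stands in for open('data.txt') so the snippet runs without a file:

```python
import io

import pandas as pd


def parse(line):
    # Hypothetical stand-in for the question's parse(): split on whitespace
    # and keep fields 0, 2 and 5.
    fields = line.split()
    return fields[0], fields[2], fields[5]


# Made-up sample data; StringIO behaves like the open file handle.
sample = io.StringIO("a b c d e f\ng h i j k l\n")

colnames = ['col1', 'col2', 'col3']

# Collect one tuple per non-blank line, then build the DataFrame in one call.
rows = [parse(line) for line in sample if line.strip()]
df = pd.DataFrame(rows, columns=colnames)
print(df)
```

Each tuple returned by parse becomes one row, and columns= names the three positions.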

Note that the fundamental problem is that pd.DataFrame.append expects another DataFrame and appends that DataFrame's rows. It interprets a flat list as a series of single-value rows, so every value ends up in its own row of a single column. If you structure your list as a list of rows, it works as intended. But you shouldn't be using .append here anyway:

In [6]: df.append([1,2,3])
Out[6]:
   0
0  1
1  2
2  3

In [7]: df = pd.DataFrame()

In [8]: df.append([[1, 2, 3]])
Out[8]:
   0  1  2
0  1  2  3
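
DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the session above only runs on older versions. As a rough sketch, the same flat-list versus list-of-rows distinction can be seen with the DataFrame constructor, and pd.concat covers the case where rows really do need to be added to an existing frame. The variable names here are illustrative only:

```python
import pandas as pd

# A flat list is read as three rows of one value each.
flat = pd.DataFrame([1, 2, 3])

# A list containing one list is read as a single row with three columns.
nested = pd.DataFrame([[1, 2, 3]])

print(flat.shape)
print(nested.shape)

# To add rows without .append, gather them and concatenate once.
more = pd.DataFrame([[4, 5, 6], [7, 8, 9]])
combined = pd.concat([nested, more], ignore_index=True)
print(combined)
```

Concatenating once at the end keeps the cost linear in the number of rows, for the same reason building the DataFrame from a list is preferred over appending in a loop.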
juanpa.arrivillaga answered Dec 01 '25 20:12