 

pandas read_csv. How to ignore delimiter before line break

I'm reading a file with numerical values.

data = pd.read_csv('data.dat', sep=' ', header=None)

In the text file, each row ends with a space, so pandas expects a value that is not there and adds a NaN at the end of each row. For example:

2.343 4.234

is read as: [2.343, 4.234, nan]

I can avoid it by passing usecols=[0, 1], but I would prefer a more general solution.

heracho asked Oct 21 '25 08:10


1 Answer

You can use regular expressions in your sep argument.

Instead of specifying a single space as the separator, you can have pandas treat any run of whitespace as one separator, so the space before the line break no longer produces an empty field. You can do this by using the regular expression \s+:

data = pd.read_csv('data.dat', sep=r'\s+', header=None)
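
To compare the two separators without a file on disk, here is a minimal sketch. The SAMPLE string is a made-up stand-in for data.dat (the second row is invented for illustration), and io.StringIO is only used so the snippet runs on its own:

```python
import io

import pandas as pd

# Hypothetical stand-in for data.dat: every row ends with a trailing space.
SAMPLE = "2.343 4.234 \n1.500 3.250 \n"

# sep=" " treats the trailing space as a delimiter, so each row
# gets an extra empty field that pandas fills with NaN.
single_space = pd.read_csv(io.StringIO(SAMPLE), sep=" ", header=None)

# sep=r"\s+" treats any run of whitespace as one separator, so the
# space before the line break does not create an extra column.
any_whitespace = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+", header=None)

print(single_space.shape)    # (2, 3): the third column is all NaN
print(any_whitespace.shape)  # (2, 2)
```

The raw-string prefix (r"\s+") keeps Python from interpreting \s as a string escape, so the backslash reaches pandas unchanged.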
Harry answered Oct 23 '25 20:10


