Pandas parse issue with missing zeros after thousands separator in CSV file

Question

Given a CSV file with the following content:

actual; shouldbe
1,200;  1200
1,2;    1200
12;     12

I want to read the file so that both columns have equal values. The problem is that the trailing zeros after the thousands separator are missing, so

df = pd.read_csv(file, sep=';', thousands=',')

leads to

   actual  shouldbe
0    1200      1200
1      12      1200
2      12        12

I hope the problem is clear. I have no idea how to clean my data, whether in pandas or with any other Python or non-Python tool.
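
For readers wondering why row 1 comes out as 12: judging from the output above, thousands=',' appears to simply discard the separator before parsing, without checking that each group has three digits. The snippet below is a rough pure-Python model of that behaviour, not pandas' actual parser code; drop_separator is a hypothetical name used only for illustration.

```python
def drop_separator(text: str, sep: str = ",") -> int:
    # Illustration only: remove every separator, then parse what is left.
    # No check is made that the groups after the first have three digits.
    return int(text.replace(sep, ""))


for raw in ["1,200", "1,2", "12"]:
    print(raw, "->", drop_separator(raw))
```

Under this model "1,2" and "12" are indistinguishable once the comma is gone, so the missing zeros have to be restored while the separators are still present.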

Roman Pekar · Accepted Answer

I'm not sure it can be done without some data cleaning after loading:

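>>> import pandas as pd
>>> from io import StringIO
>>> s=u"""actual;shouldbe
... 1,200;1200
... 1,001,21;  1001210
... 1,2;   1200
... 12;   12"""
>>> # No thousands= here, so 'actual' is loaded as strings with the commas intact
>>> df = pd.read_csv(StringIO(s), sep=";")
>>> # Keep the first group as-is; right-pad every later group with zeros to three digits
>>> df['result'] = df.actual.apply(lambda x: ''.join(k if i==0 else k.ljust(3, '0') for i,k in enumerate(x.split(','))))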
>>> df
     actual  shouldbe   result
0     1,200      1200     1200
1  1,001,21   1001210  1001210
2       1,2      1200     1200
3        12        12       12
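
The same padding idea can be pulled out into a small named function that also converts the result to an integer, since the lambda above leaves strings in the result column. This is only a sketch: pad_thousands_groups is a hypothetical helper name, and it assumes the missing digits are always trailing zeros, so an input like "1,02" would be read as 1020.

```python
def pad_thousands_groups(value: str, sep: str = ",") -> int:
    """Rebuild an integer whose thousands groups lost their trailing zeros."""
    head, *groups = value.strip().split(sep)
    # The leading group is kept as-is; every later group is
    # right-padded with zeros to three digits before joining.
    return int(head + "".join(group.ljust(3, "0") for group in groups))


samples = ["1,200", "1,001,21", "1,2", "12"]
print([pad_thousands_groups(s) for s in samples])
# [1200, 1001210, 1200, 12]
```

It could be applied after loading with df.actual.apply(pad_thousands_groups), or, as an untested alternative, passed to read_csv through its converters argument (converters={'actual': pad_thousands_groups}) so the column is cleaned while the file is parsed.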

Tags: python, pandas, dataframe, csv

Corvince
