having an issue with handling oddly formatted excel data and writing to CSV in a string format. In my sample data, the excel table I am importing has a column ('Item_Number') and the odd data in the cell looks like: ="0001", ="00201", 2002AA, 1003B.
When I try to output to csv, the results look like: 1, 201, 2002AA, 1003B. When I try to output to excel, the results are correct: 0001, 00201, 2002AA, 1003B.
All of the dtypes are objects. Am I missing a parameter in my .to_csv() command?
df = pd.read_excel(filename,sheetname='Sheet1', converters= {'Item_Number':str})
df.to_csv('Test_csv.csv')
df.to_excel('Test_excel.xlsx')
Tried different iterations of replacing the "=" and " " " but no response.
df.Item_Number.str.replace('=','')
Currently using the excel output but curious if there is a way to preserve string formatting in CSV. Thanks :)
Opening an excel spreadsheet with Python 3 Pandas that has data that looks like ="0001" will go to the dataframe correctly. CSV will turn it back to "1". Keeping the same format to CSV is apparently a known issue (from my comment above). To keep the formatting I have to add =" " back into the data like this:
df['Item_Number'] = '="' + df['Item_Number'] + '"'
Not sure if there is a cleaner version to that will have an Excel opened CSV file show 0001 without the quotes and equals sign.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With