
Apply converters to all the columns while reading excel file, Python 3.6

I am importing an Excel file with 30 columns into a DataFrame and want to change the type of every column to string. How can I do this without listing each column in converters?

data = pd.read_excel(excelPath, sheetname='Donor', converters={'Code':str})
Learnings asked Nov 04 '25 09:11



2 Answers

In pandas 0.20.0 and later you can pass the dtype='object' parameter:

data = pd.read_excel(excelPath, sheet_name='Donor', dtype='object')

From the docs:

dtype : Type name or dict of column -> type, default None

Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32}

Use object to preserve data as stored in Excel and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.

New in version 0.20.0.
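
One caveat worth checking: as far as I can tell, dtype='object' keeps each cell as whatever Python object was read from Excel, so numeric cells stay numbers rather than becoming strings. If actual strings are needed, passing dtype=str instead is one option. Below is a minimal sketch, where read_sheet_as_text is a hypothetical helper name and the 'Donor' default is taken from the question. Empty cells may still come back as NaN rather than as strings, so treat this as a starting point:

```python
def read_sheet_as_text(excel_path, sheet_name="Donor"):
    """Read one sheet, asking pandas to parse every column as str.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    # Imported lazily so that defining the helper does not require pandas.
    # Reading .xlsx files also needs an Excel engine such as openpyxl.
    import pandas as pd

    # dtype=str applies to all columns; converters are deliberately not
    # passed, because they would take precedence over dtype.
    return pd.read_excel(excel_path, sheet_name=sheet_name, dtype=str)
```

Usage would look like data = read_sheet_as_text(excelPath).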

MaxU - stop WAR against UA answered Nov 06 '25 00:11



In addition to the solution from @Plinus, the following code first reads only the header row (assuming the headers are in row 0) and zero rows of data.

From the headers (column names), it builds a dictionary, converters, that maps each column name to a conversion function.

It then re-reads the whole Excel file using those converters.

import pandas as pd

# First pass: read only the header row to get the column names
columns = pd.read_excel(
    '/pathname/to/excel/file.xlsx',
    sheet_name='Sheet 1',
    nrows=0, # Read 0 rows, assuming headers are at row 0
).columns

converters = {col: str for col in columns} # Convert all fields to strings

# Second pass: read the full sheet, applying str to every column
data = pd.read_excel(
    '/pathname/to/excel/file.xlsx',
    sheet_name='Sheet 1',
    converters=converters
)
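
If this is needed for more than one file, the two passes can be wrapped in a small function. The following is only a sketch: make_str_converters and read_excel_all_str are hypothetical names, and the sheet_name default of 0 (the first sheet) is an assumption, not something from the answer above.

```python
def make_str_converters(columns):
    """Map every column name to str, for use as read_excel(converters=...)."""
    return {col: str for col in columns}


def read_excel_all_str(path, sheet_name=0):
    """Read a sheet twice: once for the headers, once with str converters.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    # Imported lazily so make_str_converters can be used without pandas.
    # Reading .xlsx files also needs an Excel engine such as openpyxl.
    import pandas as pd

    # First pass: nrows=0 returns an empty frame that still has the headers.
    header = pd.read_excel(path, sheet_name=sheet_name, nrows=0)

    # Second pass: apply str to every column found in the header row.
    return pd.read_excel(
        path,
        sheet_name=sheet_name,
        converters=make_str_converters(header.columns),
    )
```

The dictionary step can be checked on its own, for example make_str_converters(['Code', 'Amount']) maps both names to str, while read_excel_all_str('/pathname/to/excel/file.xlsx', sheet_name='Sheet 1') would mirror the call above.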
yoonghm answered Nov 06 '25 00:11
