I am importing an Excel file with 30 columns into a DataFrame and want to change the type of every column to string. How can I do this?
data = pd.read_excel(excelPath, sheet_name='Donor', converters={'Code': str})
For pandas 0.20.0+ you can use the dtype parameter (dtype='object' keeps values as stored in Excel; dtype=str converts them to strings):
data = pd.read_excel(excelPath, sheet_name='Donor', dtype='object')
From the docs:
dtype : Type name or dict of column -> type, default None
Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32}
Use object to preserve data as stored in Excel and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.
New in version 0.20.0.
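If the frame has already been loaded, converting afterwards with DataFrame.astype(str) is another option. The sketch below uses a made-up in-memory frame as a stand-in for the result of read_excel; the column names and values are assumptions for illustration only. Note that this runs after type inference, so a number already parsed as a float keeps its float formatting.

```python
import pandas as pd

# Stand-in for a frame that read_excel has already parsed with inferred dtypes
# (hypothetical columns and values, not taken from the question's workbook).
df = pd.DataFrame({"Code": [7, 42], "Amount": [1.0, 2.5]})

# Convert every column to strings after loading.
as_text = df.astype(str)

print(as_text["Code"].tolist())    # ['7', '42']
print(as_text["Amount"].tolist())  # ['1.0', '2.5']
```

If the original text representation matters, reading with dtype or converters at load time is probably the safer route.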
In addition to the solution from @Plinus, the following code first reads only the headers (assuming they are in row 0) and no data rows.
From the headers (column names), it builds a dictionary, converters, that maps each column name to a data conversion function.
It then re-reads the whole Excel file using those converters.
import pandas as pd

# First pass: read only the header row to get the column names
columns = pd.read_excel(
    '/pathname/to/excel/file.xlsx',
    sheet_name='Sheet 1',
    nrows=0,  # Read 0 rows, assuming headers are at row 0
).columns
converters = {col: str for col in columns}  # Convert all fields to strings
# Second pass: re-read the whole sheet, applying str to every column
data = pd.read_excel(
    '/pathname/to/excel/file.xlsx',
    sheet_name='Sheet 1',
    converters=converters,
)
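The two-pass approach can be wrapped in a small helper and tried on a throwaway workbook. This is only a sketch: read_excel_all_str is a hypothetical name, the demo data is made up, and it assumes an Excel engine such as openpyxl is installed so pandas can write and read .xlsx files.

```python
import os
import tempfile

import pandas as pd


def read_excel_all_str(path, sheet_name=0):
    """Hypothetical helper: read one sheet with every column passed through str."""
    # Pass 1: header only, to learn the column names.
    columns = pd.read_excel(path, sheet_name=sheet_name, nrows=0).columns
    # Pass 2: full read with one str converter per column.
    return pd.read_excel(
        path,
        sheet_name=sheet_name,
        converters={col: str for col in columns},
    )


# Round-trip demo on a temporary workbook (made-up data).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.xlsx")
    pd.DataFrame({"Code": [7, 42], "Name": ["Ann", "Bob"]}).to_excel(
        demo_path, sheet_name="Donor", index=False
    )
    demo = read_excel_all_str(demo_path, sheet_name="Donor")

print(demo["Code"].tolist())
print(demo["Name"].tolist())
```

The cost of this design is that the file is opened twice, which may be noticeable on large workbooks.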