I have these columns:
['Campaign', 'Ad group', 'Keyword', 'Status', 'Match type', 'Max. CPC', 'Quality score', 'Impressions', 'Clicks', 'CTR', 'Avg. CPC', 'Cost', 'Avg. position', 'Converted clicks', 'Click conversion rate', 'Cost / converted click', 'Bounce rate', 'Pages / session', 'Avg. session duration (seconds)', '% new sessions']
The error I'm receiving says:
Warning (from warnings module):
File "C:\Python34\lib\site-packages\pandas\io\parsers.py", line 1164
data = self._reader.read(nrows)
DtypeWarning: Columns (5) have mixed types. Specify dtype option on import or set low_memory=False.
What does the Columns (5) part mean? Is that the column position? Does Campaign column start at position 0 or 1?
Also, I suspect this error is because my Max. CPC column has ' --' in a few areas instead of zeros. I want this column datatype to be a float. How do I translate these ' --' to 0.00 and also set this column as a float datatype when reading the CSV?
I've tried:
import pandas as pd
import numpy as np
df = pd.read_csv('file.csv', dtype={'Max. CPC': np.float64})
print(df.head())
But get a ValueError:
ValueError: could not convert string to float: ' --'
Columns (5) is the zero-based column position, so Campaign is column 0 and column 5 is Max. CPC, which matches your suspicion about the ' --' values.
There are two approaches I can think of. The first is to pass read_csv a list of values to treat as NaN. The ' --' entries are then parsed as NaN, so the column can be read as a float rather than object:
df = pd.read_csv('file.csv', dtype={'Max. CPC': np.float64}, na_values=[' --'])
You can then convert these NaN values to 0.00 by calling fillna:
df['Max. CPC'] = df['Max. CPC'].fillna(0.00)
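As a quick way to try the na_values approach without the real file, here is a minimal sketch. The two-column sample data and the function name load_with_na_values are made up for illustration; only the read_csv and fillna calls come from the answer above.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for file.csv; the real file has many more columns.
SAMPLE_CSV = "Keyword,Max. CPC\nshoes,1.25\nboots, --\nsandals,0.80\n"


def load_with_na_values(source):
    """Read the CSV treating ' --' as missing, then fill the gaps with 0.0."""
    df = pd.read_csv(
        source,
        dtype={"Max. CPC": np.float64},  # force the column to float
        na_values=[" --"],               # the leading space is part of the marker
    )
    df["Max. CPC"] = df["Max. CPC"].fillna(0.0)
    return df


df_na = load_with_na_values(io.StringIO(SAMPLE_CSV))
print(df_na.dtypes)
print(df_na)
```

Note that na_values applies to every column unless you pass a dict such as {'Max. CPC': [' --']}, which limits the substitution to that one column.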
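The other is to load the file without the dtype argument (passing it would raise the ValueError above), replace these values with 0.00, and then convert the column to float:
df['Max. CPC'] = df['Max. CPC'].replace(' --', 0.00).astype(float)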
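The replace-after-loading route can be sketched the same way. The sample data and the function name load_then_replace are again hypothetical. This variant swaps ' --' for the string '0' rather than the float 0.00, an assumption made here so the column holds only strings until the final conversion.

```python
import io

import pandas as pd

# Hypothetical stand-in for file.csv; the real file has many more columns.
SAMPLE_CSV = "Keyword,Max. CPC\nshoes,1.25\nboots, --\nsandals,0.80\n"


def load_then_replace(source):
    """Read the CSV as-is, swap ' --' for '0', then cast the column to float."""
    df = pd.read_csv(source)  # 'Max. CPC' is read as strings because of ' --'
    cleaned = df["Max. CPC"].replace(" --", "0")  # exact-match replacement
    df["Max. CPC"] = cleaned.astype(float)        # numeric strings become floats
    return df


df_rep = load_then_replace(io.StringIO(SAMPLE_CSV))
print(df_rep.dtypes)
print(df_rep)
```

The explicit astype(float) matters: without it the column would keep its non-numeric dtype even after the placeholder is gone.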