I have a dataframe with about 200M rows, for example:
Date         tableName    attributeName
29/03/2019   tableA       attributeA
....
and I want to save the dataframe to a table in a MySQL database. This is what I've tried:
import mysql.connector

def insertToTableDB(tableName, dataFrame):
    mysqlCon = mysql.connector.connect(host='localhost', user='root', passwd='')
    cursor = mysqlCon.cursor()
    # Build the statement once; only the row values change per iteration
    query = "INSERT INTO `{0}`(`Date`, `tableName`, `attributeName`) VALUES (%s,%s,%s);".format(tableName)
    for index, row in dataFrame.iterrows():
        myList = [row.Date, row.tableName, row.attributeName]
        cursor.execute(query, myList)
        print(myList)
    try:
        mysqlCon.commit()
        cursor.close()
        print("Done")
        return tableName, dataFrame
    except mysql.connector.Error:
        cursor.close()
        print("Fail")
This code worked when I inserted a dataframe with 2M rows. But when I inserted the dataframe with 200M rows, I got this error:
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\cursor.py", line 569, in execute
self._handle_result(self._connection.cmd_query(stmt))
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 553, in cmd_query
result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
File "C:\Users\User\Anaconda3\lib\site-packages\mysql\connector\connection.py", line 442, in _handle_result
raise errors.get_exception(packet)
ProgrammingError: Unknown column 'nan' in 'field list'
My dataframe doesn't have any 'nan' values. Could someone help me solve this problem?
Thank you so much.
Replace every NaN with the string 'empty':
df = df.replace(np.nan, 'empty')
Remember to:
import numpy as np
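As a minimal sketch of the fix (the column names and a deliberately missing value below mirror the question's example, not your real data), you can verify the replacement on a small frame before running the full insert:

```python
import numpy as np
import pandas as pd

# Miniature version of the question's dataframe, with one missing value
df = pd.DataFrame({
    "Date": ["29/03/2019", "30/03/2019"],
    "tableName": ["tableA", "tableB"],
    "attributeName": ["attributeA", np.nan],
})

# Replace every NaN with the string 'empty', so the parameter binding
# never sends a float NaN to MySQL as an unquoted 'nan' token
df = df.replace(np.nan, "empty")

print(df.loc[1, "attributeName"])  # empty
```

With a 200M-row frame it is also worth checking `df.isnull().values.any()` first, since even one NaN anywhere in the frame triggers the error on that row's INSERT.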