Hey guys, i have the following problem: 1 process executes a very large query and writes the results to a file, inbetween the process should update an status to the database.
first thaught: NO PROBLEM, pseudo code:
db = mysqldb.connect()
cursor = db.cursor()
large = cursor.execute(SELECT * FROM VERYLARGETABLE)
for result in large.fetchall():
file.write(result)
if timetoUpdateStatus: cursor.execute(UPDATE STATUS)
problem: when getting 9 million results the "large = cursor.execute(SELECT * FROM VERYLARGETABLE)" never finishes... i figured out a border at 2 million entrys at 4 columns where the mysql server finished the query after 30 seconds but the python process keeps running for hours... that maybe a bug in the Python MySQLDB library..
SO SECOND TRY: db.query function with db.use_results() and fetch_row():
db = mysqldb.connect()
cursor = db.cursor()
db.query(SELECT * FROM VERYLARGETABLE)
large = large.use_result()
while true:
for row in large.fetch_row(100000):
file.write(row)
if timetoUpdateStatus: cursor.execute(UPDATE STATUS) <-- ERROR (2014, "Commands out of sync; you can't run this command now")
so THIRD TRY was using 2 MySQL connections... which doesnt work, when i open a second connection the first one disappears....
any suggestions??
Try using a MySQL SSCursor. It will keep the result set in the server (MySQL data structure), rather than transfer the result set to the client (Python data structure) which is what the default cursor does. Using an SSCursor will avoid the long initial delay caused by the default cursor trying to build a Python data structure -- and allocate memory for -- the huge result set. Thus, the SSCursor should also require less memory.
import MySQLdb
import MySQLdb.cursors
import config
cons = [MySQLdb.connect(
host=config.HOST, user=config.USER,
passwd=config.PASS, db=config.MYDB,
cursorclass=MySQLdb.cursors.SSCursor) for i in range(2)]
select_cur, update_cur = [con.cursor() for con in cons]
select_cur.execute(SELECT * FROM VERYLARGETABLE)
for i, row in enumerate(select_cur):
print(row)
if i % 100000 == 0 or timetoUpdateStatus:
update_cur.execute(UPDATE STATUS)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With