Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

python mysqldb multiple connections

Hey guys, i have the following problem: 1 process executes a very large query and writes the results to a file, inbetween the process should update an status to the database.

first thaught: NO PROBLEM, pseudo code:

db = mysqldb.connect()
cursor = db.cursor()
large = cursor.execute(SELECT * FROM VERYLARGETABLE)
for result in large.fetchall():
     file.write(result)
if timetoUpdateStatus: cursor.execute(UPDATE STATUS)

problem: when getting 9 million results the "large = cursor.execute(SELECT * FROM VERYLARGETABLE)" never finishes... i figured out a border at 2 million entrys at 4 columns where the mysql server finished the query after 30 seconds but the python process keeps running for hours... that maybe a bug in the Python MySQLDB library..

SO SECOND TRY: db.query function with db.use_results() and fetch_row():

db = mysqldb.connect()
cursor = db.cursor()
db.query(SELECT * FROM VERYLARGETABLE)
large = large.use_result()
while true:
    for row in large.fetch_row(100000):
        file.write(row)
    if timetoUpdateStatus: cursor.execute(UPDATE STATUS) <-- ERROR (2014, "Commands out of sync; you can't run this command now")

so THIRD TRY was using 2 MySQL connections... which doesnt work, when i open a second connection the first one disappears....

any suggestions??

like image 580
TekTimmy Avatar asked Dec 07 '25 12:12

TekTimmy


1 Answers

Try using a MySQL SSCursor. It will keep the result set in the server (MySQL data structure), rather than transfer the result set to the client (Python data structure) which is what the default cursor does. Using an SSCursor will avoid the long initial delay caused by the default cursor trying to build a Python data structure -- and allocate memory for -- the huge result set. Thus, the SSCursor should also require less memory.

import MySQLdb
import MySQLdb.cursors
import config

cons = [MySQLdb.connect(
    host=config.HOST, user=config.USER,
    passwd=config.PASS, db=config.MYDB,
    cursorclass=MySQLdb.cursors.SSCursor) for i in range(2)]
select_cur, update_cur = [con.cursor() for con in cons]
select_cur.execute(SELECT * FROM VERYLARGETABLE)
for i, row in enumerate(select_cur):
    print(row)
    if i % 100000 == 0 or timetoUpdateStatus:
        update_cur.execute(UPDATE STATUS)
like image 128
unutbu Avatar answered Dec 10 '25 01:12

unutbu