Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Concurency issue with SELECT FOR UPDATE in Postgres 9.3

I'm struggling with this issue since two days. We have a solution where multiple worker threads will try to select job requests from a single database/table, by setting a flag on the selected requests and thus effectively blocking the other workers to select the same requests.

I created a java test application to test my queries, but while in normal situations the test executes without issue, in high contention situation (ex. 1 table entry with 50 threads; no delays or processing) I still have threads which obtain the same request/entry, interestingly it happens when the test just starts. I cannot understand why. I've read all relevant Postgres locking and isolation related documentation... While is possible that the issue is with the test application itself, I suspect that I'm missing something about how the SELECT FOR UPDATE works in READ COMMITTED isolation context.

So the question would be can SELECT FOR UPDATE (with READ COMMITED isolation) guarantee that a general concurrency issue like I described can be safely solved?

Acquire query:

UPDATE mytable SET status = 'LOCK'
  WHERE ctid IN (SELECT ctid FROM mytable
    WHERE status = 'FREE'
    ORDER BY id
    LIMIT %d
    FOR UPDATE)
  RETURNING id, status;

Release query:

UPDATE mytable SET status = 'FREE'
  WHERE id = %d AND status = 'LOCK'
  RETURNING id, status;

So would you consider these two queries should be safe, or there is some weird case possible that would allow two threads to acquire the same row? I'd like to mention that I tried also SERIALIZABLE isolation and didn't helped.

like image 736
arpadf Avatar asked Oct 27 '25 03:10

arpadf


1 Answers

Another aspect I'd like so share is regarding the speed of the Acquire query with LIMIT 2. See the test result:

Starting test...
DB setup done
All threads created & connections made
All threads started
Thread[36] 186/190/624=1000
Thread[19] 184/201/615=1000
Thread[12] 230/211/559=1000
Thread[46] 175/200/625=1000
Thread[ 9] 205/211/584=1000
...
Thread[ 4] 189/232/579=1000
Thread[ 3] 185/198/617=1000
Thread[49] 218/204/578=1000
Thread[ 1] 204/203/593=1000
...
Thread[37] 177/163/660=1000
Thread[31] 168/199/633=1000
Thread[18] 174/187/639=1000
Thread[42] 178/229/593=1000
Thread[29] 201/229/570=1000
...
Thread[10] 203/198/599=1000
Thread[25] 215/210/575=1000
Thread[27] 248/191/561=1000
...
Thread[17] 311/192/497=1000
Thread[ 8] 365/198/437=1000
Thread[15] 389/176/435=1000
All threads finished
Execution time: 31408
Test done; exiting

Compare the above with this query :

UPDATE mytable SET status = 'LOCK'
  WHERE id IN (SELECT t1.id FROM (SELECT id FROM mytable 
      WHERE status = 'FREE' ORDER BY id LIMIT 2) AS t1
    FOR UPDATE)
  RETURNING id, status;

and the result:

Starting test...
DB setup done
All threads created & connections made
All threads started
Thread[29] 32/121/847=1000
Thread[22] 61/151/788=1000
Thread[46] 36/114/850=1000
Thread[41] 57/132/811=1000
Thread[24] 49/146/805=1000
Thread[13] 47/135/818=1000
...
Thread[20] 48/118/834=1000
Thread[47] 65/152/783=1000
Thread[18] 51/146/803=1000
Thread[ 8] 69/158/773=1000
Thread[14] 56/158/786=1000
Thread[ 0] 66/161/773=1000
Thread[38] 60/148/792=1000
Thread[27] 69/158/773=1000
...
Thread[45] 78/177/745=1000
Thread[30] 96/162/742=1000
...
Thread[32] 162/167/671=1000
Thread[17] 329/156/515=1000
Thread[33] 337/178/485=1000
Thread[37] 381/172/447=1000
All threads finished
Execution time: 15490
Test done; exiting

Conclusion

The test prints for each thread how many times the Acquire query returned 2, 1 or 0 resources totalling the number of test loops (1000).

From the above results we can conclude that we can speed up the query (halfing the time!) at the cost of increasing the thread contention. This means that we will receive more times 0 resources back from the Acquire query. Technically this is not a problem because we need to treat this situation anyway.

Of course situation changes if you add a wait time (sleeping) when no resources are returned, but choosing a correct value for the wait time depends on the application performance requirements...

like image 156
arpadf Avatar answered Oct 28 '25 17:10

arpadf



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!