 

ActiveRecord Postgres database not locking - getting race conditions

I'm struggling to lock a PostgreSQL table I'm working on. Ideally I want to lock the entire table, but locking individual rows will do as long as the locks actually work.

I have several concurrent Ruby scripts that all query a central jobs database on AWS (via a DatabaseAccessor class), find a job that hasn't been started yet, change its status to started, and carry it out. The problem is that, since these all run at once, they typically all find the same unstarted job at the same time and begin carrying it out, wasting time and muddying the results.

I've tried a bunch of things (.lock, .transaction, the fatalistic gem), but none of them seem to work, at least not in pry.

My code is as follows:

class DatabaseAccessor
  require 'pg'
  require 'pry'
  require 'active_record'
  class Jobs < ActiveRecord::Base
    enum status: [ :unstarted, :started, :slow, :completed]
  end

  def initialize(db_credentials)
    ActiveRecord::Base.establish_connection(
      adapter:  db_credentials[:adapter],
      database: db_credentials[:database],
      username: db_credentials[:username],
      password: db_credentials[:password],
      host:     db_credentials[:host]
    )
  end

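  def find_unstarted_job
    # Not atomic: another script can read the same row before this update lands.
    job = Jobs.where(status: 0).first  # .first returns a record (or nil), not a relation
    job&.started!
    job
  end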
end

Does anyone have any suggestions?

EDIT: It seems that LOCK TABLE jobs IN ACCESS EXCLUSIVE MODE; is the way to do this - however, I'm struggling with then returning the results of this after updating. RETURNING * will return the results after an update, but not inside a transaction.

asked Dec 01 '25 06:12 by Joe


1 Answer

SOLVED!

So the key here is locking in Postgres. There are a few different table-level lock modes, detailed in the PostgreSQL documentation on explicit locking.

There are three factors here in making a decision:

  1. Reads aren't thread safe. Two threads reading the same record will result in that job being run multiple times at once.
  2. Apart from being created, and the initial read and update that marks it as started, a record is only updated once more (to be marked as completed). Scripts that create new records will not read the table.
  3. Reading varies in frequency. Waiting for an unlock is non-critical.

Given these factors, a lock that blocked reads while still allowing writes would be enough. No table-level lock mode does that, so ACCESS EXCLUSIVE is our best option.
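To make the first factor concrete, here is a small in-memory analogy. It does not touch PostgreSQL: the `Job` struct, the `claim_next` helper and the `Mutex` are hypothetical stand-ins for the jobs table, the claim step and the table lock. The point is only that "find an unstarted job" and "mark it started" have to happen as one step under the lock.

```ruby
# In-memory analogy only: a Mutex plays the role of the table lock.
Job = Struct.new(:id, :status)

def claim_next(jobs, lock)
  # Holding the lock makes "find unstarted" + "mark started" a single step,
  # so no two workers can pick the same job.
  lock.synchronize do
    job = jobs.find { |j| j.status == :unstarted }
    job.status = :started if job
    job
  end
end

jobs = Array.new(20) { |i| Job.new(i, :unstarted) }
lock = Mutex.new

# Eight workers pull jobs until none are left; each returns the ids it claimed.
claimed = Array.new(8) do
  Thread.new do
    mine = []
    while (job = claim_next(jobs, lock))
      mine << job.id
    end
    mine
  end
end.flat_map(&:value)
```

With the lock in place, every job id should show up in `claimed` exactly once; without the `synchronize` block, two threads could both see the same job as unstarted.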

Given this, how do we deal with locking? A hunt through the ActiveRecord documentation gives no mention of it.
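For what it's worth, ActiveRecord will pass raw SQL through its own connection, so the same table lock could in principle be taken without leaving ActiveRecord. The sketch below is an untested alternative, not the method used in this answer: `claim_next_job` is a hypothetical helper that takes the model class (for example the question's `Jobs`) as an argument, and `started!` comes from the question's enum.

```ruby
# Hypothetical helper: `model` is an ActiveRecord class such as the question's Jobs.
def claim_next_job(model)
  model.transaction do
    # The table lock is held until the surrounding transaction commits or rolls back.
    model.connection.execute(
      "LOCK TABLE #{model.quoted_table_name} IN ACCESS EXCLUSIVE MODE"
    )
    job = model.where(status: 0).order(:id).first
    job&.started!  # enum helper from the question's model; skipped when no job is left
    job
  end
end
```

It returns the claimed record, or nil when there is nothing left to claim.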

Thankfully, ActiveRecord is not the only way to talk to PostgreSQL: there is also the ruby-pg gem. After some experimenting with SQL and a test of the locking, I arrived at the following method:

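require 'pg'

def converter
  result_hash = {}
  conn = PG::Connection.open(:dbname => 'my_db')
  # Open a transaction and take the table lock; other sessions wait here until COMMIT.
  conn.exec("BEGIN WORK;
      LOCK TABLE jobs IN ACCESS EXCLUSIVE MODE;")
  # Claim the lowest-id unstarted job (status 0) and mark it started (status 1).
  conn.exec("UPDATE jobs SET status = 1 WHERE id =
      (SELECT id FROM jobs WHERE status = 0 ORDER BY id LIMIT 1)
      RETURNING *;") do |result|
          result.each { |row| result_hash = row }
      end
  # Committing releases the table lock.
  conn.exec("COMMIT WORK;")
  result_hash.transform_keys!(&:to_sym)
ensure
  # Close the connection so repeated calls don't leak connections.
  conn&.close
end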

This will result in:

  • An output of an empty hash if there are no jobs with a status of 0

  • An output of a symbolized hash if one is found and updated

  • Blocking while the table is locked by another session, then returning one of the above once it is unlocked.

The table will remain locked until the COMMIT WORK statement.
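One thing to watch: if a statement between BEGIN and COMMIT raises, the transaction is left open and the lock stays held until the connection goes away. A minimal sketch of a wrapper that always ends the transaction follows. `with_table_lock` is a hypothetical helper, and `conn` is assumed to be anything that responds to `exec` the way `PG::Connection` does.

```ruby
# Hypothetical helper; conn only needs an #exec(sql) method, as PG::Connection has.
# `table` is interpolated as-is, so it must be a trusted identifier.
def with_table_lock(conn, table)
  conn.exec("BEGIN")
  conn.exec("LOCK TABLE #{table} IN ACCESS EXCLUSIVE MODE")
  result = yield conn
  conn.exec("COMMIT")    # releases the lock
  result
rescue StandardError
  conn.exec("ROLLBACK")  # also releases the lock, undoing any partial work
  raise
end
```

ruby-pg also provides `PG::Connection#transaction`, which wraps a block in BEGIN/COMMIT and rolls back if the block raises; the LOCK TABLE statement could go at the top of that block instead.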

As an aside, I wish there was a cleaner way to convert the result to a hash. If anyone has any suggestions, please let me know in the comments! :)
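One possible tidy-up, offered as a sketch: `PG::Result` is Enumerable and yields each row as a string-keyed hash, so taking the first row and falling back to an empty hash covers both the "found" and "nothing found" cases. `first_row_as_symbols` is a hypothetical helper name; it accepts anything Enumerable, so plain arrays stand in for a live result here.

```ruby
# Hypothetical helper: rows is a PG::Result or any Enumerable of string-keyed hashes.
def first_row_as_symbols(rows)
  (rows.first || {}).transform_keys(&:to_sym)
end

found   = first_row_as_symbols([{ "id" => "7", "status" => "1" }])  # => { id: "7", status: "1" }
nothing = first_row_as_symbols([])                                  # => {}
```

In the method above, it could be applied to the value returned by the `conn.exec` call that runs the UPDATE, replacing the block and the `result_hash` variable.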

answered Dec 03 '25 23:12 by Joe


