I face a potential race condition in a web application:
# get the submissions so far from the cache
submissions = cache.get('user_data')
# add the data from this user to the local dict
submissions[user_id] = submission
# update the cached dict on server
submissions = cache.update('user_data', submissions)
if len(submissions) == some_number:
    ...
The logic is simple: we first fetch a shared dictionary stored in the web server's cache, add the submission (delivered by each request to the server) to a local copy, and then update the cached copy by replacing it with the updated local copy. Finally, we do something else once we have received a certain number of submissions. Notice that
submissions = cache.update('user_data', submissions)
will return the latest copy of the dictionary from the cache, i.e. the newly updated one.
The server may serve multiple requests at the same time, each in its own thread, and all of these threads access the shared dictionary in the cache as described above, which creates potential race conditions.
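The lost update can be reproduced without any threads by interleaving the steps by hand. This is only a sketch: the `cache` dict and the `fetch`/`store` helpers are hypothetical stand-ins for the real cache API, and the user names are made up.

```python
import copy

# Hypothetical stand-in for the server-side cache.
cache = {'user_data': {}}

def fetch():
    # Each request gets its own local copy of the cached dict.
    return copy.deepcopy(cache['user_data'])

def store(local_copy):
    # Replace the cached dict wholesale and return the latest copy.
    cache['user_data'] = local_copy
    return copy.deepcopy(cache['user_data'])

# Two requests read the cache before either one writes back.
request_a = fetch()
request_b = fetch()
request_a['alice'] = 'submission A'
request_b['bob'] = 'submission B'

store(request_a)
result = store(request_b)  # overwrites the copy that contained 'alice'
print(result)
```

The second write replaces the whole dictionary, so the first request's entry disappears even though both requests "succeeded".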
In the context of web programming, how should I handle threading efficiently to prevent race conditions in this particular case without sacrificing too much performance? Some code examples would be much appreciated.
My preferred solution would be to have a single thread that modifies the submissions dict and a queue that feeds that thread. If you are paranoid, you can even expose a read-only view on the submissions dict. With a queue and consumer pattern, only one thread ever writes to the dict, so you will not have a problem with locking.
Of course, this assumes that you have a web framework that will let you create that thread.
EDIT: multiprocess was not a good suggestion; removed.
EDIT: This sort of stuff is really simple in Python:
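# Python 3; in Python 2 the module is named Queue.
import queue
import threading

Stop = object()  # sentinel that tells the consumer to exit

def consumer(real_dict, work_queue):
    # The only thread that ever writes to real_dict.
    while True:
        try:
            item = work_queue.get(timeout=100)
        except queue.Empty:
            continue  # nothing arrived yet; keep waiting
        if item is Stop:
            break
        user, submission = item
        real_dict[user] = submission

q = queue.Queue()
thedict = {}
t = threading.Thread(target=consumer, args=(thedict, q))
t.start()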
Then, you can try:
>>> thedict
{}
>>> q.put(('foo', 'bar'))
>>> thedict
{'foo': 'bar'}
>>> q.put(Stop)
>>> q.put(('baz', 'bar'))
>>> thedict
{'foo': 'bar'}
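The read-only view and the "do something once enough submissions have arrived" step from the question can be folded into the same pattern. What follows is a minimal sketch under assumptions, not a drop-in implementation: `THRESHOLD` stands in for `some_number`, `on_full` is a hypothetical callback, and the view uses the standard library's `types.MappingProxyType`.

```python
import queue
import threading
from types import MappingProxyType

STOP = object()   # sentinel, same role as Stop in the code above
THRESHOLD = 3     # hypothetical stand-in for some_number

def consumer(real_dict, work_queue, on_full):
    """Sole writer: applies queued submissions one at a time."""
    while True:
        item = work_queue.get()  # blocks until an item arrives
        if item is STOP:
            break
        user, submission = item
        real_dict[user] = submission
        if len(real_dict) == THRESHOLD:
            # Hand over a snapshot so the callback never sees later writes.
            on_full(dict(real_dict))

_submissions = {}
# Live view of the dict; any attempt to assign through it raises TypeError.
submissions_view = MappingProxyType(_submissions)

full_snapshots = []  # the callback here just records each snapshot
work = queue.Queue()
worker = threading.Thread(
    target=consumer, args=(_submissions, work, full_snapshots.append))
worker.start()

# Request handlers would only ever call work.put(...).
for user in ('a', 'b', 'c'):
    work.put((user, 'data-' + user))
work.put(STOP)
worker.join()
```

Because the threshold check runs inside the consumer, it always sees a consistent dict. Readers that iterate over `submissions_view` while the consumer is still writing can hit a `RuntimeError`, so handing out snapshots as above is the safer choice for anything beyond single-key lookups.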