I have a SQL Server table full of orders that my program needs to "follow up" on (call a web service to see whether something has been done with them). My application is multi-threaded, and could have instances running on multiple servers. Currently, every so often (on a Threading timer), the process selects 100 rows at random (ORDER BY NEWID()) from the list of "unconfirmed" orders and checks them, marking off any that come back successfully.
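For concreteness, the current polling query looks roughly like this (table and column names are my guesses, not from the actual schema):

```sql
-- Illustrative sketch of the current approach; dbo.Orders, OrderNumber and
-- Confirmed are assumed names. The ORDER BY NEWID() forces a full scan of the
-- unconfirmed set plus a sort on every poll, from every thread.
SELECT TOP (100) OrderNumber
FROM dbo.Orders
WHERE Confirmed = 0
ORDER BY NEWID();
```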
The problem is that there's a lot of overlap between the threads and between the different processes, and there's no guarantee that a new order will get checked any time soon. Also, some orders will never be "confirmed" and are effectively dead, which means they get in the way of orders that still need to be confirmed, slowing the process down if I keep selecting them over and over.
What I'd prefer is that all outstanding orders get checked systematically. I can think of two easy ways to do this:
Are there any other ideas I'm missing? Does this even make sense? Let me know if I need some clarification.
RESULT:
What I ended up doing was adding a LastCheckedForConfirmation column to my orders table, along with a stored procedure that updates a single unconfirmed row with GETDATE() and returns the order number so my process can check on it. The process spins up as many of these checks as it can (given the number of threads it is willing to run), using the stored procedure to get a new OrderNumber for each thread.
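A minimal sketch of such a dequeue procedure, assuming hypothetical table and column names (only LastCheckedForConfirmation is from the description above):

```sql
-- Sketch, not the author's actual procedure. Claims one unconfirmed row
-- atomically and hands its order number back to the caller.
CREATE PROCEDURE dbo.DequeueOrderForConfirmation
AS
BEGIN
    SET NOCOUNT ON;
    -- READPAST lets concurrent threads/servers skip rows another caller
    -- has already locked, so no two workers get the same order.
    UPDATE TOP (1) o
    SET LastCheckedForConfirmation = GETDATE()
    OUTPUT inserted.OrderNumber
    FROM dbo.Orders AS o WITH (ROWLOCK, READPAST)
    WHERE o.Confirmed = 0;
END
```

The UPDATE ... OUTPUT combination is what makes "mark the row and fetch its key" a single atomic step, which is the property that removes the overlap between workers.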
To handle the "don't retry rows too many times, or when they're too old" problem, I did this: the SP will only return a row if the time since the last try is greater than the time between creation and the last try, so each retry takes twice as long to come around as the one before. First it waits 5 seconds, then 10, then 20, 40, 80, 120, and after it has been tried 15 times (6 hours) it gives up on that order and the SP will never return it again.
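That eligibility rule could be expressed as a WHERE clause along these lines (CreatedDate and RetryCount are hypothetical columns I'm assuming for illustration; only LastCheckedForConfirmation is named above):

```sql
-- Illustrative doubling-backoff filter: a row is eligible when the time since
-- the last attempt exceeds the time between creation and that attempt.
SELECT o.OrderNumber
FROM dbo.Orders AS o
WHERE o.Confirmed = 0
  AND DATEDIFF(second, o.LastCheckedForConfirmation, GETDATE())
      > DATEDIFF(second, o.CreatedDate, o.LastCheckedForConfirmation)
  AND o.RetryCount < 15;  -- assumed mechanism for the "give up" cutoff
```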
Thanks for the help, everybody - I knew the way I was doing it was less than ideal, and I appreciate your pointers in the right direction.
I recommend reading and internalizing Using tables as Queues.
If you use the data as a queue, you must organize it properly for queuing operations. The article I linked goes into detail about how to do this; what you have is a variant of a Pending Queue.
One thing you must absolutely get rid of is the randomness. If there is one thing that is hard to do cheaply in a query, it is randomness. ORDER BY NEWID() will scan every row, generate a GUID for each, sort the whole set, and only then give you back the top 100. You cannot, under any circumstances, have every worker thread scan the entire table on every poll; you will kill the server as the number of unprocessed entries grows.
Instead, use a pending processing date. Have the queue organized (clustered) by the processing date column (when the item is due for retry) and dequeue using the techniques I show in my linked article. If you want to retry, the dequeue should postpone the item instead of deleting it, i.e. WITH (...) UPDATE SET due_date = dateadd(day, 1, getutcdate()) ...
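A dequeue-with-postpone in that style might look like the following sketch (table and column names other than due_date are assumptions, and the one-day postponement mirrors the fragment above rather than a recommended interval):

```sql
-- Sketch of a pending-queue dequeue: take the earliest due item, postpone it
-- instead of deleting it, and return its key. Assumes a clustered index on
-- due_date so the TOP (1) ... ORDER BY is a cheap seek, not a scan.
WITH cte AS (
    SELECT TOP (1) *
    FROM dbo.PendingOrders WITH (ROWLOCK, UPDLOCK, READPAST)
    WHERE due_date <= GETUTCDATE()
    ORDER BY due_date
)
UPDATE cte
SET due_date = DATEADD(day, 1, GETUTCDATE())
OUTPUT inserted.OrderNumber;
```

UPDLOCK plus READPAST is the usual pairing here: UPDLOCK keeps two readers from claiming the same row, and READPAST makes other workers skip it instead of blocking.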