I have a design headache here. I'm using PHP and MySQL in conjunction with Java (my project is an Android application), and I have to decide how to run a series of server-side calculations at regular intervals. There is a wealth of material here on SO addressing how to create cron jobs and so on, and that's great; I may very well end up there, but I'm not sure how to tackle this part of my project in a broader sense.
The application is completely centred upon the geographic locations of users. They're always organised in clusters of anywhere between 4 and 40, and these clusters form one instance record in my database. These instances can become active or inactive at any time.
For each record in my database (or instance, as I prefer to call it), at each epoch, I want to recompute the centroid of the instance from its user locations. That's easy enough, particularly using a scalar approach given their close proximity. This effectively shifts the location of the instance itself by updating its latitude and longitude values in the database. Users will subsequently receive these new instance centroid coordinates at regular intervals when they call home.
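As a minimal sketch of the centroid step, assuming the "scalar approach" means a plain arithmetic mean of latitudes and longitudes: the class and method names below (`InstanceCentroid`, `LatLng`, `centroid`) are hypothetical, and Java is used only because the project already includes it. The same few lines translate directly to PHP.

```java
import java.util.List;

final class InstanceCentroid {

    /** Latitude/longitude pair in decimal degrees (hypothetical helper type). */
    static final class LatLng {
        final double lat;
        final double lng;

        LatLng(double lat, double lng) {
            this.lat = lat;
            this.lng = lng;
        }
    }

    /**
     * Arithmetic mean of the user coordinates.
     * Assumption: the users are close together, away from the poles,
     * and do not straddle the 180-degree meridian; otherwise a plain
     * average of degrees gives a misleading result.
     */
    static LatLng centroid(List<LatLng> users) {
        if (users.isEmpty()) {
            throw new IllegalArgumentException("instance has no users");
        }
        double latSum = 0.0;
        double lngSum = 0.0;
        for (LatLng u : users) {
            latSum += u.lat;
            lngSum += u.lng;
        }
        return new LatLng(latSum / users.size(), lngSum / users.size());
    }
}
```

Since the result is just two averages, MySQL's `AVG()` aggregate could probably compute the same values inside the database, which would avoid fetching every user row first.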
This is where it gets messy due to my rank inexperience. I started out by writing a relatively simple calculation involving one SQL select query and one subsequent SQL update operation, for each instance, at each epoch. If we assume an update interval of around 20-30 seconds for now, that is shorter than the one-minute minimum interval that cron jobs apparently allow. (It should be noted that the time difference between epochs can be hardcoded, if absolutely necessary.)
In the short term, this process might only take a negligible amount of time to execute, due to the fact there would be very few instances/clusters. However, it would potentially stack up to a lot of SQL queries and a lot of time to process all of the calculations at some point later if the number of instances ran into the thousands... In order to reduce unnecessary load, I naturally want to incorporate some mechanism to exclude inactive instances, though I guess it is still conceivable that the required calculation time could exceed the epoch interval. I guess that's an issue for (much) later.
As it stands now, the question is two-fold:
My current approach is as follows:
Is the above approach sound? At this point, I plan to do it this way unless there's a better suggestion. I really don't have a solid handle on how I'm going to schedule the task to execute at each epoch (Point #4), however... I've looked all over the place and I can't solve this myself without some guidance, I'm just not very good yet. :) As always, any suggestions would be greatly appreciated.
You might consider moving from a scheduled task to an update-as-needed approach. This is fairly easy to accomplish, but there are tradeoffs.
1. Add a datetime field called Last Updated.
2. Every time you query the object, check the Last Updated field for "freshness" (in your case, whether it was updated more than 30 seconds ago).
3. If it's fresh, send the data to the user.
4. If it isn't fresh, recalculate the data and save it to the database (making sure to change the Last Updated field). Then, send the new data to the user.
This will eliminate the need for a scheduled task & get rid of the waste of updating every row. However, it can slow down responses to the user.
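The freshness check described above can be sketched as follows. This is an in-memory illustration only: `LazyCentroid`, `MAX_AGE`, and the `recompute` supplier are hypothetical names, and in the real application the Last Updated value would live in a database column and `recompute` would stand in for the select-and-average query. The current time is passed in so the behaviour is easy to test.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

final class LazyCentroid {

    /** Data older than this is considered stale (30 s, per the question). */
    static final Duration MAX_AGE = Duration.ofSeconds(30);

    private double lat;
    private double lng;
    private Instant lastUpdated; // null until the first computation

    // Hypothetical stand-in for "select the user locations and average them";
    // must return {lat, lng}.
    private final Supplier<double[]> recompute;

    LazyCentroid(Supplier<double[]> recompute) {
        this.recompute = recompute;
    }

    /** Returns {lat, lng}, recomputing only when the stored value is stale. */
    synchronized double[] get(Instant now) {
        boolean stale = lastUpdated == null
                || Duration.between(lastUpdated, now).compareTo(MAX_AGE) > 0;
        if (stale) {
            double[] fresh = recompute.get();
            lat = fresh[0];
            lng = fresh[1];
            lastUpdated = now; // the "Last Updated" field
        }
        return new double[] { lat, lng };
    }
}
```

Note that the `synchronized` keyword only guards a single process. With several PHP requests hitting the same row at once, two of them could both find it stale and recompute; that is harmless here because they would write nearly the same values, but it is duplicated work.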