
PHP: Multithreaded PHP / Web Services?

Greetings All!

I am having trouble working out how to execute thousands upon thousands of requests to a web service (eBay). I have a limit of 5 million calls per day, so there are no problems on that end.

However, I'm trying to figure out how to process a batch of 1,000 to 10,000 requests every one to five minutes.

Basically the flow is:
1) Get list of items from database (1,000 to 10,000 items)
2) Make an API POST request for each item
3) Accept return data, process data, update database

Obviously a single PHP instance running this in a loop would be impossible.

I am aware that PHP is not a multithreaded language.

I tried the cURL solution, basically:
1) Get list of items from database
2) Initialize multi curl session
3) For each item add a curl session for the request
4) Execute the multi curl session

So you can imagine 1,000-10,000 GET requests occurring...

This was OK: around 100-200 requests were occurring in about a minute or two. However, only 100-200 of the 1,000 items actually processed. I am thinking that I'm hitting some sort of Apache or MySQL limit?

But this does add latency; it's almost like performing a DoS attack on myself.

I'm wondering how you would handle this problem? What if you had to make 10,000 web service requests and 10,000 MySQL updates from the return data from the web service? And this needs to be done within 5 minutes.

I am using PHP and MySQL with the Zend Framework.

Thanks!

asked Jun 02 '26 02:06 by cappuccino


2 Answers

I've had to do something similar, but with Facebook, updating 300,000+ profiles every hour. As suggested by grossvogel, you need to use many processes to speed things up, because the script spends most of its time waiting for a response. You can do this with forking, if your PHP install has support for forking, or you can just execute another PHP script via the command line.

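// Start the script in the background; `echo $!` prints its process ID.
// exec() fills $processId with the output lines, so the PID is in $processId[0].
exec('nohup /path/to/script.php >> /tmp/logfile 2>&1 & echo $!', $processId);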

You can pass parameters (read with getopt) to the PHP script on the command line to tell it which "batch" to process. You can have the master script do a sleep/check cycle to see if the scripts are still running by checking for the process IDs. I've tested up to 100 scripts running at once in this manner, at which point the CPU load can get quite high.
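The master side of that sleep/check cycle could look roughly like the sketch below. It is only an illustration under assumptions: the function names, the worker script path, and the `--batch` option are all hypothetical, the `php` binary is assumed to be on the PATH, and `isRunning()` needs the posix extension.

```php
<?php
// Hypothetical helpers: the names, the worker script and --batch are assumptions.

/** Build the shell command that starts one background worker and prints its PID. */
function buildWorkerCommand(string $script, int $batch, string $logFile): string
{
    return sprintf(
        'nohup php %s --batch=%d >> %s 2>&1 & echo $!',
        escapeshellarg($script),
        $batch,
        escapeshellarg($logFile)
    );
}

/** Start one worker in the background and return its process ID (0 on failure). */
function startWorker(string $script, int $batch, string $logFile): int
{
    $output = [];
    exec(buildWorkerCommand($script, $batch, $logFile), $output);
    return (int) ($output[0] ?? 0);
}

/** True while the process still exists. Signal 0 only checks, it kills nothing. */
function isRunning(int $pid): bool
{
    return $pid > 0 && posix_kill($pid, 0);
}

/** Sleep/check cycle: block until every worker in $pids has exited. */
function waitForWorkers(array $pids, int $pollSeconds = 1): void
{
    while ($pids = array_filter($pids, 'isRunning')) {
        sleep($pollSeconds);
    }
}
```

A master script would call `startWorker()` once per batch, collect the returned PIDs, and pass them to `waitForWorkers()`. Inside the worker, `getopt('', ['batch:'])` returns the batch number under the `batch` key.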

Combine multiple processes with multi-curl, and you should easily be able to do what you need.
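Inside each process, the multi-curl part could be capped so that only a limited number of requests are in flight at once, rather than adding thousands of handles in one go. The following is a minimal sketch of that idea, not the answerer's code: `fetchAll()` and the window size are hypothetical, and it issues plain GET requests for brevity.

```php
<?php
/**
 * Fetch URLs with at most $window transfers in flight at once.
 * Returns [url => response body, or null if the transfer failed].
 * fetchAll() and its defaults are hypothetical, for illustration only.
 */
function fetchAll(array $urls, int $window = 50): array
{
    $multi = curl_multi_init();
    $queue = array_values($urls);
    $results = [];
    $inFlight = 0;

    // Take the next URL off the queue and hand it to the multi handle.
    $addNext = function () use (&$queue, &$inFlight, $multi): void {
        $url = array_shift($queue);
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 30);
        curl_setopt($ch, CURLOPT_PRIVATE, $url); // remember which URL this handle is for
        curl_multi_add_handle($multi, $ch);
        $inFlight++;
    };

    // Fill the initial window.
    while ($queue && $inFlight < $window) {
        $addNext();
    }

    while ($inFlight > 0) {
        curl_multi_exec($multi, $running);

        // Collect finished transfers and start a queued one for each.
        $completed = 0;
        while ($info = curl_multi_info_read($multi)) {
            $ch = $info['handle'];
            $url = curl_getinfo($ch, CURLINFO_PRIVATE);
            $results[$url] = $info['result'] === CURLE_OK
                ? curl_multi_getcontent($ch)
                : null;
            curl_multi_remove_handle($multi, $ch);
            $inFlight--;
            $completed++;
            if ($queue) {
                $addNext();
            }
        }

        // Nothing finished in this pass: wait for network activity instead of spinning.
        if ($completed === 0 && $inFlight > 0) {
            curl_multi_select($multi, 1.0);
        }
    }

    return $results;
}
```

A window of 50 is an arbitrary starting point; the right value depends on the remote API and on how many processes run side by side.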

answered Jun 03 '26 15:06 by Brent Baisley


My two suggestions are (a) do some benchmarking to find out where your real bottlenecks are and (b) use batching and caching wherever possible.

MySQLi allows multiple-statement queries, so you could definitely batch those database updates.

The HTTP requests to the web service are more likely the culprit, though. Check the API you're using to see if you can get more info from a single call, maybe? To break up the work, maybe you want a single master script to shell out to a bunch of individual processes, each of which makes an API call and stores the results in a file or memcached. The master can periodically read the results and update the database. (Be careful to rotate the data store so that multiple processes can read and write it safely.)
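One way to batch the updates is sketched below. Note that it swaps in a different technique from the multi-statement query mentioned above: a single multi-row `INSERT ... ON DUPLICATE KEY UPDATE` sent as a prepared statement, which avoids hand-escaping values. The table `items`, its columns, and both function names are hypothetical.

```php
<?php
/**
 * Build one multi-row upsert for a chunk of rows.
 * The table `items` and columns `id`, `price`, `status` are hypothetical.
 * Returns [SQL with placeholders, flat list of parameters].
 */
function buildBatchUpsert(array $rows): array
{
    if ($rows === []) {
        throw new InvalidArgumentException('Need at least one row');
    }

    // One "(?, ?, ?)" group per row.
    $placeholders = implode(', ', array_fill(0, count($rows), '(?, ?, ?)'));
    $sql = "INSERT INTO items (id, price, status) VALUES $placeholders "
         . 'ON DUPLICATE KEY UPDATE price = VALUES(price), status = VALUES(status)';

    $params = [];
    foreach ($rows as $row) {
        array_push($params, $row['id'], $row['price'], $row['status']);
    }

    return [$sql, $params];
}

/** Write all rows in chunks, one round trip to MySQL per chunk. */
function saveBatch(mysqli $db, array $rows, int $chunkSize = 500): void
{
    foreach (array_chunk($rows, $chunkSize) as $chunk) {
        [$sql, $params] = buildBatchUpsert($chunk);
        $stmt = $db->prepare($sql);
        // Types per row: i = integer id, d = double price, s = string status.
        $stmt->bind_param(str_repeat('ids', count($chunk)), ...$params);
        $stmt->execute();
        $stmt->close();
    }
}
```

With a chunk size of 500, 10,000 updated items become 20 statements instead of 10,000.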
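The file-based variant of that hand-off might look something like this sketch. The function names and the JSON-lines format are assumptions. Rotation here means the master renames the file before reading it, so workers start a fresh file on their next write.

```php
<?php
/** Worker side: append one result as a JSON line while holding an exclusive lock. */
function appendResult(string $file, array $result): void
{
    file_put_contents($file, json_encode($result) . "\n", FILE_APPEND | LOCK_EX);
}

/**
 * Master side: rotate the results file, then read and delete the snapshot.
 * Returns the decoded results, or an empty array if there is nothing to read.
 */
function drainResults(string $file): array
{
    if (!file_exists($file)) {
        return [];
    }

    // After the rename, new worker writes go to a freshly created file.
    $snapshot = $file . '.' . uniqid('', true);
    if (!@rename($file, $snapshot)) {
        return [];
    }

    $handle = fopen($snapshot, 'r');
    // A shared lock waits for any worker that is still mid-write on the old file.
    flock($handle, LOCK_SH);

    $results = [];
    while (($line = fgets($handle)) !== false) {
        $line = trim($line);
        if ($line !== '') {
            $results[] = json_decode($line, true);
        }
    }

    flock($handle, LOCK_UN);
    fclose($handle);
    unlink($snapshot);

    return $results;
}
```

There is still a small window in which a worker has opened the old file but not yet locked it, so a result could be written after the master has read the snapshot. A store such as memcached or a database table avoids that problem.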

answered Jun 03 '26 15:06 by grossvogel


