I want to support around 10,000 simultaneous HTTP clients on a small cluster of machines (as small as possible). I'd like to keep a connection to each client alive while the user is using the application, to allow the server to push updates.
I believe that async IO is often recommended for these kinds of long-lived connections, to avoid having lots of threads sitting idle. But what are the issues in having threads sitting idle? I find the threaded model mentally easier to work with, but I don't want to do something that is going to cause me major headaches. I guess I'll have to experiment, but I wondered if anyone knows of any previous experiments along these lines?
Asynchronous I/O basically means that your application does most of the thread scheduling itself. Instead of letting the OS suspend your thread at arbitrary points and schedule another one, you run only as many threads as there are CPU cores and yield to other tasks at the most appropriate points: when a thread reaches an I/O operation that will take some time.
The above seems like a clear win from the performance standpoint, but the asynchronous programming model is much more complex in several regards:
On the other hand, many improvements and optimizations in modern operating systems have mostly eliminated the performance downsides of synchronous I/O programming:
A classic paper that covers much of the above, along with some other points, is a good complement to what I am saying here:
https://www.usenix.org/legacy/events/hotos03/tech/full_papers/vonbehren/vonbehren_html/index.html
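To make the scheduling idea from the start of this answer concrete, here is a minimal sketch using the JDK's java.nio selector API. The class name SingleThreadEchoServer and the echo behaviour are my own illustration, not something from the question; the point is only that one thread can hold many idle connections and wake up when one of them has work.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

/** Hypothetical sketch: a single thread multiplexes every client connection. */
class SingleThreadEchoServer implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel server;
    private volatile boolean running = true;

    public SingleThreadEchoServer(int port) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", port)); // port 0 picks a free port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    public int port() {
        return server.socket().getLocalPort();
    }

    public void stop() {
        running = false;
        selector.wakeup(); // unblock select() so the loop can see the flag
    }

    @Override
    public void run() {
        ByteBuffer buf = ByteBuffer.allocate(1024);
        try {
            while (running) {
                selector.select(); // the only place this thread blocks
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            // An idle client costs a registered key, not a thread.
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        echo((SocketChannel) key.channel(), buf);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            closeAll();
        }
    }

    private void echo(SocketChannel client, ByteBuffer buf) {
        try {
            buf.clear();
            int n = client.read(buf);
            if (n < 0) { // peer closed the connection
                client.close();
                return;
            }
            buf.flip();
            client.write(buf); // sketch only: partial writes are not retried
        } catch (IOException e) {
            try { client.close(); } catch (IOException ignored) { }
        }
    }

    private void closeAll() {
        try {
            for (SelectionKey key : selector.keys()) {
                key.channel().close();
            }
            selector.close();
        } catch (IOException ignored) { }
    }
}
```

To try it, construct the server with port 0, run it on a thread with new Thread(server).start(), and connect clients to server.port(). This also hints at the complexity mentioned above: per-connection state, partial writes and error handling all become the application's job instead of living on a thread's stack.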
There are already some good pointers in the comments on your question.
The reason for not using 10K threads is that they cost memory, and memory costs energy. The programming model is not an argument, because the thread sitting on the client connection need not be the same one that wants to post the event.
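For a rough feel for the memory side, here is a back-of-the-envelope calculation. It assumes 1 MiB of stack per thread, which I believe is a common HotSpot default on 64-bit Linux and can be changed with -Xss; the class and method names are made up for this sketch.

```java
import java.util.Locale;

/** Hypothetical estimate of stack address space reserved by idle threads. */
class ThreadStackEstimate {
    /** Bytes of stack reserved for the given number of threads. */
    public static long reservedStackBytes(int threads, long stackBytesPerThread) {
        return threads * stackBytesPerThread;
    }

    public static void main(String[] args) {
        long oneMiB = 1024L * 1024L; // assumed per-thread stack size, not measured
        long total = reservedStackBytes(10_000, oneMiB);
        double gib = total / (1024.0 * 1024.0 * 1024.0);
        System.out.println(String.format(Locale.ROOT, "%.2f GiB", gib)); // prints "9.77 GiB"
    }
}
```

Note that this is reserved address space, not necessarily resident memory: the OS typically commits stack pages only as they are touched, so the real footprint of mostly idle threads can be much smaller and is worth measuring on your own setup.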
Please take a look at the websockets standard and the asynchronous request processing model in the Servlet 3.0 standard. All recent Java web application servers implement it now (e.g. Glassfish and Tomcat), and it is the solution for your problem.
The question itself cannot be answered, since the OS, JVM and application server you use are not specified. However, you can test it quite quickly yourself by creating a servlet or JSP with Thread.sleep(9999999) and running siege -c 10000 ... against it.
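If you want a quick first check before setting up a servlet container, a plain-JDK variant of the same experiment is sketched below. It is not the servlet test itself, just an approximation: it starts a number of threads that sit idle, the way blocked request threads would. The class name IdleThreadExperiment and the default of 1,000 threads are my own choices; pass 10000 as the first argument to match the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

/** Hypothetical experiment: how does this JVM and OS cope with many idle threads? */
class IdleThreadExperiment {
    /** Starts `count` threads that block on `release` and returns once all are running. */
    public static List<Thread> startIdleThreads(int count, CountDownLatch release)
            throws InterruptedException {
        CountDownLatch started = new CountDownLatch(count);
        List<Thread> threads = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                started.countDown();
                try {
                    release.await(); // idle here, like a thread parked on a client connection
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.setDaemon(true);
            t.start(); // may throw OutOfMemoryError if the OS refuses more native threads
            threads.add(t);
        }
        started.await();
        return threads;
    }

    public static void main(String[] args) throws InterruptedException {
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 1_000;
        CountDownLatch release = new CountDownLatch(1);
        long begin = System.nanoTime();
        List<Thread> threads = startIdleThreads(count, release);
        long millis = (System.nanoTime() - begin) / 1_000_000;
        System.out.println(threads.size() + " idle threads started in " + millis + " ms");
        release.countDown(); // let them all finish
        for (Thread t : threads) {
            t.join();
        }
    }
}
```

While the threads are parked you can watch the process with your OS tools to see how much memory they actually hold. The numbers depend heavily on the OS, JVM and stack size settings, which is why measuring beats guessing here.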