
“No server available to handle the request” thrown by AWS Elasticsearch Service in Production environment

The application’s production environment started throwing the following error:

ElasticsearchStatusException[Unable to parse response body]; nested: ResponseException[method [POST], host [https://search-production-*.us-west-2.es.amazonaws.com:*, URI [/timerecord… [HTTP/1.1 503 Service Unavailable]. {
  "message": "No server available to handle the request",
}

Attached charts:
  • Graph of the JVM memory pressure over the last 3 months
  • Instance health over the last 3 months
  • Cluster Health Dashboard

No relevant code that interfaces with Elasticsearch has been pushed to production, and the amount of data running through Elasticsearch has not grown enough to explain this. Nevertheless, the increase in JVM memory pressure is clear. Where should I look to investigate this issue further?

I’ve been reading the AWS documentation but am still unsure whether I should scale up or scale out.

manu_dev asked Oct 20 '25 16:10


1 Answer

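Your problem looks like it is related to a growing terms index. In memory, Lucene "maps prefixes of terms with the offset on disk where the block that contains terms that have this prefix starts". That structure can keep growing as new distinct terms are indexed, even when the overall data volume looks flat.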
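One way to check this hunch is to compare heap usage with the memory Lucene reports for terms on each node. The sketch below summarizes a response from the node stats API (`GET _nodes/stats/jvm,indices`). The helper name and the sample values are made up for illustration, and on newer Elasticsearch versions `terms_memory_in_bytes` may be reported as 0 or be missing, so treat this as a starting point rather than a diagnosis.

```python
def summarize_heap_and_terms(node_stats):
    """Return one row per node: name, heap used %, and terms memory in MiB.

    `node_stats` is the parsed JSON body of GET _nodes/stats/jvm,indices.
    """
    rows = []
    for node in node_stats.get("nodes", {}).values():
        mem = node.get("jvm", {}).get("mem", {})
        segments = node.get("indices", {}).get("segments", {})
        # May be 0 or absent on newer versions; default to 0 in that case.
        terms_bytes = segments.get("terms_memory_in_bytes", 0)
        rows.append({
            "name": node.get("name", "unknown"),
            "heap_used_percent": mem.get("heap_used_percent"),
            "terms_memory_mib": round(terms_bytes / (1024 * 1024), 1),
        })
    # Nodes under the most heap pressure come first.
    return sorted(rows, key=lambda r: r["heap_used_percent"] or 0, reverse=True)


# Hypothetical sample response, trimmed to the fields used above.
sample_stats = {
    "nodes": {
        "abc123": {
            "name": "node-a",
            "jvm": {"mem": {"heap_used_percent": 62}},
            "indices": {"segments": {"terms_memory_in_bytes": 268435456}},
        },
        "def456": {
            "name": "node-b",
            "jvm": {"mem": {"heap_used_percent": 91}},
            "indices": {"segments": {"terms_memory_in_bytes": 536870912}},
        },
    }
}

for row in summarize_heap_and_terms(sample_stats):
    print(row)
```

If terms memory accounts for a large and growing share of the heap across snapshots taken a few days apart, that supports the terms-index explanation.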

Even though newer versions of Elasticsearch try to use less memory for this, it still needs close attention.

I'm willing to bet that the high CPU usage is just the JVM constantly running garbage collection to reclaim the exhausted heap space (memory). AWS Elasticsearch instances allocate half of their memory to heap space.
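To put rough numbers on that, here is a small sketch of the arithmetic. The half-of-RAM rule is the one stated above; the 32 GiB cap on heap size is my assumption about the managed service's defaults, and the function names are made up, so check both against the documentation for your instance type.

```python
def estimated_heap_gib(instance_ram_gib, heap_cap_gib=32.0):
    """Estimate JVM heap: half of instance RAM, capped (cap is an assumption)."""
    return min(instance_ram_gib / 2, heap_cap_gib)


def memory_pressure_percent(heap_used_gib, instance_ram_gib):
    """Heap in use as a percentage of the estimated heap size."""
    return 100.0 * heap_used_gib / estimated_heap_gib(instance_ram_gib)


# A 16 GiB instance gets about 8 GiB of heap; 6 GiB in use is 75% pressure.
print(estimated_heap_gib(16))
print(memory_pressure_percent(6, 16))
```

The same 6 GiB of heap usage on a 32 GiB instance would sit at 37.5%, which is why scaling up gives quick relief even before the root cause is fixed.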

Whether to scale up, out, or both depends a lot on your mapping. You'll find some quick relief by scaling up to an instance with more memory, but you'll have to take a deeper look at your mapping and queries to get the best long-term scalability.
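As a first pass over the mapping, it can help to count how many fields of each type it defines, since every indexed `text` or `keyword` field (including multi-fields) brings its own set of terms. This is a minimal sketch that walks the `properties` section of a mapping as returned by `GET <index>/_mapping`. The function name and the sample fields are hypothetical and not taken from your index.

```python
from collections import Counter


def count_field_types(properties):
    """Recursively count field types in a mapping's `properties` dict."""
    counts = Counter()
    for field in properties.values():
        # Object fields usually omit "type"; count them as "object".
        counts[field.get("type", "object")] += 1
        # Descend into sub-objects and multi-fields.
        counts.update(count_field_types(field.get("properties", {})))
        counts.update(count_field_types(field.get("fields", {})))
    return counts


# Hypothetical mapping fragment for illustration only.
sample_properties = {
    "employee": {
        "properties": {
            "name": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
            "id": {"type": "keyword"},
        }
    },
    "started_at": {"type": "date"},
    "notes": {"type": "text"},
}

print(count_field_types(sample_properties))
```

A large number of analyzed text fields, or multi-fields that nobody queries, is a common place to start trimming.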

It is entirely possible that the best solution is to scale both up and out. If you provide your current instance types and the number of nodes you are running, I might be able to edit this answer to give a more tailored recommendation about how to scale for the short term.

Elasticsearch is very picky. The hardware environment it likes to run on, although always memory-intensive, varies a lot based on your mapping and the types of queries you throw at it. It is likely that, after you get it stable, you'll have to tweak it to find the sweet spot for your performance, cost, and storage needs. Here is a good article about Elasticsearch Scalability and Resilience.

Justin Waulters answered Oct 23 '25 07:10