I have a Vespa.ai cluster with multiple container/content nodes. After Vespa is loaded with data, my app sends queries and gets the data back from Vespa. I want to be sure that I utilize all the nodes well and get the data as fast as possible. My app builds an HTTP request and sends it to one of the nodes.
Which node/nodes should I direct my request to?
How can I be sure that all instances participate in answering queries?
What should I do to utilize all the cluster nodes?
Does Vespa know how to load balance these requests across the other instances for better performance?
Vespa is a 2-tier system:

- Stateless container nodes, which receive queries, process them, and dispatch them to the content nodes.
- Content nodes, which hold the data and do the matching and ranking.
The containers will load balance over the content nodes (across groups, if you have multiple groups), but since you are sending the requests to the containers, you need to load balance over the containers yourself.
This can be done by code you write in your client, by a VIP, by another tier of nodes you host yourself such as Nginx, or by a hosted load balancer such as AWS ELB.
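For the client-code option, a minimal sketch of round-robin balancing over the container nodes could look like this (Python; the hostnames are placeholders, and port 8080 with the /search/ endpoint are the Vespa container defaults — adjust to your deployment):

    import itertools
    import requests

    # Query (container) endpoints of the cluster -- replace with your own nodes.
    CONTAINER_NODES = [
        "http://container-node-1:8080",
        "http://container-node-2:8080",
        "http://container-node-3:8080",
    ]
    _next_node = itertools.cycle(CONTAINER_NODES)

    def query_vespa(yql, hits=10):
        """Send the query to the next container node in round-robin order."""
        node = next(_next_node)
        response = requests.get(
            f"{node}/search/",
            params={"yql": yql, "hits": hits},
            timeout=5,
        )
        response.raise_for_status()
        return response.json()

    # Usage:
    # result = query_vespa("select * from sources * where true")

A dedicated load balancer (VIP, Nginx, ELB) additionally gives you health checking and lets you add or remove container nodes without changing the client.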
You can debug the distributed query execution by adding &presentation.timing=true&trace.timestamps&tracelevel=5
to the search request. You will then get a trace in the response showing how the query was dispatched and how long each content node spent matching it. See also Scaling Vespa: https://docs.vespa.ai/en/performance/sizing-search.html
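As a sketch, a traced request could be sent like this (same endpoint assumptions as above; the timing and trace sections appear in the JSON result when these parameters are set):

    import requests

    def traced_query(node, yql):
        """Run one query with tracing enabled and print timing and trace info."""
        params = {
            "yql": yql,
            "presentation.timing": "true",
            "trace.timestamps": "true",
            "tracelevel": "5",
        }
        response = requests.get(f"{node}/search/", params=params, timeout=10)
        response.raise_for_status()
        body = response.json()
        # 'timing' holds query/search/summary-fetch times,
        # 'trace' shows how the query was dispatched to the content nodes.
        print(body.get("timing"))
        print(body.get("trace"))
        return body

    # traced_query("http://container-node-1:8080", "select * from sources * where true")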