
Why is sorting in ArangoDB slow?

Tags:

arangodb

I am experimenting to see whether ArangoDB might be suitable for our use case. We will have large collections of documents with the same schema (like an SQL table).

To try some queries I have inserted about 90K documents, which is low, as we expect document counts on the order of 1 million or more.

Now I want to get a simple page of these documents, without filtering, but with descending sorting.

So my AQL is:

for a in test_collection
sort a.ARTICLE_INTERNALNR desc
limit 0,10
return {'nr': a.ARTICLE_INTERNALNR}

When I run this in the AQL Editor, it takes about 7 seconds, while I would expect a couple of milliseconds or something like that.

I have tried creating a hash index and a skiplist index on that attribute, but that didn't have any effect:

 db.test_collection.getIndexes()
[ 
  { 
    "id" : "test_collection/0", 
    "type" : "primary", 
    "unique" : true, 
    "fields" : [ 
      "_id" 
    ] 
  }, 
  { 
    "id" : "test_collection/19812564965", 
    "type" : "hash", 
    "unique" : true, 
    "fields" : [ 
      "ARTICLE_INTERNALNR" 
    ] 
  }, 
  { 
    "id" : "test_collection/19826720741", 
    "type" : "skiplist", 
    "unique" : false, 
    "fields" : [ 
      "ARTICLE_INTERNALNR" 
    ] 
  } 
]

So, am I missing something, or is ArangoDB not suitable for these cases?

Wouter asked Dec 04 '25 14:12


1 Answer

If ArangoDB needs to sort all the documents, this will be a relatively slow operation (compared to not sorting). So the goal is to avoid sorting altogether. ArangoDB has a skiplist index, which keeps indexed values in sorted order; if a query can use it, the query will be faster.
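To make the difference concrete, here is a minimal sketch in plain JavaScript. It is not ArangoDB code and says nothing about how the skiplist is implemented: ordinary arrays stand in for the collection and for an index that is kept in sorted order, and the names pageBySortingAll, pageFromSortedIndex and the nr field are made up for the illustration.

```javascript
// Hypothetical illustration only: plain arrays stand in for a collection
// and for an index that keeps its values in sorted order.

// Without a usable index: sort every document, then cut out the page.
function pageBySortingAll(docs, offset, count) {
  // The full sort is repeated for every query.
  const sorted = docs.slice().sort((a, b) => a.nr - b.nr);
  return sorted.slice(offset, offset + count);
}

// With a sorted index: the order already exists, so a page is just a slice.
function pageFromSortedIndex(sortedIndex, offset, count) {
  return sortedIndex.slice(offset, offset + count);
}

// Build 90,000 documents in a scrambled order (nr takes each value 1..90000 once).
const docs = [];
for (let i = 0; i < 90000; i++) {
  docs.push({ nr: (i * 7919) % 90000 + 1 });
}

// The index pays the sorting cost once, up front, instead of per query.
const sortedIndex = docs.slice().sort((a, b) => a.nr - b.nr);

const slowPage = pageBySortingAll(docs, 0, 10).map(d => d.nr);
const fastPage = pageFromSortedIndex(sortedIndex, 0, 10).map(d => d.nr);
console.log(slowPage, fastPage);
```

Both calls return the same ten documents; the second one only looks at the ten entries it returns, which is the effect you want from the index.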

There are a few gotchas at the moment:

  1. AQL queries without a FILTER condition won't use an index.
  2. The skiplist index is fine for forward-order traversals, but it has no backward-order traversal facility.

Both these issues seem to have affected you. We hope to fix both issues as soon as possible.

At the moment there is a workaround to force use of the index in forward order, using an AQL query as follows:

FOR a IN 
  SKIPLIST(test_collection, { ARTICLE_INTERNALNR: [ [ '>', 0 ] ] }, 0, 10) 
RETURN { nr: a.ARTICLE_INTERNALNR }

The above picks up the first 10 documents via the index on ARTICLE_INTERNALNR with the condition "value > 0". I am not sure if there is a solution for sorting backwards with a limit.

stj answered Dec 07 '25 22:12


