
LevelDB for 100s of millions of entries

Tags:

python

leveldb

What are the top factors to consider when tuning inserts for a LevelDB store?

I'm inserting 500M+ records in the form:

  1. key="rs1234576543" very predictable structure. rs<1+ digits>
  2. value="1,20000,A,C" string can be much longer but usually ~ 40 chars
  3. keys are unique
  4. key insert order is random

into a LevelDB store using the Python library plyvel, and I see a dramatic drop in speed as the number of records grows. I guess this is expected, but are there tuning measures I could look at to make it scale better?

Example code:

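import plyvel

BATCHSIZE = 1000000

db = plyvel.DB('/tmp/lvldbSNP151/', create_if_missing=True)
wb = db.write_batch()
# items not in any key order
for i, (key, value) in enumerate(DBSNPfile, start=1):
    wb.put(key, value)
    if i % BATCHSIZE == 0:
        wb.write()  # flush the full batch to the database
        wb = db.write_batch()  # start a fresh batch for the next items
wb.write()  # flush the remaining items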

I've tried various batch sizes, which helps a bit, but I'm hoping there's something else I've missed. For example, can knowing the max length of a key (or value) be leveraged?

pufferfish Avatar asked Oct 20 '25 09:10


1 Answer

(Plyvel author here.)

LevelDB keeps all database items in sorted order. Since you are writing in random order, every batch touches keys spread across the whole key range, so nearly all parts of the database get rewritten over and over as LevelDB merges its SST files (this compaction happens in the background). As the database grows and you keep adding items, each merge has more data to rewrite, which reduces write throughput.

I suspect that performance will not degrade as badly if you have better locality of your writes.
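One way to get that locality is to sort the records by key before inserting them. The sketch below is only an illustration and not something the answer prescribes: it assumes keys and values are bytes containing no tab or newline characters, and the helper names (`sorted_records`, `load_sorted`) and the chunk sizes are hypothetical. It spills sorted runs to temporary files, merges them with `heapq.merge`, and writes the merged stream in batches.

```python
import heapq
import tempfile
from itertools import islice


def _spill_sorted_run(chunk):
    """Sort one chunk by key and write it to a temporary file, one record per line."""
    chunk.sort(key=lambda kv: kv[0])
    run = tempfile.TemporaryFile(mode="w+b")
    for key, value in chunk:
        # Assumption: neither key nor value contains a tab or a newline.
        run.write(key + b"\t" + value + b"\n")
    run.seek(0)
    return run


def _read_run(run):
    """Yield (key, value) pairs back from a run file."""
    for line in run:
        key, value = line.rstrip(b"\n").split(b"\t", 1)
        yield key, value


def sorted_records(records, run_size=1_000_000):
    """Yield all (key, value) pairs in ascending key order (external merge sort)."""
    records = iter(records)
    runs = []
    try:
        while True:
            chunk = list(islice(records, run_size))
            if not chunk:
                break
            runs.append(_spill_sorted_run(chunk))
        # Bytes compare lexicographically, like LevelDB's default comparator.
        yield from heapq.merge(*(_read_run(r) for r in runs), key=lambda kv: kv[0])
    finally:
        for run in runs:
            run.close()


def load_sorted(db, records, batch_size=100_000, run_size=1_000_000):
    """Insert records into db in key order, one write batch at a time."""
    stream = sorted_records(records, run_size)
    total = 0
    while True:
        batch = list(islice(stream, batch_size))
        if not batch:
            break
        with db.write_batch() as wb:  # the batch is written when the block exits
            for key, value in batch:
                wb.put(key, value)
        total += len(batch)
    return total
```

With the setup from the question, this would be called as `load_sorted(db, DBSNPfile)` on the `plyvel.DB` instance. Every run keeps one temporary file open during the merge, so `run_size` should be large enough to stay below the open-file limit. Sorting the input file up front with an external tool would serve the same purpose.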

Other ideas that may be worth trying out are:

  • increase the write_buffer_size
  • increase the max_file_size
  • experiment with a larger block_size
  • use .write_batch(sync=False)

The above can all be used from Python by passing extra keyword arguments to plyvel.DB and to the .write_batch() method. See the API docs for details.
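As a rough sketch of how the options listed above could be passed, the snippet below collects them in a dictionary. The specific values are illustrative assumptions to experiment with, not recommendations from the answer, and the helper names `open_tuned` and `new_batch` are hypothetical. The `max_file_size` option may need a reasonably recent Plyvel and LevelDB.

```python
# Illustrative starting values only (assumptions; measure before settling on any).
TUNING = {
    "write_buffer_size": 256 * 1024 * 1024,  # LevelDB's default is 4 MiB
    "max_file_size": 64 * 1024 * 1024,       # LevelDB's default is 2 MiB
    "block_size": 64 * 1024,                 # LevelDB's default is 4 KiB
}


def open_tuned(path):
    """Open (or create) a database at path with the tuning options above."""
    import plyvel  # imported here so the option values can be inspected without it

    return plyvel.DB(path, create_if_missing=True, **TUNING)


def new_batch(db):
    """Create a write batch that does not force a disk sync on every write."""
    return db.write_batch(sync=False)
```

In the question's script, `open_tuned('/tmp/lvldbSNP151/')` would take the place of the plain `plyvel.DB(...)` call, and `new_batch(db)` the place of `db.write_batch()`.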

wouter bolsterlee Avatar answered Oct 21 '25 23:10
