Can anyone explain the difference between FSDirectory and MMapDirectory? I want to warm up my cache. I read that MMapDirectory could be useful for this, but I couldn't find out how it helps with warming up the cache. Please explain if you have any idea. Pointers are welcome too.
The Lucene documentation says that MMapDirectory uses virtual memory to speed up index lookups.
How is the speedup achieved, and what happens if my indexes are so large that they won't fit in virtual memory?
MMapDirectory is one of the concrete subclasses of the abstract FSDirectory class. It uses memory-mapped files to access the information in the index.
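To make "memory-mapped files" a bit more concrete, here is a minimal sketch using only the JDK's `java.nio` classes, not Lucene itself. The class and method names (`MmapSketch`, `readWithChannel`, `readWithMmap`, `writeTempFile`) are made up for illustration, and this is not how Lucene implements its directories internally. It only contrasts an explicit positional read with reading the same bytes through a mapping. The `load()` call is a best-effort hint to bring the mapped pages into physical memory, which is roughly what "warming up" means at the OS level.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical demo class; names are illustrative, not part of Lucene.
class MmapSketch {

    // Explicit read: bytes are copied from the file into a heap buffer.
    public static byte[] readWithChannel(Path file, long offset, int length) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(length);
            long pos = offset;
            while (buf.hasRemaining()) {
                int n = ch.read(buf, pos);
                if (n < 0) {
                    break; // reached end of file
                }
                pos += n;
            }
            return Arrays.copyOf(buf.array(), buf.position());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Mapped read: the file region becomes part of the process's virtual
    // address space, and the OS pages data in on access.
    public static byte[] readWithMmap(Path file, long offset, int length) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
            map.load(); // best-effort request to page the region into RAM
            byte[] out = new byte[length];
            map.get(out);
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper that writes some bytes to a temporary file for the demo.
    public static Path writeTempFile(byte[] data) {
        try {
            Path p = Files.createTempFile("mmap-sketch", ".bin");
            Files.write(p, data);
            p.toFile().deleteOnExit();
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path p = writeTempFile("hello lucene index".getBytes(StandardCharsets.US_ASCII));
        byte[] a = readWithChannel(p, 6, 6);
        byte[] b = readWithMmap(p, 6, 6);
        System.out.println(new String(a, StandardCharsets.US_ASCII));
        System.out.println(Arrays.equals(a, b));
    }
}
```

Both methods return the same bytes. The difference is in how they get there: the mapped version avoids an explicit copy into a heap buffer on each read and leaves caching entirely to the OS.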
The other options are SimpleFSDirectory and NIOFSDirectory, which use different methods. You should have a look at the documentation for FSDirectory for a brief explanation of all three. As you will see there, FSDirectory.open(File) tries to pick the best implementation for your environment.
In my own experience, I haven't noticed any significant difference in performance between NIOFSDirectory and MMapDirectory, but you should do some performance testing using your own data and hardware setup.
If you end up using MMapDirectory, virtual memory and index size should only be a problem on a 32-bit machine, unless your indexes are larger than the virtual address space of a 64-bit one (typically 2^48 bytes = 256 TiB on current x86-64 hardware, part of which is reserved for the OS).
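As a quick sanity check on the address-space figures, here is a small arithmetic sketch. It assumes byte-addressed memory and 48-bit virtual addresses, which is common on x86-64 but not universal. The class and method names are made up for illustration. Note that 2^48 bytes is 256 TiB, while 2^48 bits would be 32 TiB.

```java
// Hypothetical helper for address-space arithmetic; not part of Lucene.
class AddressSpaceSketch {

    // Number of addressable bytes for a given address width (bits < 63).
    public static long bytesForAddressBits(int bits) {
        return 1L << bits;
    }

    // Convert a byte count to whole tebibytes (1 TiB = 2^40 bytes).
    public static long toTiB(long bytes) {
        return bytes >> 40;
    }

    // Convert a byte count to whole gibibytes (1 GiB = 2^30 bytes).
    public static long toGiB(long bytes) {
        return bytes >> 30;
    }

    public static void main(String[] args) {
        // 32-bit addresses: 4 GiB in total, which a large index can exceed.
        System.out.println("32-bit: " + toGiB(bytesForAddressBits(32)) + " GiB");
        // 48-bit addresses: 256 TiB in total.
        System.out.println("48-bit: " + toTiB(bytesForAddressBits(48)) + " TiB");
        // 2^48 bits, expressed in bytes, is 2^45 bytes = 32 TiB.
        System.out.println("2^48 bits: " + toTiB(bytesForAddressBits(48) / 8) + " TiB");
    }
}
```

In practice the usable amount is smaller than the raw total, since the OS reserves part of the address space and a 32-bit JVM needs room for the heap and other mappings as well.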
If your indexes won't fit in virtual memory, you are likely to be better off using one of the other FSDirectory implementations. The problem is that using MMapDirectory when the index won't fit is equivalent to using one of those implementations and relying on the OS's caching algorithm (which is likely to be better than anything you can hand-code). ('Equivalent' because in both cases only parts of the index will be in physical memory at once.)
But as 'martin' said above, you need to do some performance testing of your own.