To build an index on MongoDB objectIds, I use db.collection.ensureIndex({ _id: "hashed"})
.
This generates a Number type that the collection is indexed on. What is the algorithm that goes into this? Is it MD5? And how would I call it on ObjectID objects?
For example, I would like to do something like pull the top 10 _id hashes from a collection, and use that for processing.
Looking at the file hasher.cpp in the MongoDB sourcecode, the used hash function is indeed MD5.
However, MongoDB only uses these hashes internally. It does not expose them to the public API. A hashed index only allows you to query for exact values. It does so by taking your input, hashing it as it hashed the indexed field in the database, and searching for a collision in the hashtable which represents the index.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With