I have a MongoDB collection of people.
For each person document, I want to iterate over the rest of the documents in the collection and find a "match" for this person based on certain criteria. The problem is, if I iterate over the documents in the same standard order, the people later in the collection will rarely be assigned a match.
So I would like to randomly iterate over the collection. Is there a way to do this?
With MongoDB v4.4.2+, you can use $rand to generate a random sort key, so you can sort (and therefore iterate over) the collection in random order. You can put it in a $lookup sub-pipeline that finds candidate matches for a person, then use the random sort key to pick one of those candidates at random.
db.people.aggregate([
  {
    "$match": {
      "name": "alice"
    }
  },
  {
    "$lookup": {
      "from": "people",
      "let": {
        i: "$interests",
        id: "$_id"
      },
      "pipeline": [
        // attach a random sort key to every candidate document
        {
          "$addFields": {
            "randSortKey": {
              "$rand": {}
            }
          }
        },
        {
          "$match": {
            $expr: {
              $and: [
                // exclude the person himself/herself
                {
                  $ne: [
                    "$_id",
                    "$$id"
                  ]
                },
                // keep only people with at least one shared interest
                {
                  $ne: [
                    [],
                    {
                      "$setIntersection": [
                        "$interests",
                        "$$i"
                      ]
                    }
                  ]
                }
              ]
            }
          }
        },
        // order the candidates randomly and keep the first one
        {
          "$sort": {
            randSortKey: 1
          }
        },
        {
          $limit: 1
        }
      ],
      "as": "randomlyMatchedPeople"
    }
  }
])
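As a variation on the same idea, the $sample stage can stand in for the $addFields / $sort / $limit trio. This is a sketch rather than a drop-in replacement: it swaps in a different technique ($sample instead of $rand), and buildSampleMatchPipeline is a hypothetical helper name introduced only for illustration. It assumes the same people collection and interests field as the pipeline above.

```javascript
// Hypothetical helper: builds a pipeline that picks one random person
// sharing at least one interest with the person called `name`.
function buildSampleMatchPipeline(name) {
  return [
    { $match: { name: name } },
    {
      $lookup: {
        from: "people",
        let: { i: "$interests", id: "$_id" },
        pipeline: [
          {
            $match: {
              $expr: {
                $and: [
                  // exclude the person himself/herself
                  { $ne: ["$_id", "$$id"] },
                  // keep only people with at least one shared interest
                  { $ne: [[], { $setIntersection: ["$interests", "$$i"] }] }
                ]
              }
            }
          },
          // pick one of the remaining candidates at random
          { $sample: { size: 1 } }
        ],
        as: "randomlyMatchedPeople"
      }
    }
  ];
}

// Usage in mongosh (requires a running server):
// db.people.aggregate(buildSampleMatchPipeline("alice"))
```

Filtering first means the random pick only has to consider the candidates that survive the $match, not every document in the collection.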
The $rand pipeline above can be slow, because it generates a random key for every document in the collection and then sorts on it. One workaround is to precompute and store (materialize) the random key on each document, then index it.
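// store a random key on every document (pipeline-style update)
db.people.updateMany({},
  [{
    "$addFields": {
      "randSortKey": {
        "$rand": {}
      }
    }
  }]);
// index the key so sorting on it does not need an in-memory sort
db.people.createIndex( { randSortKey: 1 } );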
The downside of this approach is that the sort order is fixed in advance, which makes the results less random. You can periodically regenerate randSortKey if needed.
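One way to soften the fixed ordering between regenerations is to start each lookup at a random point along the stored key and wrap around if nothing is found. The following is a minimal sketch under assumptions: buildRandomStartQueries is a hypothetical helper, and it assumes each person document has an interests array and a stored randSortKey in [0, 1). The pick is not perfectly uniform, since a document that follows a large gap in key values is chosen more often.

```javascript
// Hypothetical helper: builds the filters for a "random start point" lookup
// over the precomputed randSortKey field.
function buildRandomStartQueries(person, r = Math.random()) {
  const candidates = {
    // exclude the person himself/herself
    _id: { $ne: person._id },
    // at least one shared interest
    interests: { $in: person.interests }
  };
  return {
    // first try: candidates at or after the random start point
    forward: { ...candidates, randSortKey: { $gte: r } },
    // fallback: wrap around to the candidates before the start point
    wrapAround: { ...candidates, randSortKey: { $lt: r } },
    sort: { randSortKey: 1 }
  };
}

// Usage in mongosh (requires a running server):
// const alice = db.people.findOne({ name: "alice" });
// const q = buildRandomStartQueries(alice);
// const match =
//   db.people.find(q.forward).sort(q.sort).limit(1).toArray()[0] ||
//   db.people.find(q.wrapAround).sort(q.sort).limit(1).toArray()[0];
```

Each call draws a new start point, so repeated lookups for the same person can land on different matches even though the stored keys have not changed.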