Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MongoDB $group operation - memory usage optimisation

I need to perform a $group operation over my entire collection. This group stage is reaching the limit of 100MB RAM usage.

The $group stage has a limit of 100 megabytes of RAM. By default, if the stage exceeds this limit, $group will produce an error. However, to allow for the handling of large datasets, set the allowDiskUse option to true to enable $group operations to write to temporary files.

I'm not limited by the RAM but I couldn't find how to increase this memory usage limit. Does anyone know how to config this restriction?

Setting up allowDiskUse to true will solve the solution but I assume the whole operation will be much slower and I'd like to find a better solution.


{
    $group: {
        _id: {
            producer: "$producer",
            dataset:"$dataset",
            featureOfInterest:"$_id.featureOfInterest",
            observedProperty:"$_id.observedProperty"
        },
        documentId: {$push:"$documentId"}
    }
}

This $group operation is performed over entire complexe objects (producer and dataset). I understand that this operation is expensive since "It requires to scan the entire result set before yielding, and MongoDB will have to at least store a pointer or an index of each element in the groups." I'd rather $group on uniqueId fields for both of these object.

How could I $group object using unique ID and $project the whole object afterwards? I'd like to obtain the same result as the group operation above using the group operation below at the beginning of my aggregation pipeline :

{
    $group: {
        _id: {
            producer: "$producer.producerId",
            dataset:"$dataset.datasetId",
            featureOfInterest:"$_id.featureOfInterest",
            observedProperty:"$_id.observedProperty"
        },
        documentId: {$push:"$documentId"}
    }
}
like image 938
charlycou Avatar asked Oct 15 '25 19:10

charlycou


1 Answers

allowDiskUse

There is no option in MongoDB to increase memory usage more than 100mb in aggregations so in heavy pipeline so you have to set the flag to true.

However

You may be interested reading about MongoDB In-Memory Storage Engine

Example starting mongodb with in-memory storage engine in the command line

mongod --storageEngine inMemory --dbpath <path> --inMemorySizeGB <newSize>

More information in Mongodb docs

https://docs.mongodb.com/manual/core/inmemory/

Regarding the second question - I didn't get it. Please post example documents.

like image 133
right Avatar answered Oct 18 '25 12:10

right



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!