I need to generate a simple frequency table in MongoDB. Let's say I have the following documents in a collection called books.
{
"_id": 1,
genre: [
"Fantasy",
"Crime",
"Drama"
]
}
{
"_id": 2,
genre: [
"Drama"
]
}
The expected output is: Fantasy: 1 Crime: 1 Drama: 2
Assuming that I don't have a preformed notion of what genres are possible, how can I get MongoDB to detect the genres and count them? This seems like a job for the aggregation framework.
So far, I've gotten to this stage:
db.books.aggregate([
{
$group: {
_id: null,
test: {$push: "$genre"}
}
}
])
This does not work as intended, as it just adds the genre array from each document. Can anyone point me in the right direction? Should I be using the aggregation framework?
Thank you!
When dealing with arrays in the aggregation framework what you generally need to do is use $unwind. This processes the array and essentially produces a new document for each array member. Think of it as "de-normalizing" with a SQL JOIN.
Once the array is "unwound" then you can process your $group:
db.collection.aggregate([
{ "$unwind": "$genre" },
{ "$group": {
"_id": "$genre",
"count": { "$sum": 1 }
}}
])
This gives output like:
{ "_id" : "Drama", "count" : 2 }
{ "_id" : "Crime", "count" : 1 }
{ "_id" : "Fantasy", "count" : 1 }
The _id provided to $group is the "key" that you want to group on, in your case this is "genre" referred to as it's "value" by prefixing with the $ "$genre". The $sum simply adds up values of 1 for each occurrence, which is generally how you count.
There are several examples shown at the SQL to Aggregation mapping chart, which is not only a useful guide for those used to SQL but had general practical examples of usage as well.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With