I'm building a parent-reference tree with MongoDB and Mongoose. My schema looks like this:
var NodesSchema = new Schema({
    _id: {
        type: ShortId,
        len: 7
    },
    name: { // name of the file or folder
        type: String,
        required: true
    },
    isFile: { // is the node file or folder
        type: Boolean,
        required: true
    },
    location: { // location, null for root
        type: ShortId,
        default: null
    },
    data: { // optional if isFile is true
        type: String
    }
});
Note that files and folders can be renamed.
In my current setup, if I want to get the files in a specific folder, I perform the following query:
NodesModel.find({ location: 'LOCATION_ID' })
If I want to get a single file/folder I run:
NodesModel.findOne({ _id: 'ITEM_ID' })
The location field holds an ID such as f8mNslZ1, so if I want the name of the containing folder I need to run a second query.
Unfortunately, if I want the path to the root I need to do a recursive query, which might be slow if I have 300 nested folders.
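To make the cost concrete, the recursive lookup amounts to something like the sketch below. It is only an illustration: `findById` is a hypothetical stand-in for `NodesModel.findOne({ _id: id })`, backed by an in-memory Map so it runs on its own, and the IDs and names are invented. With Mongoose each call would be an awaited query.

```javascript
// Hypothetical in-memory stand-in for the nodes collection (invented IDs and names).
const nodes = new Map([
  ['root001', { _id: 'root001', name: 'home', isFile: false, location: null }],
  ['docs002', { _id: 'docs002', name: 'docs', isFile: false, location: 'root001' }],
  ['file003', { _id: 'file003', name: 'notes.txt', isFile: true, location: 'docs002' }],
]);

// Counts how many lookups (database round trips in a real setup) were made.
let lookups = 0;

// Stand-in for `NodesModel.findOne({ _id: id })`.
function findById(id) {
  lookups += 1;
  return nodes.get(id) || null;
}

// Walk the `location` references upward, one lookup per level of nesting.
function pathToRoot(id) {
  const names = [];
  let current = findById(id);
  while (current) {
    names.unshift(current.name);
    current = current.location ? findById(current.location) : null;
  }
  return '/' + names.join('/');
}
```

The number of lookups grows with the depth of the node, so a file 300 folders deep would need about 300 sequential queries.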
So I have been searching and came up with the following possible solution:
Should I change the location field from a plain ID to an object and store the information in it as follows:
location: {
    _id: 'LOCATION_ID',
    name: 'LOCATION_NAME',
    fullpath: '/FOLDERNAME1/FOLDERNAME2'
}
The problem with this solution is that files and folders can be renamed. On a rename I would have to update all children. Renames happen much more rarely than reads, but if the folder has 1000 items, I guess it would be a problem.
My questions are:
Looking at your Node schema, if you change the location property to an object, you'll have two places where a node's name is stored, so be mindful of updating both name properties. Usually you want to keep your database as DRY as possible, and in most cases doing nested queries is quite common. That being said, you know your database much better than I do, and if you see a significant performance penalty from doing more queries, then just be sure to update all name properties.
In addition, if your location's fullpath property is a string and you have to rename a folder, you'll have to break the whole string down and compare substrings in order to substitute the new folder name. This can get tedious.
A possible solution is to store the full path as an array instead of a string, ordered so that each entry is the next folder in the chain. That way you can quickly compare and update entries when need be.
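As a rough sketch of why the array form is easier to update, the example below stores each ancestor as an `{ _id, name }` entry and patches entries by `_id` on rename. The `path` field name, the `renameFolder` and `fullPath` helpers, and the sample documents are all assumptions made for illustration, and the data lives in memory rather than in MongoDB.

```javascript
// Hypothetical documents: `path` lists ancestor folders from the root down.
const docs = [
  { _id: 'docs002', name: 'docs', path: [{ _id: 'root001', name: 'home' }] },
  {
    _id: 'file003',
    name: 'notes.txt',
    path: [
      { _id: 'root001', name: 'home' },
      { _id: 'docs002', name: 'docs' },
    ],
  },
];

// Rename a folder and patch the matching entry in every descendant's path.
// Entries are matched by _id, so no substring parsing is needed.
function renameFolder(allDocs, folderId, newName) {
  let touched = 0;
  for (const doc of allDocs) {
    if (doc._id === folderId) doc.name = newName;
    for (const entry of doc.path) {
      if (entry._id === folderId) {
        entry.name = newName;
        touched += 1;
      }
    }
  }
  return touched; // number of descendant documents that were patched
}

// Join the stored path and the node's own name into a display string.
function fullPath(doc) {
  return '/' + doc.path.map((entry) => entry.name).concat(doc.name).join('/');
}
```

Against a real collection, the same per-entry patch could probably be expressed as a single `updateMany` using the filtered positional operator `$[<identifier>]` together with `arrayFilters` (available since MongoDB 3.6), rather than looping in application code.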
The different ways to model tree structures are extensively covered in the MongoDB docs.
The way you are proposing is one of them.
Depending on how frequent folder renaming is expected to happen (and/or any other hierarchy changes more complex than adding a new leaf node) you might consider storing the "path" as an "array of ancestors" instead. But whichever way you happen to denormalize or materialize the path up the tree in each folder, the trade-off is that for faster look-ups, you will have slower and/or more complicated updates.
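The read side of that trade-off might look like the following sketch. It assumes a hypothetical `ancestors` field holding `{ _id, name }` entries from the root down; `makeChild`, `breadcrumb`, and `descendantsOf` are illustrative helpers working on plain objects, not part of Mongoose.

```javascript
// Create a child that carries its whole ancestor chain, copied from the parent.
function makeChild(parent, id, name, isFile) {
  return {
    _id: id,
    name,
    isFile,
    location: parent._id,
    ancestors: [...parent.ancestors, { _id: parent._id, name: parent.name }],
  };
}

// Parent folder names are available from the node itself, with no extra lookups.
function breadcrumb(node) {
  return node.ancestors.map((ancestor) => ancestor.name);
}

// In-memory equivalent of a query like `find({ 'ancestors._id': folderId })`:
// everything below a folder at any depth, in a single pass.
function descendantsOf(allNodes, folderId) {
  return allNodes.filter((node) =>
    node.ancestors.some((ancestor) => ancestor._id === folderId)
  );
}

// Invented sample data.
const root = { _id: 'root001', name: 'home', isFile: false, location: null, ancestors: [] };
const folder = makeChild(root, 'docs002', 'docs', false);
const file = makeChild(folder, 'file003', 'notes.txt', true);
```

Reads become a single document fetch, while the cost moves to writes: a rename or move has to touch every document returned by `descendantsOf`.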
In your case it seems clear that you should optimize for the read and not for the rare update. In addition to being less frequent, renames could be done asynchronously, whereas that's simply not possible when displaying the names of parent folders.
While DRY is a great principle in programming, it's pretty much not applicable to non-relational databases. Unless you are using a strictly relational database in normal form, don't apply it to your schema design. In fact, it would be specifically discouraged in MongoDB, as you would then be using the wrong tool for the job.