MongoDB One to Many Relationship

Tags:

mongodb

I’m starting to learn MongoDB and I at one moment I was asking myself how to solve the “one to many” relationship design in MongoDB. While searching, I found many comments in other posts/articles like ” you are thinking relational “. Ok, I agree. There will be some cases like duplication of information won’t be a problem, like in for example, CLIENTS-ORDERS example.

But, suppose you have the tables: ORDERS, that has an embedded DETAIL structure with the PRODUCTS that a client bought. So for one thing or another, you need to change a product name (or another kind of information) that is already embedded in several orders.

At the end, you are force to do a one-to-many relashionship in MongoDB (that means, putting the ObjectID field as link to another collection) so you can solve this simple problem, don’t you ? But every time I found some article/comment about this, it says that will be a performance fault in Mongo. It’s kind of disappointing

Is there another way to solve/design this without performance fault in MongoDB ?

456

asked Aug 25 '14 12:08

tapatio

2 Answers

The problem is that you over normalize your data. An order is defined by a customer, who lives at a certain place at the given point in time, pays a certain price valid at the time of the order (which might heavily change over the application lifetime and which you have to document anyway and several other parameters which are all valid only in a certain point of time. So to document an order (pun intended), you need to persist all data for that certain point in time. Let me give you an example:

{ _id: "order123456789",
  date: ISODate("2014-08-01T16:25:00.141Z"),
  customer: ObjectId("53fb38f0040980c9960ee270"),
  items:[ ObjectId("53fb3940040980c9960ee271"),
          ObjectId("53fb3940040980c9960ee272"),
          ObjectId("53fb3940040980c9960ee273")
         ],
 Total:400
 }

Now, as long as neither the customer nor the details of the items change, you are able to reproduce where this order was sent to, what the prices on the order were and alike. But now what happens if the customer changes it's address? Or if the price of an item changes? You would need to keep track of those changes in their respective documents. It would be much easier and sufficiently efficient to store the order like:

{
  _id: "order987654321",
  date: ISODate("2014-08-01T16:25:00.141Z"),
  customer: {
               userID: ObjectId("53fb3940040980c9960ee283"),
               recipientName: "Foo Bar"
               address: {
                          street: "742 Evergreen Terrace",
                          city: "Springfield",
                          state: null
                         }
            },
  items: [
    {count:1, productId:ObjectId("53fb3940040980c9960ee300"), price: 42.00 },
    {count:3, productId:ObjectId("53fb3940040980c9960ee301"), price: 0.99},
    {count:5, productId:ObjectId("53fb3940040980c9960ee302"), price: 199.00}
  ]
}

With this data model and the usage of aggregation pipelines, you have several advantages:

You don't need to independently keep track of prices and addresses or name changes or gift buys of a customer - it is already documented.
Using aggregation pipelines, you can create a price trends without the need of storing pricing data independently. You simply store the current price of an item in an order document.
Even complex aggregations such as price elasticity, turnover by state / city and alike can be done using pretty simple aggregations.

In general, it is safe to say that in a document oriented database, every property or field which is subject to change in the future and this change would create a different semantic meaning should be stored inside the document. Everything which is subject to change in the future but doesn't touch the semantic meaning (the users password in the example) may be linked via a GUID.

110

answered Oct 07 '22 17:10

Markus W Mahlberg

One to Many Relations

In this relationship, there is many, many entities or many entities that map to the one entity. e.g.: - a city have many persons who live in that city. Say NYC have 8 million people.

Let's assume the below data model:

 
  //city
  {
  _id: 1,
  name: 'NYC',
  area: 30,
  people: [{
      _id: 1,
      name: 'name',
      gender: 'gender'
        .....
    },
    ....
    8 million people data inside this array
    ....
  ]
}

This won't work because that's going to be REALLY HUGE. Let's try to flip the head.


 //people
 {
 _id: 1,
 name: 'John Doe',
 gender: gender,
 city: {
     _id: 1,
     name: 'NYC',
     area: '30'
       .....
   }
}

Now the problem with this design is that if there are obviously multiple people living in NYC, so we've done a lot of duplication for city data.

Probably, the best way to model this data is to use true linking.


 //people
 {
 _id: 1,
 name: 'John Doe',
 gender: gender,
 city: 'NYC'
}

//city
{
_id: 'NYC',
...
}

In this case, people collection can be linked to the city collection. Knowing we don't have foreign key constraints, we've to be consistent about it. So, this is a one to many relation. It requires 2 collections. For small one to few (which is also one to many), relations like blog post to comments. Comments can be embedded inside post documents as an array.

So, if it's truly one to many, 2 collections works best with linking. But for one to few, one single collection is generally enough.

answered Oct 07 '22 18:10

xameeramir

Related questions
                            
                                Reading, writing and storing JSON with Node on Heroku
                            
                                MongoDB queries with null value
                            
                                MongoDb : How to insert additional object into object collection?
                            
                                what is the default batchSize in pymongo?
                            
                                How to group date quarterly wise?
                            
                                oplog enable on standalone mongod not for replicaset
                            
                                Argument passed in must be a single String of 12 bytes
                            
                                Mongo Restart Error -- /var/run/mongodb/mongod.pid exists
                            
                                Mongoose $push keeps adding two entries
                            
                                Querying with Redis?
                            
                                Abstract layer for Node.js database
                            
                                pymongo method of getting statistics for collection byte usage?
                            
                                Mongodb Convention packs
                            
                                MongoDB .NET not generating _id on upsert
                            
                                Sanitize user input in Mongoose
                            
                                ZonedDateTime with MongoDB
                            
                                Real-time web apps with mongodb and nodejs
                            
                                How does mongoose populate work under the hood
                            
                                Handling MongoDB's ISODate() when attempting to parse a serialized JSON string
                            
                                Mongo tailable cursors vs Redis pub/sub

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With