I currently have a database with two tables called Articles and Tags . In order to allow articles to be in multiple categories i have a many to many relationship. Is it a mistake to have such a design in terms of performance? or should i remove the relationship between these two table and add a third table as a bridge (articlesTags)?
A many-to-many relationship occurs when multiple records in one table are related to multiple records in another table. For example, products and suppliers: one supplier may deliver one or many products and at the same time, the company may order one product from one or many suppliers.
Junction table. When you need to establish a many-to-many relationship between two or more tables, the simplest way is to use a Junction Table. A Junction table in a database, also referred to as a Bridge table or Associative Table, bridges the tables together by referencing the primary keys of each data table.
Many-to-many (m:n) relationships add complexity and confusion to your model and to the application development process. The key to resolve m:n relationships is to separate the two entities and create two one-to-many (1:n) relationships between them with a third intersect entity.
Your example is a many-to-many relationship: a customer can order many food items, and the same food items can be ordered by many customers. If you model it as a one-to-many relationship, you are either saying that the customer can order only one item, or that the item can be ordered by one customer only.
There's nothing inherently wrong with having a many-to-many relationship, you'll just need to create a Junction Table (which is what it sounds like you're referring to with articlesTags) to facilitate that relationship.
You're seeing the difference between a conceptual database design (the N:N relationship) and it's physical realisation. No matter how you model your N:N relationship, you'll need the aforementioned Junction Table to make it work.
There's nothing wrong with modeling a real world relationship as close to the real world as you can as a general statement. Clarity is king.
When it comes to any performance question in any system the answer usually boils down to "it depends".
If your performance problem is related to WRITES then a highly NORMALISED structure is best and you'll want that Junction table. You'll end up writing a lot less data and that can speed things up substantially (though you may burn that advantage by having to do lookups before you create the inserts). Reading from the individual normalised tables can be very fast too.
If your problem is related to analytical READS then a DENORMALISED structure is best. Joins can be very performance intensive if the tables are large and the indices spread out. You'll sacrifice a lot of space to gain a lot of time.
In general, you want to look at the specifics of your situation and weigh the pros and cons of each approach before deciding on a solution. Personally, I've always found it better to focus on Clarity in the initial stages and refactor for performance if I discover a problem later.
A many-to-many relationship exists in a relationnal model, it's just an abstraction of the mind. When you'll implement it there will be a articles_to_tags table where you'll have :
fk_article(INTEGER) fk_tag(INTEGER)
cf http://en.wikipedia.org/wiki/Many-to-many_(data_model)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With