Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Relational Database - How to decide whether to store data or calculate data?

Let's say you have two tables in a database, one for golf players and one for golf holes, along with an API that has to return something like total fairways that player has hit in his/her life. Is it best practice to have the API look at every hole to calculate the number for fairways hit, or just store fairways hit directly in the players table? It seems as though storing this data in the players table is basically duplicating the data, as it already lives in each hole. But in order to calculate it, you need to go through every hole that player has played, every time.

More generally, is this simply a situation where you need to balance correct data design with performance?

I realize this may require a subjective answer (sorry if it does), but I don't know enough about database design to know if their is a definitive answer to situations like this.

like image 573
Brad Avatar asked Oct 31 '25 15:10

Brad


1 Answers

More generally, is this simply a situation where you need to balance correct data design with performance?

TL;DR "Yes".

The relational model has nothing to do with performance. The relational model of data is a formal theory; it's one of many data models.

A data model is an abstract, self-contained, logical definition of the objects, operators, and so forth, that together constitute the abstract machine with which users interact.[1]

An abstract machine has no performance problems, because it doesn't exist in a physical sense. That's why, for example, the relational model says nothing about indexes.

On the other hand, a SQL database has a lot do do with performance. A SQL database has a physical implementation whose performance is affected by the number of cores; amount of memory; disk space, configuration, and spindle speed; number of concurrent users; indexes; and so on.

The difference is the difference between logical and physical, between abstract and concrete, and between principle and practice.

So yes, you need to balance clean design with performance. Everybody does.

The best way to do that is to "do a clean logical (i.e. relational) design first, and then, as a separate and subsequent step, to map that logical design into whatever physical structures the target DMBS happens to support."[2]

If you have to store the result of calculations, best practice is to make the SQL DBMS maintain consistency. For example, if you have to store the result of (quantity * price) + sales_tax, write a CHECK() constraint to guarantee consistency. Some DBMS don't support CHECK() constraints.

If you have to maintain tallys (counts) across many rows, use a materialized view. Some DBMS don't support materialized views.

In the worst case, you can only use a report that a person reads to determine whether an inconsistency has crept in. The person takes corrective action.

In all cases, measure performance of representative INSERT, UPDATE, and DELETE statements before and after making changes.

Is it best practice to have the API look at every hole to calculate the number for fairways hit, or just store fairways hit directly in the players table?

There are many statistics. You should probably store statistics in one or more additional tables. SQL DBMS don't have to "look at every hole"; they operate on sets.

But in order to calculate it, you need to go through every hole that player has played, every time.

No, you don't need to "go through every hole", at least not in the sense of iterating through every hole, although that's exactly what a lot of front-end application frameworks do. You just need a single SQL query like select count(*) from player_holes where fairway_hit = True;.


[1] An Introduction to Database Systems, 7th ed, C. J. Date, p 14

[2] ibid, p 327.

like image 170
Mike Sherrill 'Cat Recall' Avatar answered Nov 03 '25 19:11

Mike Sherrill 'Cat Recall'



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!