I'm currently trying to model an aspect of a system whereby components that are stored can change state, eg OK, FAILED, REPAIRED etc. On the web side I will need to show the current state, but also the history of previous (if any) states.
I'm torn between these two designs, could anyone shed any light on the best way (I'm more a software dev than a dba guy).
Option one:
statehistory table which tracks each time the state changes, the highest sequence number will be the current state : SQLFiddle example
Option two: Similar to above, except the current state is stored in the component table, and only past states are in the history table. When state is changed the current state is inserted as the most recent in history then the current is set in the component table: SQLFiddle example
As an aside, use either one or two but without the state lookup table, just store the state text as varchar (my thinking is this makes it easier to report from?): SQLFiddle example
Thanks.
EDIT:
There are several component tables, should the state history table contain the data for all of them, or make a statehistory table per component? Each components table will have hundreds of thousands of entries, making the statehistory table pretty large.
eg: Table: component_a Table: component_b etc.. statehistory ( component_a_id, component_b_id, state_id, ... )
I tend to do a hybrid between the two. I always store all state changes including the current state in the history table. That gives you a central place to query them. You can have a column IsCurrent BIT NOT NULL
to make your life a little easier. Create a filtered unique index with filter IsCurrent = 1
to enforce basic integrity rules.
I also store the current state in the main table. Probably not just a copy but as a foreign key to the history table. That makes for very convenient querying. Looking up the current state is often useful. For indexing reasons you can also duplicate the values into the main table, of course. The more duplication you have the more error prone the system.
If you want to avoid duplication but still index on the current status, you can create an indexed view to combine main and history tables. You can then create an index on mixed columns from both tables (e.g. on (StatusHistoryItems.Status
, Components.Name
) to support queries that ask for customers with a specific status and a specific name. This query would be resolved as a single index seek on the view's index).
You'd create a view like this:
SELECT *
FROM Components c
JOIN StatusHistoryItems shi on c.ID = shi.ComponentID
AND c.IsCurrent = 1 --this condition will join exactly one row
And index it. Now you have the current status together with all component data in one efficient index. No duplication, no denormalization at all. Just make sure that there is at least one status row for each customer with IsCurrent = 1
.
I recommend having a nightly validation job that validates data consistency and alerts you of problems. Denormalized data has a habit of becoming corrupted over time for various reasons.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With