Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the best way to optimize schema for capturing attendance data

We have a sports training camp which is regularly attended by various teams in the city. We have a session per day spanning 2 hrs(9-11 AM) and the time slots could vary for different teams. We would like to capture who attended the training camp on a daily basis.

We arrived at the following model to capture attendance. (id, user_id, date, present). Assuming the user attends camp daily (say 30 days in a month), you will see that many records in the database.

Assuming we are interested only in finding out the number of days the user has attended the camp, is there a better way to mark presence or absence of a particular user (maybe just have a single row for a month and mark all the individual days as something like (P,P,P,A, ...,A,P). P = Present, A = Absent

like image 448
Sam Avatar asked Dec 05 '25 00:12

Sam


1 Answers

You use the word "optimize" in the question title without explaining what it is you want to optimize.

If you're talking about query performance then you do not have a problem. The number of records you can have is governed by the number of sessions you have each day (because only one team can attend any given session). If you run ten sessions a day that's three hundred records per month. If you run one hundred sessions a day that is three thousand records a month. These are not big volumes of data. So you are making a bad decision by skewing your database design to avoid a performance problem which isn't there.

You mentioned spreadsheets in one of your comments. That is not a bad design to have. Along the top row there are sessions, down the side there are teams, and the cells show whether a team was present at a session. Those map to three database tables: SESSIONS, TEAMS and the intersection table TEAM_SESSIONS. You only need a record in TEAM_SESSIONS when a team attended a session.

As a proof of concept I knocked up three tables in Oracle.

SQL> desc teams
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL NUMBER
 NAME                                               VARCHAR2(20 CHAR)

SQL> desc sessions
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL NUMBER
 SSN_DAY                                            DATE
 SSN_START                                          NUMBER(4,2)
 SSN_END                                            NUMBER(4,2)

SQL> desc team_sessions
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 TEAM_ID                                   NOT NULL NUMBER
 SESSION_ID                                NOT NULL NUMBER

SQL>

The PIVOT function introduced in Oracle 11g makes it a cinch to knock up a matrix (different flavours of DBMS will have different ways to approach this). As you can see, three teams have booked sessions today, nobody wants to train at lunchtime, and Bec United are keen as mustard (or need the training)!

SQL> select * from (
  2      select t.name as team_name
  3             , trim(to_char(s.ssn_start))||'-'||trim(to_char(s.ssn_end)) as ssn
  4             , case when ts.team_id is not null then 1 else 0 end as present
  5      from   sessions s
  6             cross join teams t
  7             left outer join team_sessions ts
  8                  on (ts.team_id = t.id
  9                      and ts.session_id = s.id )
 10      where s.ssn_day = trunc(sysdate)
 11      )
 12  pivot
 13      ( sum (present)
 14        for ssn in ( '9-11', '11-13', '13-15', '15-17', '17-19')
 15      )
 16  order by team_name
 17  /

TEAM_NAME                '9-11'    '11-13'    '13-15'    '15-17'    '17-19'
-------------------- ---------- ---------- ---------- ---------- ----------
Balham Blazers                0          1          0          0          0
Bec United                    1          0          0          0          1
Dinamo Tooting                0          0          0          0          0
Melchester Rovers             0          0          0          1          0

SQL>

Anyway, the virtue of this data model is that it is flexible. We can count how often a team attends, what times they attend, what day of the week they attend, what sessions are always booked, what sessions are rarely booked, etc. Plus it is easy to manage the data. In particular, the advantage of the three table solution over just two tables is that it is easier to prevent double bookings and non-standard or overlapping time slots.

You see, normalisation isn't just some moon language we use to bamboozle the innocent, it offers real practical benefits. There are few scenarios where driving down to at least BCNF is not the best idea.

like image 62
APC Avatar answered Dec 09 '25 02:12

APC



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!