In MySQL, you can create an index to be used in a query to prevent full tables scans. Only one index can be used.
Furthermore, in order to use the index, the field(s) indexed cannot be run through a function (i.e. DATE(), MONTH(), YEAR()) because then the query optimizer won't know what the result will be so can't use the index and will fall back to a full (or partial) table scan instead.
Assuming you wanted to run a report that grouped entries by day/month/quarter/year (GROUP BY date(created_at)) how could you designed a query that would do this while still using an index?
Example Table:
CREATE TABLE `datesort` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`value` int(11) NOT NULL,
`created_at` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `c_v` (`created_at`,`value`)
) ENGINE=InnoDB;
-- Problem Query
EXPLAIN SELECT COUNT(*), `value`, created_at
FROM datesort
WHERE created_at > NOW() - INTERVAL 1 DAY
GROUP BY date(created_at), value;
-- Using where; Using index; Using temporary; Using filesort
vs
EXPLAIN SELECT COUNT(*), `value`, created_at
FROM datesort
WHERE created_at > NOW() - INTERVAL 1 DAY
GROUP BY created_at, value;
-- Using where; Using index
-- (notice no DATE() in GROUP BY)
Notice the first query has to result to a partial table scan (Using temporary; Using filesort) because it can't use the c_v index because of the DATE(created_at).
The second query doesn't sort by date (it sorts by seconds) but can use the index alone without resulting to reading the record data.
Since grouping by time periods is pretty common with reports, how can I group records by day/month/quarter/year using just the index?
Extending on helpful comments by WOUNDEDStevenJones and Rick James: you could create a generated column that stores the date part of each record (without the time component) and index it.
alter table datesort
add column date_created_at date
generated always as (date(created_at)) stored
;
create index myidx on datesort(date_created_at, value);
Now you can try your query again. To get the full benefit of the index, you would ideally need to change the where clause so it uses the generated date column rather than the original datetime column (hopefully, this still fits your use case):
select count(*) cnt, value, date_created_at
from datesort
where date_created_at > current_date - interval 1 day
group by date_created_at, value;
This produces the expected explain:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra -: | :---------- | :------- | :--------- | :---- | :------------ | :---- | :------ | :--- | ---: | -------: | :----------------------- 1 | SIMPLE | datesort | null | index | myidx | myidx | 8 | null | 1 | 100.00 | Using where; Using index
Demo on DB Fiddle
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With