I have a table contains ~2M rows. It has B-tree primary key on (id,round), and 2 other (seems irrelevant) indexes.
My questions is:
insert into a1
select * from cars
where (id, round) in (select id, min(round) from cars group by id)
The above query will take ~30s to run but why the following query
insert into a1
select * from cars
where (id, round) in (select id, max(round) from cars group by id)
seems to take forever to run? I've waited hours. The only thing I changed is min->max
PS: here's the create table
CREATE TABLE `cars` (
`id` int(11) NOT NULL,
`make` varchar(128) NOT NULL,
`miles` varchar(128) NOT NULL,
`time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`round` int(11) NOT NULL,
`location` varchar(50) DEFAULT NULL,
PRIMARY KEY (`id`,`round`),
KEY `time` (`time`),
KEY `make` (`make`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
IN subqueries are not optimized in the best way by current MySQL query optimizer.
I would use:
INSERT INTO a1
SELECT c.*
FROM
cars AS c
JOIN
( SELECT id, MAX(round) AS round
FROM cars
GROUP BY id
) AS m
ON (m.id, m.round) = (c.id, c.round)
The index you already have on (id, round) will make the subquery run pretty fast and only once (as a derived table). The other way, with the IN, the subquery is run once for every row of the cars table (that's why it's slow). Check the execution plans with EXPLAIN.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With