Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Insert query optimization

I have a table contains ~2M rows. It has B-tree primary key on (id,round), and 2 other (seems irrelevant) indexes.

My questions is:

insert into a1
select * from cars
where (id, round) in (select id, min(round) from cars group by id)

The above query will take ~30s to run but why the following query

insert into a1
select * from cars
where (id, round) in (select id, max(round) from cars group by id)

seems to take forever to run? I've waited hours. The only thing I changed is min->max

PS: here's the create table

CREATE TABLE `cars` (
  `id` int(11) NOT NULL,
  `make` varchar(128) NOT NULL,
  `miles` varchar(128) NOT NULL,
  `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `round` int(11) NOT NULL,
  `location` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`,`round`),
  KEY `time` (`time`),
  KEY `make` (`make`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
like image 865
Tao Song Avatar asked Jun 06 '26 11:06

Tao Song


1 Answers

IN subqueries are not optimized in the best way by current MySQL query optimizer.

I would use:

INSERT INTO a1
  SELECT c.* 
  FROM 
        cars AS c
    JOIN 
        ( SELECT id, MAX(round) AS round
          FROM cars 
          GROUP BY id
        ) AS m
      ON  (m.id, m.round) = (c.id, c.round) 

The index you already have on (id, round) will make the subquery run pretty fast and only once (as a derived table). The other way, with the IN, the subquery is run once for every row of the cars table (that's why it's slow). Check the execution plans with EXPLAIN.

like image 196
ypercubeᵀᴹ Avatar answered Jun 08 '26 23:06

ypercubeᵀᴹ