I have a MySQL query which joins between the two table. I need to map call id from first table with second table. Second table may not have the call id, hence I need to left join the tables. Below is the query, it takes around 125 seconds to finish.
select uniqueid, TRANTAB.DISP, TRANTAB.DIAL FROM
closer_log LEFT JOIN
(select call_uniqueId, sum(dispo_duration) as DISP, sum(dialing_duration) as DIAL
        from agent_transition_log  group by call_uniqueId) TRANTAB
on closer_log.uniqueid=TRANTAB.call_uniqueId;
Here is the explain output of the query with left join.
+----+-------------+----------------------+-------+---------------+----------------------------+---------+------+--------+-------------+
| id | select_type | table                | type  | possible_keys | key                        | key_len | ref  | rows   | Extra       |
+----+-------------+----------------------+-------+---------------+----------------------------+---------+------+--------+-------------+
|  1 | PRIMARY     | closer_log           | index | NULL          | uniqueid                   | 43      | NULL |  37409 | Using index |
|  1 | PRIMARY     | <derived2>           | ALL   | NULL          | NULL                       | NULL    | NULL |  32535 |             |
|  2 | DERIVED     | agent_transition_log | index | NULL          | index_agent_transition_log | 43      | NULL | 159406 |             |
+----+-------------+----------------------+-------+---------------+----------------------------+---------+------+--------+-------------+
If I do the internal join, then execution time is around 2 seconds.
select uniqueid, TRANTAB.DISP, TRANTAB.DIAL FROM
closer_log JOIN
(select call_uniqueId, sum(dispo_duration) as DISP, sum(dialing_duration) as DIAL
        from agent_transition_log  group by call_uniqueId) TRANTAB
on closer_log.uniqueid=TRANTAB.call_uniqueId;
Explain output of query with internal join.
    +----+-------------+----------------------+-------+------------------------------------+----------------------------+---------+-----------------------+--------+--------------------------+
| id | select_type | table                | type  | possible_keys                      | key                        | key_len | ref                   | rows   | Extra                    |
+----+-------------+----------------------+-------+------------------------------------+----------------------------+---------+-----------------------+--------+--------------------------+
|  1 | PRIMARY     | <derived2>           | ALL   | NULL                               | NULL                       | NULL    | NULL                  |  32535 |                          |
|  1 | PRIMARY     | closer_log  | ref   | uniqueid,index_closer_log | index_closer_log  | 43      | TRANTAB.call_uniqueId |      1 | Using where; Using index |
|  2 | DERIVED     | agent_transition_log | index | NULL                               | index_agent_transition_log | 43      | NULL                  | 159406 |                          |
+----+-------------+----------------------+-------+------------------------------------+----------------------------+---------+-----------------------+--------+--------------------------+
My question is, why is internal join so much faster then left join. Does my query has any logical fault which is causing the slow execution? What are my optimization options. The call ids in both the tables are indexed.
Edit 1) Added table descriptions
mysql> desc agent_transition_log;
+--------------------+----------------------+------+-----+---------+-------+
| Field              | Type                 | Null | Key | Default | Extra |
+--------------------+----------------------+------+-----+---------+-------+
| user_log_id        | int(9) unsigned      | NO   | MUL | NULL    |       |
| event_time         | datetime             | YES  |     | NULL    |       |
| dispoStatus        | varchar(6)           | YES  |     | NULL    |       |
| call_uniqueId      | varchar(40)          | YES  | MUL | NULL    |       |
| xfer_call_uid      | varchar(40)          | YES  |     | NULL    |       |
| pause_duration     | smallint(5) unsigned | YES  |     | 0       |       |
| wait_duration      | smallint(5) unsigned | YES  |     | 0       |       |
| dialing_duration   | smallint(5) unsigned | YES  |     | 0       |       |
| ring_wait_duration | smallint(5) unsigned | YES  |     | 0       |       |
| talk_duration      | smallint(5) unsigned | YES  |     | 0       |       |
| dispo_duration     | smallint(5) unsigned | YES  |     | 0       |       |
| park_duration      | smallint(5) unsigned | YES  |     | 0       |       |
| rec_duration       | smallint(5) unsigned | YES  |     | 0       |       |
| xfer_wait_duration | smallint(5) unsigned | YES  |     | 0       |       |
| logged_in_duration | smallint(5) unsigned | YES  |     | 0       |       |
| sub_status         | varchar(6)           | YES  |     | NULL    |       |
+--------------------+----------------------+------+-----+---------+-------+
16 rows in set (0.00 sec)
mysql> desc closer_log;
+----------------+----------------------+------+-----+---------+----------------+
| Field          | Type                 | Null | Key | Default | Extra          |
+----------------+----------------------+------+-----+---------+----------------+
| closecallid    | int(9) unsigned      | NO   | PRI | NULL    | auto_increment |
| lead_id        | int(9) unsigned      | NO   | MUL | NULL    |                |
| list_id        | bigint(14) unsigned  | YES  |     | NULL    |                |
| campaign_id    | varchar(20)          | YES  | MUL | NULL    |                |
| call_date      | datetime             | YES  | MUL | NULL    |                |
| start_epoch    | int(10) unsigned     | YES  |     | NULL    |                |
| end_epoch      | int(10) unsigned     | YES  |     | NULL    |                |
| length_in_sec  | int(10)              | YES  |     | NULL    |                |
| status         | varchar(6)           | YES  |     | NULL    |                |
| phone_code     | varchar(10)          | YES  |     | NULL    |                |
| phone_number   | varchar(18)          | YES  | MUL | NULL    |                |
| user           | varchar(20)          | YES  |     | NULL    |                |
| comments       | varchar(255)         | YES  |     | NULL    |                |
| processed      | enum('Y','N')        | YES  |     | NULL    |                |
| queue_seconds  | decimal(7,2)         | YES  |     | 0.00    |                |
| user_group     | varchar(20)          | YES  |     | NULL    |                |
| xfercallid     | int(9) unsigned      | YES  |     | NULL    |                |
| uniqueid       | varchar(40)          | YES  | MUL | NULL    |                |
| callerid       | varchar(40)          | YES  |     | NULL    |                |
| agent_only     | varchar(20)          | YES  |     |         |                |
| queue_position | smallint(4) unsigned | YES  |     | 1       |                |
| root_uid       | varchar(40)          | YES  |     | NULL    |                |
| parent_uid     | varchar(40)          | YES  |     | NULL    |                |
| extension      | varchar(100)         | YES  |     | NULL    |                |
| alt_dial       | varchar(6)           | YES  |     | NULL    |                |
| talk_duration  | smallint(5) unsigned | YES  |     | 0       |                |
| did_pattern    | varchar(50)          | YES  |     | NULL    |                |
+----------------+----------------------+------+-----+---------+----------------+
Left join looks for the fields from left + unmatched entries from right, so it has to check every joined field in the right table which might be NULL (if you don't have an index on the fields for that JOIN, it means the query will check the whole right table every time). Inner join looks only for direct matches, so it might not have to go over the whole table to perform a join (Especially if you join on indexed fields).
By the way, if you only want to display the entries mentioned in agent_transition_log, you don't need join at all:
select call_uniqueId, sum(dispo_duration) as DISP, sum(dialing_duration) as DIAL
from agent_transition_log  group by call_uniqueId;
will do the job.
OR if you do want to add the missing entries:
SELECT call_uniqueId, sum(dispo_duration) as DISP, sum(dialing_duration) as DIAL
from agent_transition_log  group by call_uniqueId
UNION
SELECT uniqueid as call_uniqueid, NULL as DISP, NULL as DIAL from closer_log
WHERE uniqueid not in (SELECT call_uniqueid FROM agent_transition_log);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With