I'm not a MySQL guy, but I've recently got the opportunity to work with it - I have to optimize a query on MySQL 5.7 (I also have to cover 5.6, but I started with 5.7 as it apparently exposes more info in EXPLAIN) running on AWS Aurora. It takes a lot of time, some joins are involved, etc. I started by cutting branches and chose to begin 'debugging' with two tables only. They are not too big (~2M and ~1.5M rows), but, well, I don't consider them particularly well designed (primary keys on varchar(255) columns, etc.).
The thing is that I wanted to take a look into what is actually happening, so I used explain format=json and I'm trying to get any insight from that.
So, let's say that for
select cc.id, cc.col1, cc.col2, ct.col1
from my_table cc
inner join my_table ct on ct.cc_id = cc.id
I am getting something like
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "5630369.56"
},
"nested_loop": [
{
"table": {
"table_name": "cc",
"access_type": "index",
"possible_keys": [
"PRIMARY"
],
"key_length": "258",
"rows_examined_per_scan": 1248725,
"rows_produced_per_join": 1248725,
"filtered": "100.00",
"using_index": true,
"cost_info": {
"read_cost": "3732979.00",
"eval_cost": "249745.00",
"prefix_cost": "3982724.00",
"data_read_per_join": "1G"
}
}
},
{
"table": {
"table_name": "ct",
"access_type": "ref",
"possible_keys": [
"cc_id_idx"
],
"key": "cc_id_idx",
"key_length": "257",
"ref": [
"cc.id"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1373037,
"filtered": "100.00",
"cost_info": {
"read_cost": "1373037.97",
"eval_cost": "274607.59",
"prefix_cost": "5630369.56",
"data_read_per_join": "3G"
},
"used_columns": [
"id",
"col_1",
"col_2",
...
]
}
}
]
}
}
and I have problems understanding what happens here exactly. I think that MySQL does a nested loop join where it starts with table cc and then, for each row, it reaches into ct (access_type is ref and "rows_examined_per_scan": 1). It does that > 1M times ("rows_produced_per_join": 1373037). Is that correct? I was looking for documentation on this, but I didn't find anything specific about how to read these values in the context of a join - maybe my google-fu just isn't strong enough. Could anyone give me a clue about that?
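For what it's worth, I assume those estimates should roughly correspond to actual counts like these (using the simplified table/column names from above, not the real schema):

-- actual counts to compare with the plan's estimates
select count(*) from my_table;                -- vs. "rows_examined_per_scan" of cc
select count(*)
from my_table cc
inner join my_table ct on ct.cc_id = cc.id;   -- vs. "rows_produced_per_join" of ct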
(As I'm looking for any options to speed it up, I wanted to force MySQL to do a hash join, which is available on Aurora (I was trying on Aurora 2.09) in the form of /*+ HASH_JOIN(cc) */, but it didn't affect the query plan at all - but that's rather a matter for another question.)
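For reference, this is roughly how I placed the hint - right after the SELECT keyword, which is where MySQL expects optimizer hint comments:

select /*+ HASH_JOIN(cc) */ cc.id, cc.col1, cc.col2, ct.col1
from my_table cc
inner join my_table ct on ct.cc_id = cc.id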
select cc.id, cc.col1, cc.col2, ct.col1
from my_table cc
inner join my_table ct on ct.cc_id = cc.id
If there were a WHERE clause, that would probably cause the Optimizer to pick the table mentioned in the WHERE as the "first" table in the JOIN.
Without a WHERE, the Optimizer usually picks the smaller table as "first".
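For example (the filter below is made up purely for illustration), a selective condition on ct would likely make the Optimizer start with ct and then probe cc through its PRIMARY KEY:

select cc.id, cc.col1, cc.col2, ct.col1
from my_table cc
inner join my_table ct on ct.cc_id = cc.id
where ct.col1 = 'some value';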
Then it usually does an NLJ (Nested Loop Join):

1. A scan (filtered by any applicable parts of the WHERE) of the "first" table.
2. For each row found, a lookup into the "second" table for the matching row(s).

That step 2 is sometimes done by reading the entire "second" table into memory and building a hash. But this only works for a "small" second table. Your table seems to be too big for that.
I bring all this up because the query you presented is simplified; the 'real' query may not be executed the same way as that EXPLAIN shows.
Some observations:
- Avoid VARCHAR(255) when some sensible, shorter length is practical. Mixing collations in a JOIN is deadly for performance. (There is no real proof that this query has that problem.)
- As for VARCHAR(255) for a PRIMARY KEY, I disagree with those who exclaim that it is terrible and that you "must" switch to an INT. Can you tell us what type of data is there? UUID vs short string vs URL vs ... -- different optimization techniques may be applicable depending on the data.
- "using_index": true on cc says that table is read entirely from the INDEX being used. This is good for performance. But you could kill that if you add another column to the SELECT clause. (See the sketch after this list.)
- Why does the EXPLAIN not match the query?
- The simple query, as written, will take a long time -- a scan of one table (or index), plus lots of BTree lookups into the other table.
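To illustrate the covering-index point above (a sketch only - the index and column names are assumptions based on the simplified query, not your real schema): if ct were only probed by cc_id and only col1 were read from it, a composite index would let that side of the join be resolved from the index alone, just like cc currently is:

-- hypothetical covering index for the ct lookup (names assumed from the simplified query)
ALTER TABLE my_table ADD INDEX cc_id_col1_idx (cc_id, col1);

With such an index, the ct side should also report "using_index": true and avoid a row lookup per probe.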
Back to your original question. Here's what I see in the Explain:
The ref lookup into ct examines just 1 row per probe ("rows_examined_per_scan": 1), which suggests that cc_id is UNIQUE.

Please provide SHOW CREATE TABLE for questions like this.
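That is, output from something like the following (run against your real tables; my_table here is just the simplified name from the question) would show the datatypes, collations, and whether the index on cc_id is UNIQUE:

SHOW CREATE TABLE my_table;
SHOW INDEX FROM my_table;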