I'm having trouble understanding LEFT JOIN results in Snowflake.
select * from
(select 1 as AccountIdA) s
LEFT OUTER JOIN (Select 1 as AccountIdB, 'Val1' as Col1 union select 1, 'Val2' as Col1) t1 on t1.AccountIdB = s.AccountIdA
LEFT OUTER JOIN (Select 1 as AccountIdC, 'Val3' as Col2) t2 on t2.AccountIdC = s.AccountIdA
The SQL above results in the following:
| AccountIdA | AccountIdB | Col1 | AccountIdC | Col2 |
|---|---|---|---|---|
| 1 | 1 | VAL1 | 1 | VAL3 |
| 1 | 1 | VAL2 | 1 | VAL3 |
However, I am expecting the following, because there is only one row match from the second left join.
| AccountIdA | AccountIdB | Col1 | AccountIdC | Col2 |
|---|---|---|---|---|
| 1 | 1 | VAL1 | 1 | VAL3 |
| 1 | 1 | VAL2 | null | null |
Is there any logic to this? Any way to get the result that I'm expecting?
so just reformed that SQL so is was less wide, and just to make sense of it.
select * from (
select 1 as AccountIdA
) s
LEFT OUTER JOIN (
SELECT * FROM VALUES (1,'val1'), (1, 'Val2') v(AccountIdB, col1)
) t1 on t1.AccountIdB = s.AccountIdA
LEFT OUTER JOIN (
SELECT * FROM VALUES (1,'val3') v(AccountIdC, col2)
) t2 on t2.AccountIdC = s.AccountIdA;
But you get two rows, because of standard SQL logic. The joins happen is serial, thus you first have s with a single AccountIdA of 1.
| AccountIdA |
|---|
| 1 |
Then we join to t1 and we have
| AccountIdA | AccountIdB | col1 |
|---|---|---|
| 1 | 1 | 'val1' |
| 1 | 1 | 'val2' |
no surprises here:
now we join to t2 on where given we have 2 rows with AccountIdA we match the AccountIdC against both of those, and given they are both 1 we match both rows.
| AccountIdA | AccountIdB | col1 | AccountIdC | col2 |
|---|---|---|---|---|
| 1 | 1 | 'val1' | 1 | 'val3' |
| 1 | 1 | 'val2' | 1 | 'val3' |
now what a LEFT JOIN gives you is if the left side is not matched by the right side we get NULL, this can swapping some things around like:
select * from (
select column1 as AccountIdA FROM VALUES (1),(2),(3)
) s
LEFT OUTER JOIN (
SELECT * FROM VALUES (1,'val1'), (2, 'Val2') v(AccountIdB, col1)
) t1 on t1.AccountIdB = s.AccountIdA
LEFT OUTER JOIN (
SELECT * FROM VALUES (3,'val3') v(AccountIdC, col2)
) t2 on t2.AccountIdC = s.AccountIdA;
now we get matches on the keys that make sense, and no matches on the others.
| ACCOUNTIDA | ACCOUNTIDB | COL1 | ACCOUNTIDC | COL2 |
|---|---|---|---|---|
| 1 | 1 | val1 | ||
| 2 | 2 | Val2 | ||
| 3 | 3 | val3 |
So this goes to show when doing JOINs you have to make sure you have de-duplicated data, otherwise you can get multiplication of the rows
select * from (
select column1 as AccountIdA FROM VALUES (1),(1)
) s
LEFT OUTER JOIN (
SELECT * FROM VALUES (1,'val1'), (1, 'Val2') v(AccountIdB, col1)
) t1 on t1.AccountIdB = s.AccountIdA
LEFT OUTER JOIN (
SELECT * FROM VALUES (1,'val3'),(1,'val4') v(AccountIdC, col2)
) t2 on t2.AccountIdC = s.AccountIdA;
gives an explosion to:
| ACCOUNTIDA | ACCOUNTIDB | COL1 | ACCOUNTIDC | COL2 |
|---|---|---|---|---|
| 1 | 1 | val1 | 1 | val3 |
| 1 | 1 | val1 | 1 | val4 |
| 1 | 1 | Val2 | 1 | val3 |
| 1 | 1 | Val2 | 1 | val4 |
| 1 | 1 | val1 | 1 | val3 |
| 1 | 1 | val1 | 1 | val4 |
| 1 | 1 | Val2 | 1 | val3 |
| 1 | 1 | Val2 | 1 | val4 |
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With