I have a table 1 (MID, SSN, ...) MID is primary key and table 2 (ID, SSN, StateCode..) where ID and SSN make up the primary key. I'm trying to display all columns from table 1 along with StateCode from table 2 matching it against SSN. Tbl 1 has 50 rows and some have same SSN values.
If no SSN match is found from table 2, displaying a NULL in StateCode is acceptable, so I chose left join. Here is my query
Select
tbl1.*, tbl2.StateCode
from
tbl1
left outer join
tbl2 on tbl1.SSN = tbl2.SSN
I'm looking to retrieve 50 records, but I get 70, rows that contain the same ssn value in tbl1 ends up duplicated in the final output. What is going wrong?
I'd suggest reading on cartesian product.
If you have 50 rows in the first table and 70 in the second that makes 3500 rows. The join condition tbl1.SSN = tbl2.SSN will filter out rows but you may well end up with more than 50 rows.
Back to your problem you can see what is happening by trying the following :
SELECT
tbl1.*,
(SELECT COUNT(*) FROM tbl2 WHERE tbl1.SSN = tbl2.SSN) AS NbResultTbl2
FROM
tbl1
This will tell which rows of tbl1 has multiple match in tbl2. If you have a number higher than 1 in the NbResultTbl2 column then you are going to end up with duplicates.
To eliminate those duplicates you can try this :
SELECT
tbl1.*,
(SELECT TOP 1 StateCode FROM tbl2 WHERE tbl1.SSN = tbl2.SSN)
FROM
tbl1
This will get the first StateCode found for a matching SNN in tbl2.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With