I am having a tough time to understand why does LEFT JOIN / IS NULL eliminate records which are there in one table and not in the other.
Here is an example
SELECT  l.id, l.value
FROM    t_left l
LEFT JOIN t_right r
ON      r.value = l.value
WHERE   r.value IS NULL
Why should r.value = NULL eliminate records ? I am not understanding . I know I am missing something very basic but at present I cant figure out even that basic one. I would appreciate if someone explains it to me in detail . 
I want a very basic explanation.
The SQL LEFT JOIN returns all rows from the left table, even if there are no matches in the right table. This means that if the ON clause matches 0 (zero) records in the right table; the join will still return a row in the result, but with NULL in each column from the right table.
The inner join clause eliminates the rows that do not match with a row of the other table. The left join, however, returns all rows from the left table whether or not there is a matching row in the right table.
The second way to find non-matching records between tables is to use NOT in conjunction with the IN operator. The IN operator allows you to specify multiple values in a WHERE clause, much like a string of ORs chained together.
Since it's not possible to join on NULL values in SQL Server like you might expect, we need to be creative to achieve the results we want. One option is to make our AccountType column NOT NULL and set some other default value. Another option is to create a new column that will act as a surrogate key to join on instead.
This could be explained with the following
mysql> select * from table1 ;
+------+------+
| id   | val  |
+------+------+
|    1 |   10 |
|    2 |   30 |
|    3 |   40 |
+------+------+
3 rows in set (0.00 sec)
mysql> select * from table2 ;
+------+------+
| id   | t1id |
+------+------+
|    1 |    1 |
|    2 |    2 |
+------+------+
2 rows in set (0.00 sec)
Here table1.id <-> table2.t1id
Now when we do a left join with the joining key and if the left table is table1 then it will get all the data from table1 and in non-matching record on table2 will be set to null
mysql> select t1.* , t2.t1id from table1 t1 
left join table2 t2 on t2.t1id = t1.id ;
+------+------+------+
| id   | val  | t1id |
+------+------+------+
|    1 |   10 |    1 |
|    2 |   30 |    2 |
|    3 |   40 | NULL |
+------+------+------+
3 rows in set (0.00 sec)
See that table1.id = 3 does not have a value in table2 so its set as null When you apply the where condition it will do further filtering
mysql> select t1.* , t2.t1id from table1 t1 
left join table2 t2 on t2.t1id = t1.id where t2.t1id is null;
+------+------+------+
| id   | val  | t1id |
+------+------+------+
|    3 |   40 | NULL |
+------+------+------+
1 row in set (0.00 sec)
Let's assume r-table is employees, and r_table is computers. Some employees don't have computers. Some computers are not assigned to anyone yet.
Inner join:
SELECT  l.*, r.*
FROM    employees l
JOIN computers r
ON      r.id = l.comp_id
gives you the list of all employees who HAVE a computer, and the info about computer assigned for each of them. The employees without a computer will NOT appear on this list.
Left join:
SELECT  l.*, r.*
FROM    employees l
LEFT JOIN computers r
ON      r.id = l.comp_id
gives you the list of ALL employees. The employees with computer will show the computer info. The employees without computers will appear with NULLs instead of computer info.
Finally
SELECT  l.*, r.*
FROM    employees l
LEFT JOIN computers r
ON      r.id = l.comp_id
WHERE   r.id IS NULL
Left join with the WHERE clause will start with the same list as the left join (2), but then it will keep only those employees that do not have corresponding information in the computer table, that is, employees without the computers.
I this case, selecting anything from the r table will be just nulls, so you can leave those fields out and select only stuff from the l table:
SELECT  l.*
FROM ...
Try this sequence of selects and observe the output. Each next step builds on the previous one.
Please let me know if this explanation is understandable, or you'd like me to elaborate some more.
EDITED TO ADD: Here's sample code to create the two tables used above:
CREATE TABLE employees
(  id INT NOT NULL PRIMARY KEY,
   name VARCHAR(20),
   comp_id INT);
INSERT INTO employees (id, name, comp_id) VALUES (1, 'Becky', 1);
INSERT INTO employees (id, name, comp_id) VALUES (2, 'Anne', 7);
INSERT INTO employees (id, name, comp_id) VALUES (3, 'John', 3);
INSERT INTO employees (id, name) VALUES (4, 'Bob');
CREATE TABLE computers
(  id INT NOT NULL PRIMARY KEY,
   os VARCHAR(20) );
INSERT INTO computers (id, os) VALUES (1,'Windows 7');
INSERT INTO computers (id, os) VALUES (2,'Windows XP');
INSERT INTO computers (id, os) VALUES (3,'Unix');
INSERT INTO computers (id, os) VALUES (4,'Windows 7');
There are 4 employees. Becky and John have computers. Anne and Bob do not have a computer. (Anne, has a comp_id 7, which doesn't correspond to any row in computers table - so, she doesn't really have a computer.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With