Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Join between tables with conditions

I'm thinking about which should be the best way (considering the execution time) of doing a join between 2 or more tables with some conditions. I got these three ways:

FIRST WAY:

select * from 
TABLE A inner join TABLE B on A.KEY = B.KEY
where 
B.PARAM=VALUE

SECOND WAY

select * from 
TABLE A inner join TABLE B on A.KEY = B.KEY 
and B.PARAM=VALUE

THIRD WAY

select * from
TABLE A inner join (Select * from TABLE B where B.PARAM=VALUE) J ON A.KEY=J.KEY

Consider that tables have more than 1 milion of rows.

What your opinion? Which should be the right way, if exists?

like image 589
user2010955 Avatar asked Oct 23 '25 10:10

user2010955


2 Answers

Usually putting the condition in where clause or join condition has no noticeable differences in inner joins.
If you are using outer joins ,putting the condition in the where clause improves query time because when you use condition in the where clause of left outer joins, rows which aren't met the condition will be deleted from the result set and the result set becomes smaller. But if you use the condition in join clause of left outer joins ,no rows deletes and result set is bigger in comparison to using condition in the where clause. for more clarification,follow the example.

create table A

( ano NUMBER,

aname VARCHAR2(10),

rdate DATE )

----A data

insert into A select 1,'Amand',to_date('20130101','yyyymmdd') from dual; commit;

insert into A select 2,'Alex',to_date('20130101','yyyymmdd') from dual; commit;

insert into A select 3,'Angel',to_date('20130201','yyyymmdd') from dual;

commit;

create table B

( bno NUMBER,

bname VARCHAR2(10),

rdate DATE )

insert into B select 3,'BOB',to_date('20130201','yyyymmdd') from dual; commit;

insert into B select 2,'Br',to_date('20130101','yyyymmdd') from dual; commit;

insert into B select 1,'Bn',to_date('20130101','yyyymmdd') from dual; commit;

first of all we have normal query which joins 2 tables with each other:

select * from a inner join b on a.ano=b.bno

the result set has 3 records. now please run below queries:

select * from a inner join b on a.ano=b.bno and a.rdate=to_date('20130101','yyyymmdd')

    select * from a inner join b on a.ano=b.bno where a.rdate=to_date('20130101','yyyymmdd')

as you see above results row counts have no differences,and According to my experience there is no noticeable performance differences for data in large volume.

please run below queries:

select * from a left outer  join b on a.ano=b.bno and a.rdate=to_date('20130101','yyyymmdd')

in this case,the count of output records will be equal to table A records.

select * from a left outer  join b on a.ano=b.bno where a.rdate=to_date('20130101','yyyymmdd')

in this case , records of A which didn't met the condition deleted from the result set and as I said the result set will have less records(in this case 2 records).

According to above examples we can have following conclusions:

1-in case of using inner joins, there is no special differences between putting condition in where clause or join clause ,but please try to put tables in from clause in order to have minimum intermediate result row counts: (http://www.dba-oracle.com/art_dbazine_oracle10g_dynamic_sampling_hint.htm)

2-In case of using outer joins,whenever you don't care of exact result row counts (don't care of missing records of table A which have no paired records in table B and fields of table B will be null for these records in the result set),put the condition in the where clause to delete a set of rows which aren't met the condition and obviously improve query time by decreasing the result row counts.

but in special cases you HAVE TO put the condition in the join part.for example if you want that your result row count will be equal to table 'A' row counts(this case is common in ETL processes) you HAVE TO put the condition in the join clause.

3-avoiding subquery is recommended by lots of reliable resources and expert programmers.It usually increase the query time and you can use subquery just when its result data set is small.

I hope this will be useful:)

like image 164
tooba jalali Avatar answered Oct 25 '25 20:10

tooba jalali


You need to look at the execution plans for the queries to judge which is the most computationally efficient. As pointed out in the comments you may find they are equivalent. Here is some information on Oracle execution plans. Depending on what editor / IDE you use the may be a shortcut for this e.g. F5 in PL/SQL Developer.

like image 40
ChrisProsser Avatar answered Oct 25 '25 19:10

ChrisProsser



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!