I am trying to figure out how to query a table (the table is actually a result set, so it will be a subquery), group it by ColA=ColB (see below), and create a calculated field all in one step.
So, if my test data looks like
ColA ColB ColC
1 1 aaa 1 2 bbbb 1 3 cccc 2 2 dddd 3 3 eeee 3 4 ffff 3 5 gggg 3 6 hhhh 4 4 iiii 5 5 jjjj 6 6 kkkk 6 7 llll 6 8 mmmm
I would like to retrieve only rows where ColA=ColB and also add a new column that tells me whether the original data in ColA was repeated. See below.
ColA ColB ColC multiples
1 1 aaaa yes 2 2 dddd no 3 3 eeee yes 4 4 iiii no 5 5 jjjj no 6 6 kkkk yes
Can someone help me out with the syntax? I have been playing with Group By's and SubSelects to no avail. Do I need to use a case statement to for the multiples field?
It's more helpful to post create table and insert statements instead of Desc table and select * from table_name; http://tkyte.blogspot.com/2005/06/how-to-ask-questions.html
create table test_repeat(
cola number,
colb number,
colc varchar2(20)
);
insert into test_repeat values (1,1,'aaa');
insert into test_repeat values (1,2,'bbbb');
insert into test_repeat values (1,3,'cccc');
insert into test_repeat values (2,2,'dddd');
insert into test_repeat values (3,3,'eeee');
insert into test_repeat values (3,4,'ffff');
insert into test_repeat values (3,5,'gggg');
insert into test_repeat values (3,6,'hhhh');
insert into test_repeat values (4,4,'iiii');
insert into test_repeat values (5,5,'jjjj');
insert into test_repeat values (6,6,'kkkk');
insert into test_repeat values (6,7,'llll');
insert into test_repeat values (6,8,'mmmm');
commit;
You can use the Oracle analytic function Lead to look through your result set to see if colA is the same for the next row (after ordering it..) like..
select * from (select colA, colb, (case when colA = (lead(cola) over (partition by colA order by cola, colb)) then 'Yes' else 'No' end) multiples, colc from test_repeat) where colA = colb /
COLA COLB MUL COLC
1 1 Yes aaa
2 2 No dddd
3 3 Yes eeee
4 4 No iiii
5 5 No jjjj
6 6 Yes kkkk
Or you can get the count for each value of COLA and compare it to see if there are duplicates...
select a.colA, a.colb, a.colc, (case when (select count(*) from test_repeat t where t.cola = a.colA) > 1 then 'Yes' else 'No' end) Repeat from test_repeat a where colA = colB /
COLA COLB COLC REP
1 1 aaa Yes
2 2 dddd No
3 3 eeee Yes
4 4 iiii No
5 5 jjjj No
6 6 kkkk Yes
They are both equally simple, but I would suggest the analytic function approach as I have found it to be generally faster for all Queries I have worked with in the past.
SQL> select *
2 from test_repeat
3 order by cola
4 /
COLA COLB COLC
---------- ---------- --------------------
1 2 bbbb
1 1 aaa
1 3 cccc
2 2 dddd
3 4 ffff
3 3 eeee
3 5 gggg
3 6 hhhh
4 4 iiii
5 5 jjjj
6 6 kkkk
6 7 llll
6 8 mmmm
7 9 nnnn
14 rows selected.
SQL> select cola "ColA"
2 , max(decode(colb,cola,colb)) "ColB"
3 , max(decode(colb,cola,colc)) "ColC"
4 , case count(*) when 1 then 'no' else 'yes' end "multiples"
5 from test_repeat
6 group by cola
7 having cola = max(decode(colb,cola,colb))
8 order by cola
9 /
ColA ColB ColC mul
---------- ---------- -------------------- ---
1 1 aaa yes
2 2 dddd no
3 3 eeee yes
4 4 iiii no
5 5 jjjj no
6 6 kkkk yes
6 rows selected.
Performancewise, this SQL is approximately equal to Rajesh' first query. So you can choose the one you are more comfortable with.
SQL> set autotrace on
SQL> select * from
2 (select colA, colb,
3 (case when colA = (lead(cola) over
4 (partition by colA order by cola, colb))
5 then 'Yes'
6 else 'No'
7 end) multiples,
8 colc
9 from test_repeat)
10 where colA = colb
11 /
COLA COLB MUL COLC
---------- ---------- --- --------------------
1 1 Yes aaa
2 2 No dddd
3 3 Yes eeee
4 4 No iiii
5 5 No jjjj
6 6 Yes kkkk
6 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1491815685
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 574 | 4 (25)| 00:00:01 |
|* 1 | VIEW | | 14 | 574 | 4 (25)| 00:00:01 |
| 2 | WINDOW SORT | | 14 | 532 | 4 (25)| 00:00:01 |
| 3 | TABLE ACCESS FULL| TEST_REPEAT | 14 | 532 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COLA"="COLB")
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
4 recursive calls
0 db block gets
15 consistent gets
0 physical reads
0 redo size
421 bytes sent via SQL*Net to client
238 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed
SQL> /
COLA COLB MUL COLC
---------- ---------- --- --------------------
1 1 Yes aaa
2 2 No dddd
3 3 Yes eeee
4 4 No iiii
5 5 No jjjj
6 6 Yes kkkk
6 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1491815685
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 574 | 4 (25)| 00:00:01 |
|* 1 | VIEW | | 14 | 574 | 4 (25)| 00:00:01 |
| 2 | WINDOW SORT | | 14 | 532 | 4 (25)| 00:00:01 |
| 3 | TABLE ACCESS FULL| TEST_REPEAT | 14 | 532 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COLA"="COLB")
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
421 bytes sent via SQL*Net to client
238 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed
SQL> select cola "ColA"
2 , max(decode(colb,cola,colb)) "ColB"
3 , max(decode(colb,cola,colc)) "ColC"
4 , case count(*) when 1 then 'no' else 'yes' end "multiples"
5 from test_repeat
6 group by cola
7 having cola = max(decode(colb,cola,colb))
8 order by cola
9 /
ColA ColB ColC mul
---------- ---------- -------------------- ---
1 1 aaa yes
2 2 dddd no
3 3 eeee yes
4 4 iiii no
5 5 jjjj no
6 6 kkkk yes
6 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3021378319
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 532 | 4 (25)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | SORT GROUP BY | | 14 | 532 | 4 (25)| 00:00:01 |
| 3 | TABLE ACCESS FULL| TEST_REPEAT | 14 | 532 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COLA"=MAX(DECODE("COLB","COLA","COLB")))
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
421 bytes sent via SQL*Net to client
238 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed
SQL> /
ColA ColB ColC mul
---------- ---------- -------------------- ---
1 1 aaa yes
2 2 dddd no
3 3 eeee yes
4 4 iiii no
5 5 jjjj no
6 6 kkkk yes
6 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3021378319
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 532 | 4 (25)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | SORT GROUP BY | | 14 | 532 | 4 (25)| 00:00:01 |
| 3 | TABLE ACCESS FULL| TEST_REPEAT | 14 | 532 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COLA"=MAX(DECODE("COLB","COLA","COLB")))
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
420 bytes sent via SQL*Net to client
238 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed
SQL> set autotrace off
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With