How to obtain count of record differences in the same table, where there are distinct and nearly-distinct records

Question

I've a table TABLEA with data as below

field1 field2 field3.......field16
123    10-JAN-12 0.8.......ABC
123    10-JAN-12 0.8.......ABC
.
.
.
123    10-JAN-12 0.7.......ABC
245    11-JAN-12 0.3.......CDE
245    11-JAN-12 0.3.......CDE
245    11-JAN-12 0.3.......XYZ
...
<unique rows>

When I do a

select field1, field2, ...field16 
  from TABLEA

I obtain M records,and when I do a

select distinct field1, field2...field16 
  from TABLEA

I obtain M-x records, where M is in the Millions and x is a much smaller #.

I am trying to write SQL to get the x records (eventually, just get the count). I've tried all Set operator keywords like

select field1...field16 
 from TABLEA 
 EXCEPT 
 select distinct field1..field16 
   from TABLEA

Or using UNION ALL instead of EXCEPT. But none of them return x, instead they all return 0 rows.

wolφi · Accepted Answer

You can select the rows that are not distinct by

 SELECT field1, ... , field16
   FROM tablea
  GROUP BY field1, ... , field16
 HAVING count(*) > 1

Edit: Another approach would be to use an analytical function ROW_NUMBER(), partitioning by all your field columns. The first (i.e. distinct) row for a given set of fields has ROW_NUMBER = 1, the second = 2, the third = 3 etc. So you can select the x-rows with WHERE ROW_NUMBER > 1.

CREATE TABLE tablea (
    field1 NUMBER, field2 DATE,  field3 NUMBER, field16 VARCHAR2(10)
);

INSERT INTO tablea VALUES (123, DATE '2012-01-10', 0.8, 'ABC');
INSERT INTO tablea VALUES (123, DATE '2012-01-10', 0.8, 'ABC');
INSERT INTO tablea VALUES (123, DATE '2012-01-10', 0.7, 'ABC');
INSERT INTO tablea VALUES (245, DATE '2012-01-11', 0.3, 'CDE');
INSERT INTO tablea VALUES (245, DATE '2012-01-11', 0.3, 'CDE');
INSERT INTO tablea VALUES (245, DATE '2012-01-11', 0.3, 'XYZ');

To select the duplicate rows x:

SELECT *
  FROM (
        SELECT field1, field2, field3, field16,
               ROWID AS rid,
               ROW_NUMBER() OVER (PARTITION BY 
               field1, field2, field3, field16 ORDER BY ROWID) as rn
          FROM tablea
        )
  WHERE rn > 1;

 123 10.01.2012 0.8 ABC AAAJ6mAAEAAAAExAAB 2
 245 11.01.2012 0.3 CDE AAAJ6mAAEAAAAExAAE 2

Skylark · Answer

you will get what you want with your own 'Except' query that you have posted above. But you must include the 'ALL' keyword in your except as 'Except Distinct' is the default. So I have just added the ALL keyword below in your query itself:

select field1...field16 from TABLEA EXCEPT ALL select distinct field1..field16 from TABLEA

If you want a count of the records of M-x then make the above query a subquery in the FROM clause of another query and have count in that outer query and you would get the count as shown below:

Select count(*) From ( select field1...field16 from TABLEA EXCEPT ALL select distinct field1..field16 from TABLEA
) B

Guess this is what you are looking for.

Good luck

How to obtain count of record differences in the same table, where there are distinct and nearly-distinct records

Tags:

oracle-database

sql

distinct

netezza

user1958640

2 Answers

wolφi

Skylark

Recent Activity

Donate For Us

How to obtain count of record differences in the same table, where there are distinct and nearly-distinct records

Tags:

oracle-database

sql

distinct

netezza

user1958640

2 Answers

wolφi

Skylark

Related questions

Recent Activity

Donate For Us