Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to obtain count of record differences in the same table, where there are distinct and nearly-distinct records

I've a table TABLEA with data as below

field1 field2 field3.......field16
123    10-JAN-12 0.8.......ABC
123    10-JAN-12 0.8.......ABC
.
.
.
123    10-JAN-12 0.7.......ABC
245    11-JAN-12 0.3.......CDE
245    11-JAN-12 0.3.......CDE
245    11-JAN-12 0.3.......XYZ
...
<unique rows>

When I do a

select field1, field2, ...field16 
  from TABLEA

I obtain M records,and when I do a

select distinct field1, field2...field16 
  from TABLEA

I obtain M-x records, where M is in the Millions and x is a much smaller #.

I am trying to write SQL to get the x records (eventually, just get the count). I've tried all Set operator keywords like

select field1...field16 
 from TABLEA 
 EXCEPT 
 select distinct field1..field16 
   from TABLEA  

Or using UNION ALL instead of EXCEPT. But none of them return x, instead they all return 0 rows.

like image 321
user1958640 Avatar asked Dec 04 '25 14:12

user1958640


2 Answers

You can select the rows that are not distinct by

 SELECT field1, ... , field16
   FROM tablea
  GROUP BY field1, ... , field16
 HAVING count(*) > 1

Edit: Another approach would be to use an analytical function ROW_NUMBER(), partitioning by all your field columns. The first (i.e. distinct) row for a given set of fields has ROW_NUMBER = 1, the second = 2, the third = 3 etc. So you can select the x-rows with WHERE ROW_NUMBER > 1.

CREATE TABLE tablea (
    field1 NUMBER, field2 DATE,  field3 NUMBER, field16 VARCHAR2(10)
);

INSERT INTO tablea VALUES (123, DATE '2012-01-10', 0.8, 'ABC');
INSERT INTO tablea VALUES (123, DATE '2012-01-10', 0.8, 'ABC');
INSERT INTO tablea VALUES (123, DATE '2012-01-10', 0.7, 'ABC');
INSERT INTO tablea VALUES (245, DATE '2012-01-11', 0.3, 'CDE');
INSERT INTO tablea VALUES (245, DATE '2012-01-11', 0.3, 'CDE');
INSERT INTO tablea VALUES (245, DATE '2012-01-11', 0.3, 'XYZ');

To select the duplicate rows x:

SELECT *
  FROM (
        SELECT field1, field2, field3, field16,
               ROWID AS rid,
               ROW_NUMBER() OVER (PARTITION BY 
               field1, field2, field3, field16 ORDER BY ROWID) as rn
          FROM tablea
        )
  WHERE rn > 1;

 123 10.01.2012 0.8 ABC AAAJ6mAAEAAAAExAAB 2
 245 11.01.2012 0.3 CDE AAAJ6mAAEAAAAExAAE 2
like image 110
wolφi Avatar answered Dec 07 '25 03:12

wolφi


you will get what you want with your own 'Except' query that you have posted above. But you must include the 'ALL' keyword in your except as 'Except Distinct' is the default. So I have just added the ALL keyword below in your query itself:

select field1...field16 from TABLEA EXCEPT ALL select distinct field1..field16 from TABLEA

If you want a count of the records of M-x then make the above query a subquery in the FROM clause of another query and have count in that outer query and you would get the count as shown below:

Select count(*) From ( select field1...field16 from TABLEA EXCEPT ALL select distinct field1..field16 from TABLEA
) B

Guess this is what you are looking for.

Good luck

like image 31
Skylark Avatar answered Dec 07 '25 04:12

Skylark



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!