Imagine a table of contacts, where the same contact has multiple entries, but with differing data. How would one go about selecting this data for review? Unfortunately, a merge of sorts would be disagreeable as there may exist visually identifiable erroneous data that is not currently envisaged to be automatically processed.
FName LName Email Phone
Heywood Yapinchme 555-555-555
Heywood Yapinchme [email protected]
Seymour Butz [email protected]
Seymour Butz 555-555-556
Seymour Butz
Hughe Jass [email protected] 555-555-557
Amanda Hugginkiss [email protected]
I would like to see just the duplicates of the first two columns where more than one entry exists. i.e.
FName LName Email Phone
Heywood Yapinchme 555-555-555
Heywood Yapinchme [email protected]
Seymour Butz [email protected]
Seymour Butz 555-555-556
Seymour Butz
The next step of review is in the blue ether. Currently a little over a million rows, Bart has been busy. But beefy servers and this isn't regular operation but a one off to deal with data migration, so can be slightly gash.
I have tried a bit with SELECT DISTINCT and GROUP BY but it seems to just return on of each contact.
You can use aggregation to identify the duplicates:
SELECT FName, LName
FROM tablename
GROUP BY FName, LName
HAVING COUNT(*) > 1
and if you want all the rows of the duplicates:
SELECT *
FROM tablename
WHERE (FName, LName) IN (
SELECT FName, LName
FROM tablename
GROUP BY FName, LName
HAVING COUNT(*) > 1
)
If your MySql/MariaDB version supports window functions:
SELECT t.*
FROM (
SELECT *, COUNT(*) OVER (PARTITION BY FName, LName) counter
FROM tablename
) t
WHERE t.counter > 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With