Merge Duplicate Rows in MySQL

Question

I have a database like this:

users
id    name    email                phone
1     bill    [email protected]
2     bill    [email protected]   123456789
3     susan   [email protected]
4     john    [email protected]   123456789
5     john    [email protected]   987654321

I want to merge records considered duplicates based on the email field.

Trying to figure out how to use the following considerations.

Merge based on duplicate email
If one row has a null value use the row that has the most data.
If 2 rows are duplicates but other fields are different then use the one

with the highest id number (see the [email protected] row for an example.)

Here is a query I tried:

DELETE FROM users WHERE users.id NOT IN 
(SELECT grouped.id FROM (SELECT DISTINCT ON (email) * FROM users) AS grouped)

Getting a syntax error.

I'm trying to get the database to transform to this, I can't figure out the correct query:

users
id   name    email                 phone
2    bill    [email protected]    123456789
3    susan   [email protected]   
5    john    [email protected]    987654321

Tim Biegeleisen · Accepted Answer

Here is one option using a delete join:

DELETE
FROM users
WHERE id NOT IN (SELECT id
                 FROM (
                     SELECT CASE WHEN COUNT(*) = 1
                                 THEN MAX(id)
                                 ELSE MAX(CASE WHEN phone IS NOT NULL THEN id END) END AS id
                     FROM users
                     GROUP BY email) t);

The logic of this delete is as follows:

Emails where there is only one record are not deleted
For emails with two or more records, we delete everything except for the record having the highest id value, where the phone is also defined.

Merge Duplicate Rows in MySQL

Tags:

mysql

Jordash

1 Answers

Tim Biegeleisen

Recent Activity

Donate For Us

Merge Duplicate Rows in MySQL

Tags:

mysql

Jordash

1 Answers

Tim Biegeleisen

Related questions

Recent Activity

Donate For Us