I have a database like this:
users
id  name   email              phone
1   bill   [email protected]
2   bill   [email protected]  123456789
3   susan  [email protected]
4   john   [email protected]  123456789
5   john   [email protected]  987654321
I want to merge records considered duplicates based on the email field.
I'm trying to figure out how to apply the following consideration: when several rows share an email, keep the one with the highest id number (see the [email protected] row for an example).
Here is a query I tried:
DELETE FROM users WHERE users.id NOT IN
(SELECT grouped.id FROM (SELECT DISTINCT ON (email) * FROM users) AS grouped)
It gives me a syntax error.
I'm trying to transform the table into this, but I can't figure out the correct query:
users
id  name   email              phone
2   bill   [email protected]  123456789
3   susan  [email protected]
5   john   [email protected]  987654321
Here is one option using a DELETE with a NOT IN subquery:
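DELETE
FROM users
WHERE id NOT IN (
    SELECT id
    FROM (
        SELECT CASE WHEN COUNT(*) = 1
                    THEN MAX(id)
                    -- fall back to MAX(id) if every duplicate has a NULL phone,
                    -- so the NOT IN list never contains NULL
                    ELSE COALESCE(MAX(CASE WHEN phone IS NOT NULL THEN id END), MAX(id))
               END AS id
        FROM users
        GROUP BY email
    ) t
);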
The logic of this delete is as follows:
Emails that have only one record are not deleted.
For emails with two or more records, everything is deleted except the record with the highest id value where the phone is also defined.
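To check this logic before running it against real data, it can help to try the delete on a throwaway copy of the sample table. The sketch below does that with Python's built-in sqlite3 module. SQLite is an assumption here, since the question does not say which database engine is in use, and the example.com addresses are placeholders for the emails that are not visible in the question.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, phone TEXT)"
)

# Sample rows from the question; the email addresses are placeholders.
rows = [
    (1, "bill", "bill@example.com", None),
    (2, "bill", "bill@example.com", "123456789"),
    (3, "susan", "susan@example.com", None),
    (4, "john", "john@example.com", "123456789"),
    (5, "john", "john@example.com", "987654321"),
]
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", rows)

# For each email, pick the id to keep, then delete every other row.
conn.execute("""
    DELETE FROM users
    WHERE id NOT IN (
        SELECT id FROM (
            SELECT CASE WHEN COUNT(*) = 1 THEN MAX(id)
                        ELSE MAX(CASE WHEN phone IS NOT NULL THEN id END)
                   END AS id
            FROM users
            GROUP BY email
        ) t
    )
""")
conn.commit()

# List the surviving rows.
remaining = conn.execute(
    "SELECT id, name, phone FROM users ORDER BY id"
).fetchall()
for row in remaining:
    print(row)
```

On this sample data the surviving ids are 2, 3 and 5, which matches the desired result in the question. Running the inner SELECT on its own first is a cheap way to preview which ids would be kept before deleting anything.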