It looks like we might have some duplicate values in our preference table so I want to delete all the duplicate entries for each user but the more recent one. I've spent quite a while trying to come up with a solution and I just keep running around in circles. I've tried max, grouped by, sub-queries, etc. It's pretty easy to delete the highest, but not everything but the highest. And especially for each user
So in the database I might have:
id  :   name:       value   :   userId
1   :   sortOrder:  Asc     :   1
2   :   sortOrder:  Desc    :   1
3   :   sortOrder:  Asc     :   2
4   :   something:  value2  :   1
So in this case, I want to remove the first row since it's duplicate for User 1 and has the highest id and keep all the rest of the columns.
So I know it's at least a sub-query with group by, but I just can't figure it out. So far I have:
SELECT 
    MAX(id),
    name
FROM 
    preference
GROUP BY
    name
which gives me a list of those I want to keep, BUT it still misses for each user. So I somewhat want the opposite if I had it for each user. However I don't believe I can use <> in a sub-query anyways.
What I want to ultimately do is say delete all those that have a lower ID for the same User that are duplicate entries.
One way, assuming Id column is a unique Id:
DELETE FROM preference
where Id NOT IN (SELECT MAX(id) FROM preference GROUP BY name, userId)
[Note: The usual advice applies. don't run in production without testing first]
Test First with:
SELECT * FROM preference
where Id NOT IN (SELECT MAX(id) FROM preference GROUP BY name, userId)
SQLFiddle is here. Works as intended. I suggest testing with a bigger data set though...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With