I have a table of products ids and keywords that looks like the following:
+------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| product_id | int(10) unsigned | YES | MUL | NULL | |
| keyword | varchar(255) | YES | | NULL | |
+------------+------------------+------+-----+---------+----------------+
This table simply stores product ids, and keywords associated with those products. So for example, it might contain:
+----+------------+---------+
| id | product_id | name |
+----+------------+---------+
| 1 | 1 | soft |
| 2 | 1 | red |
| 3 | 1 | leather |
| 4 | 2 | cloth |
| 5 | 2 | red |
| 6 | 2 | new |
| 7 | 3 | soft |
| 8 | 3 | red |
| 9 | 4 | blue |
+----+------------+---------+
In other words:
1 is soft, red, and leather.2 is cloth, red and new.3 is red and soft,4 is blue.I need some way to take in a product ID, and get back a sorted list of product ids ranked by the number of common keywords
So for example, if I pass in product_id 1, I'd expect to get back:
+----+-------+------------+
| product_id | matches |
+------------+------------+
| 3 | 2 | (product 3 has two common keywords with product 1)
| 2 | 1 | (product 2 has one common keyword with product 1)
| 4 | 0 | (product 4 has no common keywords with product 1)
+------------+------------+
One option uses a self right outer join with conditional aggregation to count the number of matched names between, e.g. product ID 1, and all other product IDs:
SELECT t2.product_id,
SUM(CASE WHEN t1.name IS NOT NULL THEN 1 ELSE 0 END) AS matches
FROM yourTable t1
RIGHT JOIN yourTable t2
ON t1.name = t2.name AND
t1.product_id = 1
WHERE t2.product_id <> 1
GROUP BY t2.product_id
ORDER BY t2.product_id
Follow the link below for a running demo:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With