MySQL function: rank table by most similar attributes

Question

I have a table of products ids and keywords that looks like the following:

+------------+------------------+------+-----+---------+----------------+
| Field      | Type             | Null | Key | Default | Extra          |
+------------+------------------+------+-----+---------+----------------+
| id         | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| product_id | int(10) unsigned | YES  | MUL | NULL    |                |
| keyword    | varchar(255)     | YES  |     | NULL    |                |
+------------+------------------+------+-----+---------+----------------+

This table simply stores product ids, and keywords associated with those products. So for example, it might contain:

+----+------------+---------+
| id | product_id | name    |
+----+------------+---------+
|  1 |         1  | soft    |
|  2 |         1  | red     |
|  3 |         1  | leather |
|  4 |         2  | cloth   |
|  5 |         2  | red     |
|  6 |         2  | new     |
|  7 |         3  | soft    |
|  8 |         3  | red     |
|  9 |         4  | blue    |
+----+------------+---------+

In other words:

product 1 is soft, red, and leather.
product 2 is cloth, red and new.
Product 3 is red and soft,
product 4 is blue.

I need some way to take in a product ID, and get back a sorted list of product ids ranked by the number of common keywords

So for example, if I pass in product_id 1, I'd expect to get back:

+----+-------+------------+
| product_id | matches    |
+------------+------------+
|     3      | 2          | (product 3 has two common keywords with product 1)
|     2      | 1          | (product 2 has one common keyword with product 1)
|     4      | 0          | (product 4 has no common keywords with product 1)
+------------+------------+

Tim Biegeleisen · Accepted Answer

One option uses a self right outer join with conditional aggregation to count the number of matched names between, e.g. product ID 1, and all other product IDs:

SELECT t2.product_id,
       SUM(CASE WHEN t1.name IS NOT NULL THEN 1 ELSE 0 END) AS matches
FROM yourTable t1
RIGHT JOIN yourTable t2
    ON t1.name = t2.name AND
       t1.product_id = 1
WHERE t2.product_id <> 1
GROUP BY t2.product_id
ORDER BY t2.product_id

Follow the link below for a running demo:

MySQL function: rank table by most similar attributes

Tags:

mysql

stored-procedures

sql-order-by

group-by

aggregate-functions

alec

1 Answers

SQLFiddle

Tim Biegeleisen

Recent Activity

Donate For Us

MySQL function: rank table by most similar attributes

Tags:

mysql

stored-procedures

sql-order-by

group-by

aggregate-functions

alec

1 Answers

SQLFiddle

Tim Biegeleisen

Related questions

Recent Activity

Donate For Us