
Slope One implementation offers poor recommendations

I'm attempting to implement a Slope One algorithm via PHP for user-based item recommendation. To do this, I'm using the OpenSlopeOne library. The problem I'm having is that the recommendations generated aren't at all relevant to the user.

Currently I have two tables: user_ratings and slope_one. The user_ratings table is fairly straightforward. It contains the rating a given user assigned to a given item (user_id, item_id and user_item_rating). The slope_one table follows OpenSlopeOne's default schema: item_id1, item_id2, times and rating.

The slope_one table is populated using the following SQL procedure:

CREATE PROCEDURE `slope_one`()
begin                    
    DECLARE tmp_item_id int;
    DECLARE done int default 0;                    
    DECLARE mycursor CURSOR FOR select distinct item_id from user_ratings;
    DECLARE CONTINUE HANDLER FOR NOT FOUND set done=1;
    open mycursor;
    while (!done) do
        fetch mycursor into tmp_item_id;
        if (!done) then
            insert into slope_one (
                select a.item_id as item_id1,
                       b.item_id as item_id2,
                       count(*) as times,
                       sum(a.rating - b.rating) as rating
                from user_ratings a, user_ratings b
                where a.item_id = tmp_item_id
                  and b.item_id != a.item_id
                  and a.user_id = b.user_id
                group by a.item_id, b.item_id
            );
        end if;
    END while;
    close mycursor;
end

And to fetch the most relevant recommendations for a given user, I perform the following query:

SELECT
    item.* 
FROM
    slope_one s,
    user_ratings u,
    item
WHERE 
    u.user_id = '{USER_ID}' AND 
    s.item_id1 = u.item_id AND 
    s.item_id2 != u.item_id AND
    item.id = s.item_id2
GROUP BY 
    s.item_id2 
ORDER BY
    SUM(u.rating * s.times - s.rating) / SUM(s.times) DESC
LIMIT 20

As previously stated, this just doesn't seem to be working. I'm working with a fairly large data set (10,000+ recommendations) but I'm just not seeing any form of correlation. In fact, the majority of recommendations seem to be identical for users, even with totally disparate item ratings.

ndg asked Dec 11 '25 03:12



1 Answer

You could try the Java implementation in Apache Mahout. There is an excerpt from Mahout in Action which covers its usage. That might be useful as a second data point and help differentiate algorithm versus implementation issues.

As of Mahout 0.9 the recommenders are discontinued. See https://mahout.apache.org/
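
If a Java library is more than you need for a cross-check, a small in-memory version of weighted Slope One can serve as that second data point. The sketch below is only an illustration, not OpenSlopeOne's or Mahout's code: the function names build_deviations and recommend and the toy ratings are made up for the example. It stores the same two values per item pair as the slope_one table (times and the summed rating difference) and ranks with the same expression as the ORDER BY in the question. One deliberate assumption is that items the user has already rated are skipped as candidates, whereas the query in the question only filters on s.item_id2 != u.item_id.

```python
from collections import defaultdict


def build_deviations(ratings):
    """Build the pairwise table from {user_id: {item_id: rating}}.

    Returns {(item1, item2): [times, diff_sum]}, mirroring the slope_one
    columns (item_id1, item_id2, times, rating) from the question.
    """
    dev = defaultdict(lambda: [0, 0.0])
    for user_items in ratings.values():
        for i, r_i in user_items.items():
            for j, r_j in user_items.items():
                if i != j:
                    cell = dev[(i, j)]
                    cell[0] += 1          # users who rated both i and j
                    cell[1] += r_i - r_j  # summed rating difference (i minus j)
    return dev


def recommend(ratings, dev, user_id, limit=20):
    """Rank unrated items by SUM(rating * times - diff_sum) / SUM(times)."""
    rated = ratings[user_id]
    num = defaultdict(float)
    den = defaultdict(int)
    for (i, j), (times, diff_sum) in dev.items():
        # Assumption: only predict for items this user has not rated yet.
        if i in rated and j not in rated:
            num[j] += rated[i] * times - diff_sum
            den[j] += times
    scores = {j: num[j] / den[j] for j in num}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:limit]


# Toy data, invented for illustration only.
toy_ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
toy_dev = build_deviations(toy_ratings)
print(recommend(toy_ratings, toy_dev, "lucy"))
```

With this toy data the only candidate for "lucy" is item A, with a predicted rating of 13/3 (about 4.33). Loading a sample of user_ratings into the same structure and comparing the ranking with what the SQL returns for the same user should show whether the two disagree, which would point at the implementation rather than the algorithm.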

Sean Owen answered Dec 12 '25 17:12



