Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL: Detect existence of at least 1 record in large joined table

Tags:

sql

join

mysql

I have two tables:

users (id, name)

user_activities (id, user_id, activity_id, created_at)

The user_activities table is very large with over 300 million rows.

I am trying to detect which users have done any activity between a given date range. In other words, rows on the user table, where a joined row exists on the user_activities table between a certain created_at range.

I can do this with an INNER JOIN, GROUP BY and WHERE clause but the query runs for a long time as I believe its hitting all user_activities rows between my date range.

I don't really care "how many" activities, just if they've had more than zero. So i am grouping to get a count (e.g. 210 activities) when actually I could stop after finding just 1.

Is there a more efficient way to do this rather than grouping all user_activity rows to count them?

For info, here's the current query, which works fine but take a long time:

SELECT u.id, u.name, COUNT(ua.id) AS activity_count
FROM users u
INNER JOIN user_activity ua ON u.id=ua.user_id
WHERE ua.created_at > '2017-01-01' AND ua.created_at < '2017-03-01'
GROUP BY u.id
HAVING activity_count > 0;

Thanks in advance!

like image 491
infaddict Avatar asked Jan 19 '26 17:01

infaddict


1 Answers

You can try this version:

SELECT u.id, u.name,
       (SELECT COUNT(*)
        FROM user_activity ua 
        WHERE u.id = ua.user_id AND
              ua.created_at > '2017-01-01' AND
              ua.created_at < '2017-03-01'
       ) as activity_count
FROM users u
HAVING activity_count > 0;

For performance you want an index on user_activity(user_id, created_at).

EDIT:

If you just want existence, then use the same index and this should be much faster:

SELECT u.id, u.name
FROM users u
WHERE EXISTS (SELECT 1
              FROM user_activity ua 
              WHERE u.id = ua.user_id AND
                    ua.created_at > '2017-01-01' AND
                    ua.created_at < '2017-03-01'
              );

Whereas your query does complex processing and then aggregation of a bunch of data, this should scan the users table, and just look up in the index whether an appropriate activity exists for the user.

like image 110
Gordon Linoff Avatar answered Jan 21 '26 06:01

Gordon Linoff