This is a problem I run into frequently. There have been some similar questions about it, but none of the answers was very complete (and they may be outdated, since Rails 4 might have introduced new methods that help with this).
Let me give a simple example of the problem and the known ways to 'solve' it:
Say I have a User model and a Post model, and a User has_many :posts.
Now, I want to get the top five users with the most posts.
The following are options I know, but they all have their own drawbacks:
1)
users = User.all
# Compare b to a so that the users with the most posts come first
@top_users = users.sort { |a, b| b.posts.count <=> a.posts.count }.take(5)
Drawbacks: At least one database query is made for each user, making this solution very slow.
2) Use SQL code directly with a Join (See for instance this question and answer)
User.select('users.*, COUNT(posts.id) AS posts_count').joins(:posts).group('users.id').order('posts_count DESC').take(5)
This runs all sorting logic in the database. However, joins(:posts) produces an INNER JOIN, so users without posts are left out of the result.
3) Use SQL directly with an Outer Join (see for instance this question and answers)
User.select("users.*, COUNT(posts.id) as posts_count").joins("LEFT OUTER JOIN posts ON posts.user_id = users.id").group("users.id").order("posts_count DESC")
This also returns users without posts. Drawback: the join is written as a raw SQL fragment rather than with ActiveRecord methods.
4) Use a Counter Cache Column (For a full explanation of this technique, see this Railscasts episode)
Basically, create a new column on the User that keeps track of the current count of posts for that user by changing the value in the field each time a post is created or deleted.
This is very fast and readable. The drawback is that we can only use this after we have defined a new field on the User. For many situations this is acceptable, but it is less flexible: the users table needs a new column for every association we might want to create a top five for. Also, as this is a cached field, some database manipulations (for example, raw SQL or methods that skip callbacks) will not trigger an update on the field.
Is there a nicer (readable and efficient) way to accomplish this? Preferably something that uses built-in ActiveRecord methods.
Another method, with some limitations that might make it more of a partial solution:
User.where(:id => Post.group(:user_id).
                       order("count(*) desc").
                       limit(5).
                       count.
                       keys)
This would be extremely efficient in database terms at finding the five users with the highest numbers of posts, as it only needs to scan an index on the posts table's user_id column (provided that column is indexed), so it would be good for very large data sets. It's also pretty "clean" Rails/ActiveRecord code that ought to be practically database independent.
If returning the Users in their post-count order is critical, then a less efficient sorting method could be used once those five have been identified, or the retrieval order of the keys could be used in Ruby to sort the returned Users.
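The second idea, reusing the retrieval order of the keys, can be sketched in plain Ruby. This is only an illustration under assumptions: the User struct is a hypothetical stand-in for the ActiveRecord model, top_counts mimics the { user_id => count } hash that a grouped count returns, and fetched stands for the records coming back in arbitrary order.

```ruby
# Hypothetical stand-in for the ActiveRecord User model; only id and name matter here.
User = Struct.new(:id, :name)

# Assumed shape of the grouped count: { user_id => post_count },
# already ordered from most to fewest posts (sample data, not real results).
top_counts = { 7 => 42, 3 => 17, 9 => 5 }

# Records as a WHERE id IN (...) lookup might return them: no guaranteed order.
fetched = [User.new(3, "carol"), User.new(9, "dave"), User.new(7, "alice")]

# Index the records by id, then walk the ids in their ranked order.
by_id = fetched.each_with_object({}) { |user, index| index[user.id] = user }
ranked = top_counts.keys.map { |id| by_id[id] }.compact

ranked.each { |user| puts "#{user.name}: #{top_counts[user.id]} posts" }
```

With only five records the re-sort costs next to nothing. Inside a Rails app, ActiveSupport's index_by(&:id) could replace the each_with_object line.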