
How to filter association_ids for an ActiveRecord model?

In a domain like this:

 class User
  has_many :posts
  has_many :topics, :through => :posts
 end
 class Post
   belongs_to :user
   belongs_to :topic
 end
 class Topic
   has_many :posts
 end

I can read all the Topic ids through user.topic_ids, but I can't see a way to apply filtering conditions to this method, since it returns an Array instead of an ActiveRecord::Relation.

The problem is, given a User and an existing set of Topics, marking the ones for which there is a post by the user. I am currently doing something like this:

 def mark_topics_with_post(user, topics)
   # only returns the ids of the topics for which this user has a post
   topic_ids = user.topic_ids 
   topics.each { |t| t[:has_post] = topic_ids.include?(t.id) }
 end 

But this loads all the topic ids regardless of the input set. Ideally, I'd like to do something like

 def mark_topics_with_post(user, topics)
   # only returns the topics where user has a post within the subset of interest
   topic_ids = user.topic_ids.where(:id=>topics.map(&:id))
   topics.each { |t| t[:has_post] = topic_ids.include?(t.id) }
 end 

But the only thing I can do concretely is

 def mark_topics_with_post(user, topics)
   # needlessly create Post objects only to unwrap them later
   topic_ids = user.posts.where(:topic_id=>topics.map(&:id)).select(:topic_id).map(&:topic_id)
   topics.each { |t| t[:has_post] = topic_ids.include?(t.id) }
 end 

Is there a better way? Is it possible to have something like select_values on an association or scope? FWIW, I'm on Rails 3.0.x, but I'd be curious about 3.1 too.

Why am I doing this?

Basically, I have a result page for a semi-complex search (which happens based on the Topic data only), and I want to mark the results (Topics) as stuff on which the user has interacted (wrote a Post).

So yes, another option would be a join of [Topic, Post], so that the search results come out already marked or not. But that would destroy my ability to cache the Topic query (the query, even without the join, is more expensive than fetching only the ids for the user).

Note that the approaches outlined above do work; they just feel suboptimal.

riffraff asked Jan 22 '26 01:01



2 Answers

I think that your second solution is almost the optimal one (from the point of view of the queries involved), at least with respect to the one you'd like to use.

user.topic_ids generates the query:

SELECT `topics`.id FROM `topics` 
INNER JOIN `posts` ON `topics`.`id` = `posts`.`topic_id` 
WHERE `posts`.`user_id` = 1

If user.topic_ids.where(:id=>topics.map(&:id)) were possible, it would generate this:

SELECT topics.id FROM `topics` 
INNER JOIN `posts` ON `topics`.`id` = `posts`.`topic_id` 
WHERE `posts`.`user_id` = 1 AND `topics`.`id` IN (...)

This is exactly the same query that user.topics.select("topics.id").where(:id=>topics.map(&:id)) generates.

By contrast, user.posts.select(:topic_id).where(:topic_id=>topics.map(&:id)) generates the following query:

SELECT topic_id FROM `posts` 
WHERE `posts`.`user_id` = 1 AND `posts`.`topic_id` IN (...)

Which of the two is more efficient depends on the data in the actual tables, the indices defined, and the database used.

If the topic ids list for the user is long and has topics repeated many times, it may make sense to group by topic id at the query level:

user.posts.select(:topic_id).group(:topic_id).where(:topic_id=>topics.map(&:id))
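On the question's request for something like select_values on a scope: a minimal sketch, assuming Rails 3.0, where a relation responds to to_sql and the connection adapter responds to select_values (which reads the first column of each row without building Post objects). The helper name raw_topic_ids is hypothetical, and the to_i conversion is an assumption that the adapter may hand the values back as strings.

```ruby
# Hypothetical helper: reads the first column of a relation's result set
# straight from the connection, without instantiating model objects.
# relation   - anything responding to #to_sql (e.g. an ActiveRecord::Relation)
# connection - anything responding to #select_values (e.g. Post.connection)
def raw_topic_ids(relation, connection)
  # Some adapters return column values as strings, so normalize to integers.
  connection.select_values(relation.to_sql).map(&:to_i)
end

# Intended usage inside the Rails app (not executed here):
#   relation  = user.posts.where(:topic_id => topics.map(&:id)).select(:topic_id)
#   topic_ids = raw_topic_ids(relation, Post.connection)
```

From Rails 3.2 on, calling pluck(:topic_id) on the same relation covers this case directly; it is not available in 3.0 or 3.1.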

LucaM answered Jan 24 '26 18:01



Assuming your Topic model has a column named id, you can do something like this:

Topic.select("topics.id").joins(:posts).where("posts.user_id = ?", user_id)

This runs only one query against your database and returns the ids of all the topics that have posts by the given user_id.
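Because the join yields one row per matching post, the same topic id can come back several times. A minimal sketch of the marking step from the question in plain Ruby, collapsing the ids into a Set so that duplicates disappear and each lookup is a hash lookup rather than an Array scan. The Topic struct is a hypothetical stand-in for the model so the sketch runs without Rails, and mark_topics is an illustrative name.

```ruby
require 'set'

# Hypothetical stand-in for the Topic model so the sketch runs without Rails.
Topic = Struct.new(:id, :has_post)

# Marks each topic with whether its id appears in topic_ids.
# topic_ids may contain duplicates; the Set collapses them.
def mark_topics(topics, topic_ids)
  seen = Set.new(topic_ids)
  topics.each { |t| t[:has_post] = seen.include?(t.id) }
end

topics = [Topic.new(1), Topic.new(2), Topic.new(3)]
mark_topics(topics, [3, 1, 3, 3])
topics.map(&:has_post) # => [true, false, true]
```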

cristian answered Jan 24 '26 16:01



