Rails: Group By join table

Question

I am trying to group by a delegate join table. I have a tasks table and each task is has a project_id. The following code works well in my controller for me to group by project:

@task = Task.joins(:project).joins(:urgency).where(urgencies: {urgency_value: 7}).group_by(&:project_id)

This returns a hash where the key is what I have joined by and then the index contains each tasks within that group. I can then loop through each task to retrieve its attributes.

However, each project belongs to a workspace (via a workspace_id). What I want is to have the same query but to group by the workspace. The final aim is for me to create a table which shows the workspace name in one column and the number of tasks for that workspace in the second column.

I have tried many combinations and searched many forums but after several hours still haven't been able to crack it.

aridlehoover · Accepted Answer

If your only goal is to get the task counts per workspace, I think you want a different query.

@workspaces_with_task_counts = 
  Workspace
    .joins(:projects)
    .joins(:tasks)
    .select('workspaces.name, count(tasks.id) as task_count')
    .group(:workspace_id)

Then you can access the count like this:

@workspaces_with_task_counts.each do |workspace|
  puts "#{workspace.name}: #{workspace.task_count}"
end

EDIT 1

I think this is what you want:

Workspace
  .joins(projects: { tasks: :urgencies })
  .where(urgencies: {urgency_value: 7})
  .group(:name)
  .count

which results in a hash containing all of the workspaces with at least one task where the urgency_value is 7, by name, with the number of tasks in that workspace:

{"workspace1"=>4, "workspace2"=>1}

EDIT 2

SQL is not capable of returning both detail and summary information in a single query. But, we can get all the data, then summarize it in memory with Ruby's group_by method:

Task
  .joins(project: :workspace)
  .includes(project: :workspace)
  .group_by { |task| task.project.workspace.name }

This produces the following data structure:

{
  "workspace1": [task, task, task],
  "workspace2": [task, task],
  "workspace3": [task, task, task, task]
}

But, it does so at a cost. Grouping in memory is an expensive process. Running that query 10,000 times took ~15 seconds.

It turns out that executing two SQL queries is actually two orders of magnitude faster at ~0.2 seconds. Here are the queries:

tasks = Task.joins(project: :workspace).includes(project: :workspace)
counts = tasks.group('workspaces.name').count

The first query gets you all the tasks and preloads their associated project and workspace data. The second query uses ActiveRecord's group clause to construct the SQL statement to summarize the data. It returns this data structure:

{ "workspace1": 3, "workspace2": 2, "workspace3": 4 }

Databases are super efficient at set manipulation. It's almost always significantly faster to do that work in the database than in Ruby.

Rails: Group By join table

Tags:

join

group-by

ruby-on-rails

DAR_ad

1 Answers

aridlehoover

Recent Activity

Donate For Us

Rails: Group By join table

Tags:

join

group-by

ruby-on-rails

DAR_ad

1 Answers

aridlehoover

Related questions

Recent Activity

Donate For Us