Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Rails: Group By join table

I am trying to group by a delegate join table. I have a tasks table and each task is has a project_id. The following code works well in my controller for me to group by project:

@task = Task.joins(:project).joins(:urgency).where(urgencies: {urgency_value: 7}).group_by(&:project_id)

This returns a hash where the key is what I have joined by and then the index contains each tasks within that group. I can then loop through each task to retrieve its attributes.

However, each project belongs to a workspace (via a workspace_id). What I want is to have the same query but to group by the workspace. The final aim is for me to create a table which shows the workspace name in one column and the number of tasks for that workspace in the second column.

I have tried many combinations and searched many forums but after several hours still haven't been able to crack it.

like image 311
DAR_ad Avatar asked Oct 27 '25 07:10

DAR_ad


1 Answers

If your only goal is to get the task counts per workspace, I think you want a different query.

@workspaces_with_task_counts = 
  Workspace
    .joins(:projects)
    .joins(:tasks)
    .select('workspaces.name, count(tasks.id) as task_count')
    .group(:workspace_id)

Then you can access the count like this:

@workspaces_with_task_counts.each do |workspace|
  puts "#{workspace.name}: #{workspace.task_count}"
end

EDIT 1

I think this is what you want:

Workspace
  .joins(projects: { tasks: :urgencies })
  .where(urgencies: {urgency_value: 7})
  .group(:name)
  .count

which results in a hash containing all of the workspaces with at least one task where the urgency_value is 7, by name, with the number of tasks in that workspace:

{"workspace1"=>4, "workspace2"=>1}

EDIT 2

SQL is not capable of returning both detail and summary information in a single query. But, we can get all the data, then summarize it in memory with Ruby's group_by method:

Task
  .joins(project: :workspace)
  .includes(project: :workspace)
  .group_by { |task| task.project.workspace.name }

This produces the following data structure:

{
  "workspace1": [task, task, task],
  "workspace2": [task, task],
  "workspace3": [task, task, task, task]
}

But, it does so at a cost. Grouping in memory is an expensive process. Running that query 10,000 times took ~15 seconds.

It turns out that executing two SQL queries is actually two orders of magnitude faster at ~0.2 seconds. Here are the queries:

tasks = Task.joins(project: :workspace).includes(project: :workspace)
counts = tasks.group('workspaces.name').count

The first query gets you all the tasks and preloads their associated project and workspace data. The second query uses ActiveRecord's group clause to construct the SQL statement to summarize the data. It returns this data structure:

{ "workspace1": 3, "workspace2": 2, "workspace3": 4 }

Databases are super efficient at set manipulation. It's almost always significantly faster to do that work in the database than in Ruby.

like image 161
aridlehoover Avatar answered Oct 28 '25 22:10

aridlehoover



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!