Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Use function value from subquery in WHERE-clause + fetch it with results using DBIx::Class

I try to build a complex query in DBIx::Class that

  • prefetches a related row
  • calculates an aggregate function in a subquery
  • filters results according to the function value
  • fetch the function value with the rows

In my application this is one of the table with the most rows. Therefor I want to do calculation and filtering in 1 query.

I tried to make a simple example by using the DBICTest from DBIx::Class’s Git repo. See https://github.com/Perl5/DBIx-Class/tree/maint/0.0828xx/t/lib/DBICTest

BEGIN { do "./t/lib/ANFANG.pm" or die( $@ || $! ) }

use strict;
use warnings;

use lib 'lib';
use DBICTest;

$ENV{DBIC_TRACE} = 1;
$ENV{DBIC_TRACE_PROFILE} = 'console';

my $schema = DBICTest->init_schema();

my $rs = $schema->resultset('CD');

my $subquery_rs = $rs->search(
    undef,
    {
        '+select' => [
            {
                '' => $rs->search_related('tracks')    # in reality I use correlate()
                  ->count_rs->as_query,
                -as => 'correlated_subquery_col'
            }
        ],
        '+as' => ['correlated_subquery_key'],
        alias => 'subquery',     # from approach number 4 and 6
    }
)->as_subselect_rs;

$subquery_rs->search(
    {
        correlated_subquery_col => { '<' => 3 },
    },
    {
        alias      => 'subquery',    # from approach number 5 and 6
        prefetch   => 'artist',
        '+columns' => { correlated_subquery_key => 'correlated_subquery_col' },
    }
)->all;

Run the code like this:

$ git clone https://github.com/Perl5/DBIx-Class.git
$ cd DBIx-Class/
$ git checkout master
$ perl test.pl # file shown above

(In my real project I use DBIx::Class::Helper::Correlate to sum up related values but I couldn’t use it in the example.)

The desired SQL boils down to this for the example schema:

-- pseudo code
SELECT * FROM ( -- all columns from main and related table as well as function value
    SELECT cd.*, -- main table
           artist.*, -- prefetched related table
           (SELECT COUNT(*) FROM track WHERE track.cd = cd.cdid) AS track_count -- subquery with COUNT/SUM
    FROM cds, -- main table
    JOIN artist ON artist.artistid = cd.artist -- related table
)
WHERE track_count < ?

I tried to come closer to a solution in these steps:

  1. At first I had the problem that PostgreSQL wouldn’t let me reference the result from the function in the additional column but this question helped me. So I need to use another subquery. I found and used ResultSet->as_subselect_rs() for the first time to create that.

  2. I needed to actively select the function column from the subquery in the outer search() call. Usually I use the '+columns' => { result_hashref_key => 'sql_column_name' } syntax to select extra columns but I found the syntax below necessary in order to define the column name correlated_subquery_col. The name needs to be referenced in the WHERE-clause, too. I am not aware of documentation for this weird syntax with empty hash key select => [ { '' => ..., -as => ... } ] but was inspired to try this by this question.

  3. Without the alias attribute this actually worked in the example with SQLite but in PostgreSQL with my real schema I got the error message:

DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed:
 ERROR:  column reference "id" is ambiguous
LINE 1: SELECT "items"."id", "items"."purchase_list_id", "items"."va...
               ^ [for Statement "SELECT "items"."id", "items"."purchase_list_id", "items"."value", "items"."offset", "items"."unit_id", "items"."article_id", "items"."purchased", "items"."comment", "items"."servings_sum", "items"."preorder_servings" FROM (SELECT "items"."id", "items"."purchase_list_id", "items"."value", "items"."offset", "items"."unit_id", "items"."article_id", "items"."purchased", "items"."comment", ( (SELECT SUM( "dish"."servings" ) FROM "dish_ingredients" "ingredients_alias"  JOIN "dishes" "dish" ON "dish"."id" = "ingredients_alias"."dish_id" WHERE ( "ingredients_alias"."item_id" = "items"."id" )) ) AS "servings_sum", "article"."id", "article"."project_id", "article"."shop_section_id", "article"."shelf_life_days", "article"."preorder_servings", "article"."preorder_workdays", "article"."name", "article"."comment" FROM "purchase_lists" "me"  JOIN "items" "items" ON "items"."purchase_list_id" = "me"."id"  JOIN "articles" "article" ON "article"."id" = "items"."article_id" WHERE ( ( "article"."preorder_servings" IS NOT NULL AND "article"."preorder_workdays" IS NOT NULL AND "me"."project_id" = ? ) )) "items" WHERE ( "servings_sum" >= "preorder_servings" )" with ParamValues: 1='150']
  1. Using the alias attribute in the subquery search() I tried to give the subquery distinct table aliases. That produces this SQL in SQLite:
SELECT subquery.cdid, subquery.artist, subquery.title, subquery.year, subquery.genreid, subquery.single_track, subquery.correlated_subquery_col, artist.artistid, artist.name, artist.rank, artist.charfield FROM (SELECT subquery.cdid, subquery.artist, subquery.title, subquery.year, subquery.genreid, subquery.single_track, ( (SELECT COUNT( * ) FROM cd me  JOIN track tracks ON tracks.cd = me.cdid) ) AS correlated_subquery_col FROM cd subquery) subquery  JOIN artist artist ON artist.artistid = subquery.artist WHERE ( correlated_subquery_col < ? )

Unfortunately not only is the subquery aliased to subquery for the outer query but also inside the subquery DBIC tries to SELECT from subquery. and JOIN using subquery. columns. In PostgreSQL this produces this error:

DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed:
 ERROR:  missing FROM-clause entry for table "subquery"
LINE 1: ...d"  JOIN "articles" "article" ON "article"."id" = "subquery"...
                                                             ^ [for Statement "SELECT "subquery"."id", "subquery"."purchase_list_id", "subquery"."value", "subquery"."offset", "subquery"."unit_id", "subquery"."article_id", "subquery"."purchased", "subquery"."comment", "subquery"."servings_sum", "subquery"."preorder_servings" FROM (SELECT "subquery"."id", "subquery"."purchase_list_id", "subquery"."value", "subquery"."offset", "subquery"."unit_id", "subquery"."article_id", "subquery"."purchased", "subquery"."comment", ( (SELECT SUM( "dish"."servings" ) FROM "dish_ingredients" "ingredients_alias"  JOIN "dishes" "dish" ON "dish"."id" = "ingredients_alias"."dish_id" WHERE ( "ingredients_alias"."item_id" = "items"."id" )) ) AS "servings_sum", "article"."id", "article"."project_id", "article"."shop_section_id", "article"."shelf_life_days", "article"."preorder_servings", "article"."preorder_workdays", "article"."name", "article"."comment" FROM "purchase_lists" "me"  JOIN "items" "items" ON "items"."purchase_list_id" = "me"."id"  JOIN "articles" "article" ON "article"."id" = "subquery"."article_id" WHERE ( ( "article"."preorder_servings" IS NOT NULL AND "article"."preorder_workdays" IS NOT NULL AND "me"."project_id" = ? ) )) "subquery" WHERE ( "servings_sum" >= "preorder_servings" )" with ParamValues: 1='150']
  1. Alternatively I tried to use the alias attribute in the outer query’s search(). Then the alias is only used in the outer query but the subquery doesn’t get aliased at all and the columns can’t be found.
DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed:
 ERROR:  missing FROM-clause entry for table "subquery"
LINE 1: SELECT "subquery"."id", "subquery"."purchase_list_id", "subq...
               ^ [for Statement "SELECT "subquery"."id", "subquery"."purchase_list_id", "subquery"."value", "subquery"."offset", "subquery"."unit_id", "subquery"."article_id", "subquery"."purchased", "subquery"."comment", "subquery"."servings_sum", "subquery"."preorder_servings" FROM (SELECT "items"."id", "items"."purchase_list_id", "items"."value", "items"."offset", "items"."unit_id", "items"."article_id", "items"."purchased", "items"."comment", ( (SELECT SUM( "dish"."servings" ) FROM "dish_ingredients" "ingredients_alias"  JOIN "dishes" "dish" ON "dish"."id" = "ingredients_alias"."dish_id" WHERE ( "ingredients_alias"."item_id" = "items"."id" )) ) AS "servings_sum", "article"."id", "article"."project_id", "article"."shop_section_id", "article"."shelf_life_days", "article"."preorder_servings", "article"."preorder_workdays", "article"."name", "article"."comment" FROM "purchase_lists" "me"  JOIN "items" "items" ON "items"."purchase_list_id" = "me"."id"  JOIN "articles" "article" ON "article"."id" = "items"."article_id" WHERE ( ( "article"."preorder_servings" IS NOT NULL AND "article"."preorder_workdays" IS NOT NULL AND "me"."project_id" = ? ) )) "items" WHERE ( "servings_sum" >= "preorder_servings" )" with ParamValues: 1='150']
  1. At last I tried setting alias => 'subquery' in both search() calls. This gives the same error message although the alias is now used in all SELECT column names. Weirdly though, DBIC still uses the default table alias me and writes FROM "purchase_lists" "me" in PostgreSQL:
DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed:
 ERROR:  missing FROM-clause entry for table "subquery"
LINE 1: ...d"  JOIN "articles" "article" ON "article"."id" = "subquery"...
                                                             ^ [for Statement "SELECT "subquery"."id", "subquery"."purchase_list_id", "subquery"."value", "subquery"."offset", "subquery"."unit_id", "subquery"."article_id", "subquery"."purchased", "subquery"."comment", "subquery"."servings_sum", "subquery"."preorder_servings" FROM (SELECT "subquery"."id", "subquery"."purchase_list_id", "subquery"."value", "subquery"."offset", "subquery"."unit_id", "subquery"."article_id", "subquery"."purchased", "subquery"."comment", ( (SELECT SUM( "dish"."servings" ) FROM "dish_ingredients" "ingredients_alias"  JOIN "dishes" "dish" ON "dish"."id" = "ingredients_alias"."dish_id" WHERE ( "ingredients_alias"."item_id" = "items"."id" )) ) AS "servings_sum", "article"."id", "article"."project_id", "article"."shop_section_id", "article"."shelf_life_days", "article"."preorder_servings", "article"."preorder_workdays", "article"."name", "article"."comment" FROM "purchase_lists" "me"  JOIN "items" "items" ON "items"."purchase_list_id" = "me"."id"  JOIN "articles" "article" ON "article"."id" = "subquery"."article_id" WHERE ( ( "article"."preorder_servings" IS NOT NULL AND "article"."preorder_workdays" IS NOT NULL AND "me"."project_id" = ? ) )) "subquery" WHERE ( "servings_sum" >= "preorder_servings" )" with ParamValues: 1='150']

… but FROM cd subquery in the SQLite example:

SELECT subquery.cdid, subquery.artist, subquery.title, subquery.year, subquery.genreid, subquery.single_track, subquery.correlated_subquery_col, artist.artistid, artist.name, artist.rank, artist.charfield FROM (
    SELECT subquery.cdid, subquery.artist, subquery.title, subquery.year, subquery.genreid, subquery.single_track, ( (
        SELECT COUNT( * ) FROM cd me  JOIN track tracks ON tracks.cd = me.cdid
    ) ) AS correlated_subquery_col
    FROM cd subquery -- uses alias here in SQLite but not in PostgreSQL
) subquery
JOIN artist artist ON artist.artistid = subquery.artist
WHERE ( correlated_subquery_col < ? )

Questions:

  • How to implement my desired query with DBIC?
  • I found differences between SQLite and PostgreSQL using the same DBIC syntax. Might that be a bug? I might try to further reduce the example in order to post a potential bug report to the DBIC mailing list.

If you are curious, what my real query is about: My app is for collecting recipes and making food plans. I need to fetch all purchase list items, prefetch their article and calculate a sum for each list item of how many servings require this item to be purchased. Example row: purchase list item 100kg apples is related to article apple and has a sum of 25 servings from two dishes with 10 and 15 servings each. For preordering only purchase list items which exceed a given number of servings shall be displayed.

like image 992
Daniel Böhmer Avatar asked Oct 19 '25 11:10

Daniel Böhmer


1 Answers

Thanks to barubary on Mastodon who helped me find a solution.

The problem was with the as and alias keys. This is even much simpler and works:

my $rs = $schema->resultset('CD');

my $subquery_rs = $rs->search(
    undef,
    {       
        '+select' => [
            {     
                '' => $rs->search_related('tracks')    # in reality I use correlate()
                  ->count_rs->as_query,
                -as => 'correlated_subquery_col'
            },
        ],
    }
)->as_subselect_rs;

$subquery_rs->search(
    {
        correlated_subquery_col => { '<' => 3 },
    },
    {
        prefetch   => 'artist',
        '+columns' => { correlated_subquery_key => 'correlated_subquery_col' },
    }
)->all;

It produces this SQL:

SELECT me.cdid, me.artist, me.title, me.year, me.genreid, me.single_track, me.correlated_subquery_col, artist.artistid, artist.name, artist.rank, artist.charfield 
  FROM ( 
    SELECT me.cdid, me.artist, me.title, me.year, me.genreid, me.single_track, ( 
        SELECT COUNT( * ) 
          FROM cd me 
          JOIN track tracks 
            ON tracks.cd = me.cdid
       ) AS correlated_subquery_col 
      FROM cd me
   ) me 
  JOIN artist artist 
    ON artist.artistid = me.artist 
WHERE correlated_subquery_col < '3'
like image 77
Daniel Böhmer Avatar answered Oct 21 '25 02:10

Daniel Böhmer