Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Postgres jsonb query for dynamic values

In the users table I have a jsob column experience with following json structure:

[
    {
        "field": "devops",
        "years": 9
    },
    {
        "field": "backend dev",
        "years": 7
    } 
... // could be N number of objects with different values
]

Business requirement

Client can request for people with experience in any field and with their respective years experience in each

This is an example query

SELECT * FROM users
WHERE
jsonb_path_exists(experience, '$[*] ? (@.field == "devops" && @.years > 5)') and
jsonb_path_exists(experience, '$[*] ? (@.field == "backend dev" && @.years > 5)')
LIMIT 3;

Issue

Lets say if I get a request for

[
  { field: "devops", years: 5 }, 
  { field: "java", years: 6 }, 
  { field: "ui/ux", years: 2 }] // and so on

How do I dynamically create a query without worrying about sql injection?

Techstack

  • Nodejs
  • Typescript
  • TypeORM
  • Postgres
like image 826
Karan Kumar Avatar asked Oct 22 '25 07:10

Karan Kumar


2 Answers

Index

First of all, you want index support. I suggest a jsonb_path_ops index like:

CREATE INDEX users_experience_gin_idx ON users USING gin (experience jsonb_path_ops);

See:

  • Index for finding an element in a JSON array

Query

And a query that can tap into that index (100 % equivalent to your original):

SELECT *
FROM   users
WHERE  experience @? '$[*] ? (@.field == "devops" && @.years > 5 )'
AND    experience @? '$[*] ? (@.field == "backend dev" && @.years > 5)'
LIMIT  3;

Requires Postgres 12 or later, where the SQL/JSON path language was added.

Index support is bound to operators in Postgres. The operator @? is the equivalent of jsonb_path_exists(). See:

  • Find rows containing a key in a JSONB array of records

Generate query dynamically

SELECT 'SELECT * FROM users
WHERE  experience @? '
       || string_agg(quote_nullable(format('$[*] ? (@.field == %s && @.years > %s)'
                                         , f->'field'
                                         , f->'years')) || '::jsonpath'
                   , E'\nAND    experience @? ')
       || E'\nLIMIT  3'
FROM   jsonb_array_elements('[{"field": "devops", "years": 5 }, 
                              {"field": "java", "years": 6 }, 
                              {"field": "ui/ux", "years": 2 }]') f;

Generates a query of the above form:

SELECT * FROM users
WHERE  experience @? '$[*] ? (@.field == "devops" && @.years > 5)'::jsonpath
AND    experience @? '$[*] ? (@.field == "java" && @.years > 6)'::jsonpath
AND    experience @? '$[*] ? (@.field == "ui/ux" && @.years > 2)'::jsonpath
LIMIT  3;

Full automation

How do I dynamically create a query without worrying about sql injection?

Put above query generation into a PL/pgSQL function to execute dynamically:

CREATE OR REPLACE FUNCTION f_users_with_experience(_filter_arr jsonb, _limit int = 3)
  RETURNS SETOF users
  LANGUAGE plpgsql PARALLEL SAFE STABLE STRICT AS
$func$
DECLARE
   _sql text;
BEGIN
   -- assert (you may want to be stricter?)
   IF jsonb_path_exists (_filter_arr, '$[*] ? (!exists(@.field) || !exists(@.years))') THEN
      RAISE EXCEPTION 'Parameter $2 (_filter_arr) must be a JSON array with keys "field" and "years" in every object. Invalid input was: >>%<<', _filter_arr;
   END IF;

   -- generate query string
   SELECT INTO _sql
'SELECT * FROM users
WHERE  experience @? '
       || string_agg(quote_nullable(format('$[*] ? (@.field == %s && @.years > %s)'
                                         , f->'field'
                                         , f->'years'))
                   , E'\nAND    experience @? ')
       || E'\nLIMIT   ' || _limit
   FROM   jsonb_array_elements(_filter_arr) f;

   -- execute
   IF _sql IS NULL THEN
      RAISE EXCEPTION 'SQL statement is NULL. Should not occur!';
   ELSE
   -- RAISE NOTICE '%', _sql;     -- debug first if in doubt
      RETURN QUERY EXECUTE _sql;
   END IF;
END
$func$;

Call:

SELECT * FROM f_users_with_experience('[{"field": "devops", "years": 5 }, 
                                      , {"field": "backend dev", "years": 6}]');

Or with a different LIMIT:

SELECT * FROM f_users_with_experience('[{"field": "devops", "years": 5 }]', 123);

db<>fiddle here

You should be comfortable with PL/pgSQL to work with this and understand it.

SQL injection is impossible because ...

  1. valid JSON input is enforced
  2. JSON values are concatenated with original JSON double-quotes.
  3. Most importantly, each generated jsonpath value is single-quoted with quote_nullable().

While being at the topic of SQL/JSON path expressions I use one to assert valid input:

jsonb_path_exists (_filter_arr, '$[*] ? (!exists(@.field) || !exists(@.years))')

Checks every object in the JSON array and whether one of the two required keys (field, years) is missing.

like image 83
Erwin Brandstetter Avatar answered Oct 24 '25 22:10

Erwin Brandstetter


This is a parameterized query so more or less injection safe. qualifies scalar subquery calculates whether experience satisfies all request items. The parameters are $1 (the jsonb array of request parameters) and $2 (the limit value). You may need to change their syntax depending on the flavour of your environment.

select t.* from 
(
  select u.*,
    (
      select count(*) = jsonb_array_length($1)
      from jsonb_array_elements(u.experience) ej -- jsonb list of experiences 
      inner join jsonb_array_elements($1) rj     -- jsonb list of request items
         on ej ->> 'field' =  rj ->> 'field'
        and (ej ->> 'years')::numeric >= (rj ->> 'years')::numeric
    ) as qualifies
 from users as u
) as t
where t.qualifies
limit $2;

Some explanation

The logic of the qualifies subquery is this: first 'normalize' the experience and request jsonb arrays into 'tables', then inner join them on the target condition (which is field_a = field_b and years_a >= years_b in this case) and count how many of them match. If the count is equal to the number of request items (i.e. count(*) = jsonb_array_length($1)) then all of them are satisfied and so experience qualifies.
Thus no dynamic SQL is necessary. I think that this approach may be reusable too.

like image 32
Stefanov.sm Avatar answered Oct 24 '25 22:10

Stefanov.sm



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!