In the users table I have a jsonb column experience with the following JSON structure:
[
{
"field": "devops",
"years": 9
},
{
"field": "backend dev",
"years": 7
}
... // could be N number of objects with different values
]
Business requirement
The client can request people with experience in any number of fields, each with a minimum number of years of experience in that field.
This is an example query
SELECT * FROM users
WHERE
jsonb_path_exists(experience, '$[*] ? (@.field == "devops" && @.years > 5)') and
jsonb_path_exists(experience, '$[*] ? (@.field == "backend dev" && @.years > 5)')
LIMIT 3;
Let's say I get a request for
[
{ field: "devops", years: 5 },
{ field: "java", years: 6 },
{ field: "ui/ux", years: 2 }] // and so on
How do I dynamically create a query without worrying about sql injection?
First of all, you want index support. I suggest a jsonb_path_ops index like:
CREATE INDEX users_experience_gin_idx ON users USING gin (experience jsonb_path_ops);
And a query that can tap into that index (100% equivalent to your original):
SELECT *
FROM users
WHERE experience @? '$[*] ? (@.field == "devops" && @.years > 5 )'
AND experience @? '$[*] ? (@.field == "backend dev" && @.years > 5)'
LIMIT 3;
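As a quick sanity check (my addition, not part of the original answer): on a table with enough rows, EXPLAIN should show the GIN index being used for the @? predicate. Table and index names are the ones created above.

```sql
-- Check that the @? predicate can use the jsonb_path_ops GIN index.
-- Plan shape depends on table size and planner settings; on tiny tables
-- Postgres may still prefer a sequential scan.
EXPLAIN
SELECT *
FROM   users
WHERE  experience @? '$[*] ? (@.field == "devops" && @.years > 5)';
```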
Requires Postgres 12 or later, where the SQL/JSON path language was added.
Index support is bound to operators in Postgres. The operator @? is the equivalent of jsonb_path_exists().
The following query generates a statement of that form dynamically from a JSON array of filters:
SELECT 'SELECT * FROM users
WHERE experience @? '
|| string_agg(quote_nullable(format('$[*] ? (@.field == %s && @.years > %s)'
, f->'field'
, f->'years')) || '::jsonpath'
, E'\nAND experience @? ')
|| E'\nLIMIT 3'
FROM jsonb_array_elements('[{"field": "devops", "years": 5 },
{"field": "java", "years": 6 },
{"field": "ui/ux", "years": 2 }]') f;
Generates a query of the above form:
SELECT * FROM users
WHERE experience @? '$[*] ? (@.field == "devops" && @.years > 5)'::jsonpath
AND experience @? '$[*] ? (@.field == "java" && @.years > 6)'::jsonpath
AND experience @? '$[*] ? (@.field == "ui/ux" && @.years > 2)'::jsonpath
LIMIT 3;
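As an aside (not from the original answer): jsonb_path_exists() also accepts a vars argument, which lets you pass values into the path expression as data rather than by string building, sidestepping injection entirely. The trade-off is that this form does not use the @? operator, so it cannot tap into the GIN index the same way:

```sql
-- Values travel inside the "vars" jsonb, never as concatenated SQL text.
SELECT *
FROM   users
WHERE  jsonb_path_exists(experience,
                         '$[*] ? (@.field == $f && @.years > $y)',
                         jsonb_build_object('f', 'devops', 'y', 5))
LIMIT  3;
```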
How do I dynamically create a query without worrying about sql injection?
Put above query generation into a PL/pgSQL function to execute dynamically:
CREATE OR REPLACE FUNCTION f_users_with_experience(_filter_arr jsonb, _limit int = 3)
RETURNS SETOF users
LANGUAGE plpgsql PARALLEL SAFE STABLE STRICT AS
$func$
DECLARE
_sql text;
BEGIN
-- assert (you may want to be stricter?)
IF jsonb_path_exists (_filter_arr, '$[*] ? (!exists(@.field) || !exists(@.years))') THEN
      RAISE EXCEPTION 'Parameter $1 (_filter_arr) must be a JSON array with keys "field" and "years" in every object. Invalid input was: >>%<<', _filter_arr;
END IF;
-- generate query string
SELECT INTO _sql
'SELECT * FROM users
WHERE experience @? '
|| string_agg(quote_nullable(format('$[*] ? (@.field == %s && @.years > %s)'
, f->'field'
, f->'years'))
, E'\nAND experience @? ')
|| E'\nLIMIT ' || _limit
FROM jsonb_array_elements(_filter_arr) f;
-- execute
IF _sql IS NULL THEN
RAISE EXCEPTION 'SQL statement is NULL. Should not occur!';
ELSE
-- RAISE NOTICE '%', _sql; -- debug first if in doubt
RETURN QUERY EXECUTE _sql;
END IF;
END
$func$;
Call:
SELECT * FROM f_users_with_experience('[{"field": "devops", "years": 5 }
, {"field": "backend dev", "years": 6}]');
Or with a different LIMIT:
SELECT * FROM f_users_with_experience('[{"field": "devops", "years": 5 }]', 123);
db<>fiddle here
You should be comfortable with PL/pgSQL to work with this and understand it.
SQL injection is impossible because the jsonpath value is single-quoted and escaped with quote_nullable(). (The field and years values are embedded via format() as jsonb literals, so a string value like "devops" arrives double-quoted inside the path expression.)
While on the topic of SQL/JSON path expressions, I use one more to assert valid input:
jsonb_path_exists(_filter_arr, '$[*] ? (!exists(@.field) || !exists(@.years))')
It checks every object in the JSON array and matches if either of the two required keys (field, years) is missing, in which case the function raises an exception.
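To illustrate the assertion (a standalone sketch, not part of the function itself): an object missing "years" makes the path expression match, so the input would be rejected:

```sql
SELECT jsonb_path_exists(
         '[{"field": "devops"}]',                           -- "years" is missing
         '$[*] ? (!exists(@.field) || !exists(@.years))');  -- returns true → input rejected
```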
This is a parameterized query, so it is injection-safe. The qualifies scalar subquery calculates whether experience satisfies all request items. The parameters are $1 (the jsonb array of request items) and $2 (the LIMIT value). You may need to adapt the placeholder syntax to your driver or environment.
select t.* from
(
select u.*,
(
select count(*) = jsonb_array_length($1)
from jsonb_array_elements(u.experience) ej -- jsonb list of experiences
inner join jsonb_array_elements($1) rj -- jsonb list of request items
on ej ->> 'field' = rj ->> 'field'
and (ej ->> 'years')::numeric >= (rj ->> 'years')::numeric
) as qualifies
from users as u
) as t
where t.qualifies
limit $2;
Some explanation
The logic of the qualifies subquery is this: first 'normalize' the experience and request jsonb arrays into 'tables', then inner join them on the target condition (which is field_a = field_b and years_a >= years_b in this case) and count how many of them match. If the count is equal to the number of request items (i.e. count(*) = jsonb_array_length($1)) then all of them are satisfied and so experience qualifies.
Thus no dynamic SQL is necessary. I think that this approach may be reusable too.
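For example (my framing: using PREPARE in psql, so the $1/$2 placeholders work verbatim), the query above can be tested like this:

```sql
-- Prepared statement: parameters are bound as values, never spliced into SQL text.
PREPARE users_with_experience (jsonb, int) AS
SELECT t.*
FROM  (
   SELECT u.*,
          (SELECT count(*) = jsonb_array_length($1)
           FROM   jsonb_array_elements(u.experience) ej
           JOIN   jsonb_array_elements($1) rj
             ON   ej ->> 'field' = rj ->> 'field'
            AND  (ej ->> 'years')::numeric >= (rj ->> 'years')::numeric
          ) AS qualifies
   FROM   users u
   ) t
WHERE  t.qualifies
LIMIT  $2;

EXECUTE users_with_experience(
  '[{"field": "devops", "years": 5}, {"field": "backend dev", "years": 6}]', 3);
```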