ARRAY_AGG function omits NULL by definition.
The requirement is to generate array with the following format:
SELECT [NULL, 'a1', 'a2', 'a3', NULL]
-- [   undefined,   "a1",   "a2",   "a3",   undefined ]
from source data using ARRAY_AGG function and nullifying all entries not starting with letter a:
CREATE OR REPLACE TABLE tab(id INT, val TEXT) AS
SELECT * FROM VALUES  (1,'b0'), (2,'a1'), (3,'a2'), (4,'a3'), (5,'b1');
Attempt 1:
SELECT ARRAY_AGG(CASE WHEN val LIKE 'a%' THEN val END) 
       WITHIN GROUP (ORDER BY id) AS result,
       result = [NULL, 'a1', 'a2', 'a3', NULL] AS is_output_as_requested
FROM  tab;
-- [   "a1",   "a2",   "a3" ]   FALSE
Result: ARRAY_AGG works as intended and skips NULL; The output does not match the requirement.
Attempt 2:
SELECT ARRAY_AGG(CASE WHEN val LIKE 'a%' THEN val ELSE PARSE_JSON('null') END) 
   WITHIN GROUP (ORDER BY id) AS result,
   result = [NULL, 'a1', 'a2', 'a3', NULL] AS is_output_as_requested
FROM tab;
--  [   "a1",   "a2",   "a3" ]   FALSE
Rationale: CASE expression output must have the same data type, therefore implicit conversion: TO_CHAR(PARSE_JSON('null')) <=> NULL
Attempt 3:
SELECT ARRAY_AGG(CASE WHEN val LIKE 'a%' THEN val::VARIANT ELSE PARSE_JSON('null') END) 
      WITHIN GROUP (ORDER BY id)AS result,
      result = [NULL, 'a1', 'a2', 'a3', NULL] AS is_output_as_requested
FROM tab;
--  [   null,   "a1",   "a2",   "a3",   null ]  FALSE
Almost what is requested, though not quite. Type of the first and last element is NULL_VALUE.
I am aware of VARIANT NULL. I am seeking for a way to "preserve NULLs" in exactly the format as requested for ARRAY_AGG.
The solution must be a standalone SQL expression (no Snowflake Scripting/UDFs).
It is possible to achieve it by aggregating arrays instead of scalar values:
SELECT ARRAY_FLATTEN(ARRAY_AGG(CASE WHEN val LIKE 'a%' THEN [val] ELSE [NULL] END)
       WITHIN GROUP (ORDER BY id)) AS result,
       result = [NULL, 'a1', 'a2', 'a3', NULL] AS is_output_as_requested 
FROM tab;
-- [   undefined,   "a1",   "a2",   "a3",   undefined ] TRUE
Rationale: ARRAY_AGG does not skip [NULL].
Step 1:
SELECT ARRAY_AGG(CASE WHEN val LIKE 'a%' THEN [val] ELSE [NULL] END)
       WITHIN GROUP (ORDER BY id) AS result
FROM tab;
-- [ [undefined], [ "a1"], ["a2"], ["a3"], [undefined] ]
Step 2:
SELECT ARRAY_FLATTEN(...)
-- [ undefined, "a1", "a2", "a3", undefined ]
In general:
SELECT ARRAY_FLATTEN(ARRAY_AGG([some_col]))
FROM tab;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With