Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to unnest a 2d array into a 1d array quickly in PostgreSQL?

I have a really large array that have I computed with Apache Madlib and I would like to apply an operation to each single array in that 2d array.

I have found code that can help me unnest it from this related answer. However, the code is miserably slow on this really large 2d array (150,000+ 1d float arrays). While unnest() only takes a few seconds to run, even after waiting for several minutes the code has not completed.

Surely, there must be a faster way to unnest the large 2d array into smaller 1d arrays? Bonus point if that solution uses Apache Madlib. I did find one lead buried in the documentation called deconstruct_2d_array, however, when I try to call that function on the matrix, it fails with the following error:

ERROR: Function "deconstruct_2d_array(double precision[])": Invalid type conversion. Internal composite type has more elements than backend composite type.

like image 990
Skylion Avatar asked Oct 22 '25 03:10

Skylion


1 Answers

The function you found in my old answer does not scale well for big arrays. I never thought of arrays your size, which should probably be a set (a table) instead.

Be that as it may, this PL/pgSQL function can replace the one in the referenced answer. Requires Postgres 9.1 or later.

CREATE OR REPLACE FUNCTION unnest_2d_1d(ANYARRAY, OUT a ANYARRAY)
  RETURNS SETOF ANYARRAY
  LANGUAGE plpgsql IMMUTABLE STRICT AS
$func$
BEGIN
   FOREACH a SLICE 1 IN ARRAY $1 LOOP
      RETURN NEXT;
   END LOOP;
END
$func$;

40x faster in my test on a big 2d-array in Postgres 9.6.

STRICT to avoid an exception for NULL input (as commented by IamIC):

ERROR: FOREACH expression must not be null

like image 178
Erwin Brandstetter Avatar answered Oct 23 '25 16:10

Erwin Brandstetter