Looping through arrays in PL/pgSQL
The PL/pgSQL FOREACH
loop brings dedicated syntax for looping over the contents of an array.
Overview
Note: See array_lower(), array_upper(), array_ndims() and cardinality() for descriptions of the functions that the following account mentions. It also mentions "row-major order". See Joint semantics, within the section "Functions for reporting the geometric properties of an array", for an explanation of this term. Syntax and semantics shows where, in the FOREACH
loop header, the SLICE
keyword is used.
-
When the operand of the
SLICE
clause is0
, and for the general case where the iterand array has any number of dimensions, YSQL assigns its successive values, in row-major order, to the loop iterator. Here, its effect is functionally analogous to that ofunnest()
. -
For the special case where the iterand array is one-dimensional, the
FOREACH
loop is useful only when the operand of theSLICE
clause is0
. In this use, it is a syntactically more compact way to achieve the effect that is achieved with aFOR var IN
loop like this:for var in array_lower(iterand_arr, 1)..array_upper(iterand_arr, 1) loop ... iterand_arr[var] ... end loop;
-
When the operand of the
SLICE
clause is greater than0
, and when the dimensionality of the iterand array is greater than1
, theFOREACH
loop provides functionality thatunnest()
cannot provide. Briefly, when the iterand array hasn
dimensions and the operand of theSLICE
clause iss
, YSQL assigns slices (that is, subarrays) of dimensionalitys
to the iterator. The values in such a slice are those from the iterand array that remain when the distinct values of the first(n - s)
indexes are used to drive the iteration. These two pseudocode blocks illustrate the idea:-- In this example, the SLICE operand is 1. -- As a consequence, array_ndims(iterator_array) is 1. -- Assume that array_ndims(iterand_arr) is 4. -- There are therefore (4 - 1) = 3 nested loops in this pseudocode. for i in array_lower(iterand_arr, 1)..array_upper(iterand_arr, 1) loop for j in array_lower(iterand_arr, 2)..array_upper(iterand_arr, 2) loop for k in array_lower(iterand_arr, 3)..array_upper(iterand_arr, 3) loop the (i, j, k)th iterator_array is set to iterand_arr[i][j][k][ for all values the remaining 4th index ] end loop; end loop; end loop;
-- In this example, the SLICE operand is 3. -- As a consequence, array_ndims(iterator_array) is 3. -- Assume that array_ndims(iterand_arr) is 4. -- There is therefore (4 - 3) = 1 nested loop in this pseudocode. for i in array_lower(iterand_arr, 1)..array_upper(iterand_arr, 1) loop the (i)th iterator_array is set to iterand_arr[i][ for all values the remaining 2nd, 3rd, and 4th indexes ] end loop;
The examples below clarify the behavior of FOREACH
.
Syntax and semantics
[ <<label>> ]
FOREACH var [ SLICE non_negative_integer_literal ] IN ARRAY expression LOOP
statements
END LOOP [ label ];
-
var
must be explicitly declared before theFOREACH
loop. -
The operand of the optional
SLICE
clause must be a non-negativeint
literal. -
Assume that
expression
has the data typesome_type[]
. - WhenSLICE 0
is used,var
must be declared assome_type
.- When the
SLICE
clause's operand is positive,var
must be declared assome_type[]
.
- When the
-
SLICE 0
has the same effect as omitting theSLICE
clause. -
When
SLICE 0
is used, or theSLICE
clause is omitted, YSQL assigns each in turn of the array's values, visited in row-major order, tovar
. -
When the
SLICE
clause's operand is positive, then YSQL assigns successive slices of the iterand array tovar
according to the following rule:- The extracted slices all have the same dimensionality given by this equality:
array_ndims(iterator_array) = slice_operand
. - The number of extracted slices is given by this equality:
number_of_slices = cardinality(iterand_array)/cardinality(iterator_array)
. - The value of the
SLICE
operand must not exceedarray_ndims(iterator_array)
. - When the value of the
SLICE
operand is equal toarray_ndims(iterator_array) - 1
, theFOREACH
loop produces just a single iterator array value, and this is identical to the iterand array.
- The extracted slices all have the same dimensionality given by this equality:
-
The useful range for the
SLICE
operand is therefore0..(array_ndims(iterator_array) - 1)
.
Looping over the values in an array without the SLICE keyword
This loop is functionally identical to the FOR var IN
loop. However, in the FOR
loop, YSQL automatically defines var
with the type int
so that its scope is limited to the loop; but in the FOREACH
loop, var
must be explicitly declared, as noted in the "Syntax and semantics" section above.
do $body$
declare
arr1 int[] := array[1, 2];
arr2 int[] := array[3, 4, 5];
var int;
begin
<<"FOREACH eaxmple">>
foreach var in array arr1||arr2 loop
raise info '%', var;
end loop "FOREACH eaxmple";
end;
$body$;
It shows this (after manually stripping the "INFO:" prompt):
1
2
3
4
5
The ||
operator is used only to emphasize that the iterand can be any expression whose data type is an array.
The next loop shows these things of note:
- The iterand array is multidimensional.
- The array type is "rt[]" where rt is a user-defined "row" type.
- The syntax spot where "var" is used above need not be occupied by a single variable. Rather, "f1" and "f2" are used, to correspond to the fields in "rt".
- The
FOREACH
loop is followed by a "cursor" loop whoseSELECT
statement usesunnest()
. - The
FOREACH
loop is more terse than the "cursor" loop. In particular, you can use the pair of declared variables "f1" and "f2" without any fuss (just as you could have used a single variable "r" of type "rt" without any fuss) as the iterator. YSQL looks after the proper assignment in both cases. But when you useunnest()
, you have to look after this yourself.
create type rt as (f1 int, f2 text);
do $body$
declare
a1 rt[] := array[(1, 'dog'), (2, 'cat')];
a2 rt[] := array[(3, 'ant'), (4, 'rat')];
arr rt[] := array[a1, a2];
f1 int;
f2 text;
begin
raise info '';
raise info 'FOREACH';
foreach f1, f2 in array arr loop
raise info '% | %', f1::text, f2;
end loop;
raise info '';
raise info 'unnest()';
for f1, f2 in (
with v as (
select unnest(arr) as r)
select (r).f1, (r).f2 from v)
loop
raise info '% | %', f1::text, f2;
end loop;
end;
$body$;
It shows this:
FOREACH
1 | dog
2 | cat
3 | ant
4 | rat
unnest()
1 | dog
2 | cat
3 | ant
4 | rat
This shows that this use of the FOREACH
loop (with an implied 0
as the SLICE
clause's operand) is functionally equivalent to unnest()
.
Looping over the contents of a multidimensional array taking advantage of a non-zero SLICE operand
First, store a three-dimensional two-by-two-by-two array in a ::text[]
field in a single-row table. Each of the subsequent DO
blocks uses it.
create table t(k int primary key, arr text[] not null);
insert into t(k, arr) values(1, '
{
{
{001,002},
{003,004}
},
{
{005,006},
{007,008}
}
}'::text[]);
Next, show the outcome when a bad value is used for the SLICE
operand:
do $body$
declare
arr constant text[] not null := (select arr from t where k = 1);
slice_iterator text[] not null := '{?}';
begin
raise info 'array_ndims(arr): %', array_ndims(arr)::text;
foreach slice_iterator slice 4 in array arr loop
raise info '%', slice_iterator::text;
end loop;
end;
$body$;
It shows array_ndims(arr): 3
and then it reports this error:
2202E: slice dimension (4) is out of the valid range 0..3
This test confirms that the value of the SLICE
operand must not exceed the iterand array's dimensionality.
As has been seen, SLICE 0
(or equivalently omitting the SLICE
clause) scans the array values in row-major order. The next test, with SLICE 3
, demonstrates the meaning when the SLICE
operand is equal to the iterand array's dimensionality.
do $body$
declare
arr constant text[] not null := (select arr from t where k = 1);
slice_iterator text[] not null := '{?}';
n int not null := 0;
begin
raise info 'FOREACH SLICE 3';
n := 0;
foreach slice_iterator slice 3 in array arr loop
assert
(array_ndims(slice_iterator) = 3) and
(slice_iterator = arr) ,
'assert failed';
n := n + 1;
raise info '% | %', n::text, slice_iterator::text;
end loop;
end;
$body$;
It shows this:
FOREACH SLICE 3
1 | {{{001,002},{003,004}},{{005,006},{007,008}}}
The FOREACH
loop generates just a single iterator slice. And, as the assert
shows, this is identical to the iterand array. In other words, setting the SLICE
operand to be equal to the iterand array's dimensionality, while the result is well-defined, is not useful. So, using this example iterand array, the useful range for the SLICE
operand is 0..2
.
The next test uses SLICE 2
:
do $body$
declare
arr constant text[] not null := (select arr from t where k = 1);
slice_iterator text[] not null := '{?}';
n int not null := 0;
begin
raise info 'FOREACH SLICE 2';
n := 0;
foreach slice_iterator slice 2 in array arr loop
assert (array_ndims(slice_iterator) = 2), 'assert failed';
n := n + 1;
raise info '% | %', n::text, slice_iterator::text;
end loop;
end;
$body$;
It shows this:
FOREACH SLICE 2
1 | {{001,002},{003,004}}
2 | {{005,006},{007,008}}
As the assert
shows, the operand of the SLICE
operator determines the dimensionality of the iterator slices.
The next test uses SLICE 1
:
do $body$
declare
arr constant text[] not null := (select arr from t where k = 1);
slice_iterator text[] not null := '{?}';
n int not null := 0;
begin
raise info 'FOREACH SLICE 1';
n := 0;
foreach slice_iterator slice 1 in array arr loop
assert (array_ndims(slice_iterator) = 1), 'assert failed';
n := n + 1;
raise info '% | %', n::text, slice_iterator::text;
end loop;
end;
$body$;
It shows this:
FOREACH SLICE 1
1 | {001,002}
2 | {003,004}
3 | {005,006}
4 | {007,008}
Once again, the assert
shows that the operand of the SLICE
operator determines the dimensionality of the iterator slices.
The last FOREACH
test uses SLICE 0
. Notice that, now, the iterator is declared as the scalar text
variable "var":
do $body$
declare
arr constant text[] not null := (select arr from t where k = 1);
var text not null := '?';
n int not null := 0;
begin
raise info 'FOREACH SLICE 0';
n := 0;
foreach var slice 0 in array arr loop
n := n + 1;
raise info '% | %', n::text, var;
end loop;
end;
$body$;
It shows this:
FOREACH SLICE 0
1 | 001
2 | 002
3 | 003
4 | 004
5 | 005
6 | 006
7 | 007
8 | 008
This is functionally equivalent to unnest()
as the final test shows:
do $body$
<<b>>declare
arr constant text[] not null := (select arr from t where k = 1);
var text not null := '?';
n int not null := 0;
begin
raise info 'unnest()';
n := 0;
for b.n, b.var in (
with
v1 as (
select unnest(arr) as var),
v2 as (
select
v1.var,
row_number() over(order by v1.var) as n
from v1)
select v2.n, v2.var from v2)
loop
n := n + 1;
raise info '% | %', n::text, var;
end loop;
end b;
$body$;
It shows this:
unnest()
2 | 001
3 | 002
4 | 003
5 | 004
6 | 005
7 | 006
8 | 007
9 | 008
Using FOREACH to iterate over the elements in an array of DOMAIN values
You need to be aware of some special considerations to implement this scenario. Using FOREACH with an array of DOMAINs, within the dedicated section Using an array of DOMAIN values explains what you need to know.
Using a wrapper PL/pgSQL table function to expose the SLICE operand as a formal parameter
The fact that the SLICE
operand must be a literal means that there are only two ways two parameterize this—and neither is satisfactory for real application code. Each uses a table function whose input is the iterand array and the value for the SLICE
operand, and whose output is a SETOF
iterator array values.
- The first approach is to encapsulate some particular range of
SLICE
operand values in an ordinary statically defined function that uses aCASE
statement to select theFOREACH
loop that has the requiredSLICE
operand literal. This is unsatisfactory because you have to decide the range ofSLICE
operand values that you'll support up front. - The second approach overcomes the limitation of the up front determination of the supported range of
SLICE
operand values by encapsulating code in, a statically defined function, that in turn dynamically generates a function with the requiredFOREACH
loop andSLICE
operand value and that then invokes it dynamically. This is unsatisfactory because it's some effort to implement and test such an approach. But it's unsatisfactory mainly because of the performance cost that dynamic generation and execution brings.
However, the requirements specification for real application code is unlikely to need more than one, or possibly just a few, specific values for the SLICE
operand. Therefore, in overwhelming majority of practically important use cases, you can write exactly the code you need where you need it.
The code that follows uses the first approach. It's included here because it demonstrates a generically valuable PL/pgSQL programming technique: user-defined functions and procedures with polymorphic formal parameters (in this case anyarray
and anyelement
). The examples also use assert
statements to confirm that the expected relationships hold between these quantities:
- the dimensionality of the iterand array
- the value of the
SLICE
operand - the lengths along the iterand array's dimensions
- the cardinalities of the iterand array and the iterator arrays
- the number of returned iterator values.
Here is the basic encapsulation. It's hard-coded to handle values for the SLICE
operand in the range 0..4
.
Recall that the iterator for SLICE 0
is a scalar and that the iterators for other values of the SLICE
operand are arrays. And recall that a pair of functions with the same definitions of the input formal parameters cannot be overload-distinguished by the data type of their return values. For this reason, the encapsulation of the FOREACH
loop for SLICE 0
is a dedicated function with just one input formal parameter: the iterand array. And the encapsulation of the FOREACH
loop for other values of the SLICE
operand is a second function with two input formal parameters: the iterand array and the value of the SLICE
operand. Here they are:
-- First overload
create function array_slices(arr in anyarray)
returns table(ret anyelement)
language plpgsql
as $body$
declare
no_of_values int not null := 0;
begin
-- "slice 0" means the same
-- as omitting the "slice" clause.
foreach ret slice 0 in array arr
loop
no_of_values := no_of_values + 1;
return next;
end loop;
assert
(no_of_values = cardinality(arr)),
'array_slices 1st overload: no_of_values assert failed';
end;
$body$;
And:
-- Second overload
create function array_slices(arr in anyarray, slice_operand in int)
returns table(ret anyarray)
language plpgsql
as $body$
declare
no_of_values int not null := 0;
lengths_product int not null := 0;
begin
case slice_operand
when 1 then
lengths_product := array_length(arr, 4);
foreach ret slice 1 in array arr
loop
no_of_values := no_of_values + 1;
assert
(array_ndims(ret) = 1) and
(cardinality(ret) = lengths_product) ,
'assert failed';
return next;
end loop;
assert
(no_of_values = cardinality(arr)/lengths_product),
'array_slices 2nd overload: no_of_values assert #1 failed';
when 2 then
lengths_product := array_length(arr, 4) *
array_length(arr, 3);
foreach ret slice 2 in array arr
loop
no_of_values := no_of_values + 1;
assert
(array_ndims(ret) = 2) and
(cardinality(ret) = lengths_product) ,
'assert failed';
return next;
end loop;
assert
(no_of_values = cardinality(arr)/lengths_product),
'array_slices 2nd overload: no_of_values assert #2 failed';
when 3 then
lengths_product := array_length(arr, 4) *
array_length(arr, 3) *
array_length(arr, 2);
foreach ret slice 3 in array arr
loop
no_of_values := no_of_values + 1;
assert
(array_ndims(ret) = 3) and
(cardinality(ret) = lengths_product) ,
'assert failed';
return next;
end loop;
assert
(no_of_values = cardinality(arr)/lengths_product),
'array_slices 2nd overload: no_of_values assert #3 failed';
when 4 then
lengths_product := array_length(arr, 4) *
array_length(arr, 3) *
array_length(arr, 2) *
array_length(arr, 1);
foreach ret slice 4 in array arr
loop
no_of_values := no_of_values + 1;
assert
(array_ndims(ret) = 4) and
(cardinality(ret) = lengths_product) ,
'assert failed';
return next;
end loop;
assert
(no_of_values = cardinality(arr)/lengths_product),
'array_slices 2nd overload: no_of_values assert #4 failed';
else
raise exception 'slice_operand > 4 not supported';
end case;
end;
$body$;
You can see that each leg of the CASE
is "generated" formulaically—albeit manually—by following a pattern that could be parameterized. You can use these encapsulations for iterand arrays of any dimensionality. But you must take responsibility for following the rule that the value of the SLICE
operand must fall within the acceptable range. Otherwise, you'll get the error that was demonstrated above:
2202E: slice dimension % is out of the valid range 0..%
Here is the test harness. Both this procedure and the function that generates the to-be-tested iterand array are hard-coding for a dimensionality of 4
.
-- Exercise each of the meaningful calls to array_slices().
--
-- You cannot declare local variables as "anyelement" or "anyarray".
-- (The attempt causes a compilation error). It's obvious why.
-- It's the caller's responsibility to determine the
-- real type by using appropriate actual arguments.
-- "val" (scalar) and "slice" (array) are needed as FOREACH loop runners.
-- "in out" is used to avoid the nominal performance penalty
-- of extra copying brought by plain "out".
-- The caller has no interest in whatever values they have
-- on return from this procedure.
--
-- NOTE: while you _can_ declare a local variable as
-- "some_formal%type", you _cannot_ use that mechanism to declare
-- a scalar with the data type that defines an array when
-- all you have to anchor "%type" to is the array.
--
create procedure test_array_slices(
-- The "real" formal.
arr in anyarray,
-- used as "local varables
val in out anyelement,
slice in out anyarray)
language plpgsql
as $body$
declare
arr_ndims constant int := array_ndims(arr);
begin
assert
(arr_ndims = 4),
'assert failed: test_array_slices() requires a 4-d array';
raise info '%', array_dims(arr);
declare
len_1 constant int := array_length(arr, 1);
len_2 constant int := array_length(arr, 2);
len_3 constant int := array_length(arr, 3);
len_4 constant int := array_length(arr, 4);
expected_slice_cardinalities constant int[] not null :=
array[
len_4,
len_4*len_3,
len_4*len_3,
len_4*len_3*len_2,
len_4*len_3*len_2*len_1];
slice_cardinality int not null := 0;
-- val anyelement not null := '?';
-- slice anyarray not null := '{?}';
begin
raise info ''; raise info 'slice_operand: %', 0;
for val in (select array_slices(arr)) loop
raise info '%', val::text;
end loop;
for slice_operand in 1..arr_ndims loop
raise info ''; raise info 'slice_operand: %', slice_operand;
for slice in (select array_slices(arr, slice_operand)) loop
slice_cardinality := cardinality(slice);
assert
(array_ndims(slice) = slice_operand) ,
(slice_cardinality = expected_slice_cardinalities[slice_operand]) ,
'assert failed.';
raise info '%', slice::text;
if slice_operand = arr_ndims then
assert (slice = arr), 'assert (slice = arr) failed';
end if;
end loop;
end loop;
end;
end;
$body$;
Here is a function to generate a four-dimensional array. Notice that the actual argument for the "lengths" formal parameter must be a one-dimensional int[]
array with four values. These specify the lengths along each of the output array's dimensions.
create function four_d_array(lengths in int[])
returns text[]
language plpgsql
as $body$
declare
lengths_ndims constant int := array_ndims(lengths);
lengths_cardinality constant int := cardinality(lengths);
begin
assert
(lengths_ndims = 1) and
(lengths_cardinality = 4) ,
'assert failed: four_d_array() creates only a 4-d array.';
declare
-- Take the default for array_fill's optional 2nd formal:
-- all lower bounds are 1.
arr text[] not null := array_fill('00'::text, lengths);
begin
-- For readability of the results, define the created array's values so that,
-- when scanned in row-major order, they are seen to be a dense series
-- that increases in even steps, 001, 002, 003, and so on.
declare
n int not null := 0;
begin
for i1 in 1..lengths[1] loop
for i2 in 1..lengths[2] loop
for i3 in 1..lengths[3] loop
for i4 in 1..lengths[4] loop
n := n + 1;
arr[i1][i2][i3][i4] := ltrim(to_char(n, '009'));
end loop;
end loop;
end loop;
end loop;
end;
assert
(array_ndims(arr) = lengths_cardinality),
'assert failed.';
-- Sanity check. Include to demonstrate the useful
-- terseness of the FOREACH loop.
declare
product int not null := 1;
len int not null := 0;
begin
foreach len in array lengths loop
product := product*len;
end loop;
assert
(cardinality(arr) = product) ,
'assert failed.';
end;
return arr;
end;
end;
$body$;
And here is one example test invocation:
do $body$
declare
arr constant text[] not null := four_d_array('{2, 2, 2, 2}'::int[]);
dummy_var text := '?';
dummy_arr text[] := '{/}';
begin
call test_array_slices(arr, dummy_var, dummy_arr);
end;
$body$;
It produces this result:
[1:2][1:2][1:2][1:2]
slice_operand: 0
001
002
003
004
005
006
007
008
009
010
011
012
013
014
015
016
slice_operand: 1
{001,002}
{003,004}
{005,006}
{007,008}
{009,010}
{011,012}
{013,014}
{015,016}
slice_operand: 2
{{001,002},{003,004}}
{{005,006},{007,008}}
{{009,010},{011,012}}
{{013,014},{015,016}}
slice_operand: 3
{{{001,002},{003,004}},{{005,006},{007,008}}}
{{{009,010},{011,012}},{{013,014},{015,016}}}
slice_operand: 4
{{{{001,002},{003,004}},{{005,006},{007,008}}},{{{009,010},{011,012}},{{013,014},{015,016}}}}