A few days ago I stumbled on a PostgreSQL query I didn’t really understand. I asked a question on Stack Overflow and learned something I would have known if only I had read the manual…
Assume that we have a table with a jsonb
column like this:
# create temporary table test_table( name text, data jsonb ); # insert into test_table values ('name1', '{"a": 1, "b": 2}'), ('name2', '{"c": 3, "d": 4, "e": 5}'); # table test_table; name | data -------+-------------------------- name1 | {"a": 1, "b": 2} name2 | {"c": 3, "d": 4, "e": 5} (2 rows)
The query in question lookes like this:
select * from test_table cross join jsonb_each(test_table.data);
Can you guess its result? If not, read on.
The jsonb_each
by itself converts a jsonb
value to a set of rows with columns key
and value
:
# select jsonb_each('{"a": 1, "b": 2}'::jsonb); jsonb_each ------------ (a,1) (b,2) (2 rows)
Let us make a few notes here. First of all, the column names are not visible above because of the way composite types are printed in PostgreSQL. (You can say e.g. select key, value from
jsonb_each('{"a": 1, "b": 2}'::jsonb);
– or even select * from...
– to see them.) Note also that key
is of type text
, but value
is of type jsonb
– there is also jsonb_each_text returning both columns of type text. Finally explicit casting to jsonb
is not necessary – I added it only for clarity.
Now the way joins work is that both sides of the join
are calculated independently, and so we cannot e.g. refer to a row from the left-hand side of a join in the expression on the right-hand side. (An exception to this – or rather, a way to break this rule – is the lateral
join.) What I missed, however, is another exception: for functions the lateral
key word is optional; the function’s arguments can contain references to columns provided by preceding FROM items in any case. (The details can be found here – look for LATERAL
.)
This means that what happens here is that the jsonb_each
is computed independently for each row of the test_table
and only then cross joined with it. In other words, we obtain this:
# select * from test_table cross join jsonb_each(test_table.data); name | data | key | value -------+--------------------------+-----+------- name1 | {"a": 1, "b": 2} | a | 1 name1 | {"a": 1, "b": 2} | b | 2 name2 | {"c": 3, "d": 4, "e": 5} | c | 3 name2 | {"c": 3, "d": 4, "e": 5} | d | 4 name2 | {"c": 3, "d": 4, "e": 5} | e | 5 (5 rows)
This is really handy, but not exactly what I expected. As usual, PostgreSQL turns out to be really, really clever.