2019-10-28 PostgreSQL, jsonb each and cross joins

A few days ago I stumbled on a PostgreSQL query I didn’t really understand. I asked a question on Stack Overflow and learned something I would have known if only I had read the manual…

Assume that we have a table with a jsonb column like this:

# create temporary table test_table(
  name text,
  data jsonb
);
# insert into test_table values ('name1', '{"a": 1, "b": 2}'), ('name2', '{"c": 3, "d": 4, "e": 5}');
# table test_table;
 name  |           data
-------+--------------------------
 name1 | {"a": 1, "b": 2}
 name2 | {"c": 3, "d": 4, "e": 5}
(2 rows)

The query in question lookes like this:

select * from test_table cross join jsonb_each(test_table.data);

Can you guess its result? If not, read on.

The jsonb_each by itself converts a jsonb value to a set of rows with columns key and value:

# select jsonb_each('{"a": 1, "b": 2}'::jsonb);
 jsonb_each 
------------                                                                                                                                                                                  
 (a,1)                                                                                                                                                                                        
 (b,2)                                                                                                                                                                                        
(2 rows)                                                                                                                                                                                      

Let us make a few notes here. First of all, the column names are not visible above because of the way composite types are printed in PostgreSQL. (You can say e.g. select key, value from jsonb_each('{"a": 1, "b": 2}'::jsonb); – or even select * from... – to see them.) Note also that key is of type text, but value is of type jsonb – there is also jsonb_each_text returning both columns of type text. Finally explicit casting to jsonb is not necessary – I added it only for clarity.

Now the way joins work is that both sides of the join are calculated independently, and so we cannot e.g. refer to a row from the left-hand side of a join in the expression on the right-hand side. (An exception to this – or rather, a way to break this rule – is the lateral join.) What I missed, however, is another exception: for functions the lateral key word is optional; the function’s arguments can contain references to columns provided by preceding FROM items in any case. (The details can be found here – look for LATERAL.)

This means that what happens here is that the jsonb_each is computed independently for each row of the test_table and only then cross joined with it. In other words, we obtain this:

# select * from test_table cross join jsonb_each(test_table.data);
 name  |           data           | key | value 
-------+--------------------------+-----+-------
 name1 | {"a": 1, "b": 2}         | a   | 1
 name1 | {"a": 1, "b": 2}         | b   | 2
 name2 | {"c": 3, "d": 4, "e": 5} | c   | 3
 name2 | {"c": 3, "d": 4, "e": 5} | d   | 4
 name2 | {"c": 3, "d": 4, "e": 5} | e   | 5
(5 rows)

This is really handy, but not exactly what I expected. As usual, PostgreSQL turns out to be really, really clever.

CategoryEnglish, CategoryBlog, CategoryPostgreSQL