2020-02-15 PostgreSQL and null values in jsonb

Some time ago I was writing some PostreSQL queries involving columns of type jsonb. For some reason I could not understand my query didn’t work as expected. I decided to dig a little bit deeper, and here is what I learned.

As we know, SQL (and PostgreSQL as well, of course) has the null value, which is kind of tricky (for instance, it is not equal to itself). That makes sense, because SQL’s null means “of unknown value”.

As we also know, JSON also has the null value (or type). In JavaScript, it is equal (and even identical) to itself. This also makes sense, because JSON’s null means “empty/no value”.

When we use PostgreSQL to query JSON values which may contain null values, however, hilarity ensues. First of all, if you want to actually type a JSON null in Postgres, you can say e.g. 'null'::jsonb (let’s agree on using jsonb and avoiding json, and calling it just JSON). Of course, this value is different from an SQL null. If you try to say null::jsonb instead, you get an SQL null, since (apparently) casting null to anything results in null (I could not find evidence for this in the PostgreSQL manual, but a few experiments seem to confirm this behavior, and it makes perfect sense).

Now there are at least two ways one can achieve some kind of null when extracting parts of JSON objects: either an explicit null is present in JSON, or the requested property is missing altogether. In the former case, we get a JSON null, and in the latter an SQL null, which is slightly weird given the semantic above, but I can accept it in light of the fact that SQL does not have an “empty” type/value, and there are no natural candidates for the job of representing a value not present in the JSON object (well, technically, PostgreSQL could throw an error in such a case, but it doesn’t – go figure).

Then, there are two ways one can extract something from JSON in PostgreSQL – as JSON (e.g. ->) or as text (e.g. ->>).

Combining the above we get the following results.

select '{"a": 1, "b": null}'::jsonb->'b';

yields a JSON null (so something that does not satisfy the SQL “predicate” is null). However,

select '{"a": 1, "b": null}'::jsonb->'c';

gives a proper SQL null.

Using the text extraction operators, however, we get an SQL null both times:

select '{"a": 1, "b": null}'::jsonb->>'b' is null;
select '{"a": 1, "b": null}'::jsonb->>'c' is null;

The latter is not unexpected at all – we get an SQL null, which (after casting to text) remains null. The former, however, came as a surprise to me. I thought it should be the SQL string 'null', not the SQL null, as casting an explicit JSON null to text gives that string.

The programmers behind PostgreSQL didn’t agree, though. I could not understand why, and I asked a question on StackOverflow about it. After several hours I got a very detailed and interesting response. To cut a long story short, here is the rationale behind this seemingly strange behavior: if PostgreSQL returned 'null' (i.e., a string) in response to select '{"a": 1, "b": null}'::jsonb->>'b';, then the result would be indistinguishable from the result of select '{"a": 1, "b": "null"}'::jsonb->>'b';. One of the design principles of PostgreSQL, however, is that casting anything to text should give something parsable back to the original value (whenever possible).

Well, nice to learn something new.

And while we are at PostgreSQL null and related topics, here is a nice tip I learned some time ago: if you put the line \pset null '∅' in your .psqlrc file, psql will render null values as Unicode “empty set” symbol. It is extremely useful if you need to distinguish between null and an empty string in psql​’s output (unless you have Unicode empty set symbols in your data, of course).

And finally another tip, this time related to the jsonb type, is the jsonb_strip_nulls function. Given a JSONB object, it removes all properties whose values are explicit null​s (and it does it recursively, so deeper levels of hierarchy are affected, too). Note that it keeps null​s that are elements of arrays, though.

CategoryEnglish, CategoryBlog, CategoryPostgreSQL