I work with PostgreSQL on a daily basis, and I often share queries with my teammates. I like to show them nicely formatted queries instead of the mess I sometimes write. A bit surprisingly, I had troubles finding a nice SQL pretty-printer/formatter.
For a few years I used sqlformat, a tool written in Python. I was quite happy with it, until I noticed that it didn’t work with queries referring to jsonb
columns properly. For example, this query
select jsonb_column->'property', jsonb_column->'another'->'subproperty', jsonb_column->'yet'->'another'
becomes this mess.
select jsonb_column->'property', jsonb_column->'another'->'subproperty', jsonb_column->'yet'->'another'
After I had to deal with a query which selected well over twenty or thirty properties from jsonb
columns, I decided to stop and look for another solution.
This time, I found SQL Formatter. This one is written in JavaScript, and can be used both from JS code and command line. Since I’m primarily working with Node.js, this looks very good – there is a potential for logging formatted queries instead of one-line monstrosities, for example. My first use, however, is this quick-and-dirty Emacs command.
(defun sql-format-region () "Format SQL in region using `sql-formatter'." (interactive) (shell-command-on-region (region-beginning) (region-end) "sql-formatter -c ~/.sql-formatter.json" nil t))
enabling me to quickly format an SQL query in the active region. The ~/.sql-formatter.json
file looks like this:
{ "language": "postgresql", "keywordCase": "lower", "dataTypeCase": "lower", "functionCase": "lower" }
As you can see, I’m not fond of screaming at my database.
(My simple command probably should use the r
interactive
code, but I never had the need to call it non-interactively, so there’s no benefit for me to refactoring it into something nicer.) The above query looks after formatting looks like this:
select jsonb_column -> 'property', jsonb_column -> 'another' -> 'subproperty', jsonb_column -> 'yet' -> 'another'
Maybe it’s not my favorite style, but it’s legible and consistent. Here is a more complicated query (which I shamelessly asked an LLM to write for me):
WITH cte AS ( SELECT id, json_column->>'key1' AS value1, json_column->>'key2' AS value2 FROM table1 ) SELECT cte.id, table2.column1, SUM(table2.column2) AS total FROM cte JOIN table2 ON cte.value1 = table2.value1 JOIN table3 ON cte.value2 = table3.value2 WHERE (cte.value1 = 'some_value' AND table2.column3 = 'another_value') OR (cte.value2 = 'another_value' AND table3.column4 = 'some_value') GROUP BY cte.id, table2.column1 HAVING SUM(table2.column2) > 100 ORDER BY cte.id ASC;
and after reformatting, it looks like this:
with cte as ( select id, json_column ->> 'key1' as value1, json_column ->> 'key2' as value2 from table1 ) select cte.id, table2.column1, sum(table2.column2) as total from cte join table2 on cte.value1 = table2.value1 join table3 on cte.value2 = table3.value2 where ( cte.value1 = 'some_value' and table2.column3 = 'another_value' ) or ( cte.value2 = 'another_value' and table3.column4 = 'some_value' ) group by cte.id, table2.column1 having sum(table2.column2) > 100 order by cte.id asc;
After I found SQL Formatter, I stumbled upon Steve Purcell’s sqlformat.el package, which is way more advanced – for example, it supports several SQL formatters. It’s entirely possible that one day I’ll ditch my command in favor of sqlformat.el
, but for now it is sufficient for me. (One reason I might want to use it is that it can use even more formatters than the two I’ve mentioned, and some of them look pretty advanced and configurable.)
CategoryEnglish, CategoryBlog, CategoryPostgreSQL, CategoryEmacs