Blog

For the English part of the blog, see Content AND Presentation.

2019-09-16 sponge and other moreutils

GNU coreutils are well known and loved, especially when combined with pipes (of course!). What may be slightly less known is the collection of command-line tools called moreutils. As their author says, “moreutils is a growing collection of the unix tools that nobody thought to write long ago when unix was young”. The collection contains a surprisingly large number of very clever little tools. One of them is sponge, which solves the old problem of modifying a file in place with a pipeline. Assume that we have a file called file.txt and we want to keep only the lines containing the letter a in it. We can’t just say

grep a file.txt > file.txt

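since grep won’t let us (GNU grep notices that the input file is also the output); and even if it did, it wouldn’t work, because the shell truncates file.txt while setting up the redirection, before grep has had a chance to read it. (You can check this by saying e.g. grep a file.txt | cat > file.txt, which typically leaves file.txt empty.)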

Now, sponge to the rescue. It reads (“soaks up”, hence the name) the whole standard input and only then starts writing. We can now say grep a file.txt | sponge file.txt and everything works as expected.
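To make the “soak first, write later” idea concrete, here is a minimal Python sketch. It is not how sponge itself is implemented; the names sponge and grep_lines are made up for this illustration, and the real tool handles details (such as very large inputs) that this sketch ignores.

```python
def grep_lines(needle, path):
    """Lazily yield the lines of `path` that contain `needle` (a toy grep)."""
    with open(path) as src:
        for line in src:
            if needle in line:
                yield line


def sponge(lines, path):
    """Consume the whole input first, and only then overwrite `path`."""
    soaked = list(lines)           # read everything before touching the output
    with open(path, "w") as out:   # the target is truncated only at this point
        out.writelines(soaked)
```

With these two helpers, sponge(grep_lines("a", "file.txt"), "file.txt") plays the role of grep a file.txt | sponge file.txt: the file is read to the end before it is opened for writing.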

All that is nice and good, and I’ve known it for some time. More recently, however, I looked into more utils from moreutils, ;-) and it turns out that there are quite a few nice ones.

For instance, there is ts, which works much like cat, but prepends a timestamp to every line it processes. It is a nice tool for producing simple log files. Of course, the actual timestamp format is configurable. You can also tell it to produce incremental timestamps, relative either to the previous one or to the moment the program was started. Head to the manpage for more options and details.
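As a rough model of what ts does, the following Python sketch puts a timestamp in front of each line. The function name, the default format string and the injectable now parameter are assumptions made for this example, not part of moreutils.

```python
from datetime import datetime


def ts(lines, fmt="%b %d %H:%M:%S", now=datetime.now):
    """Yield each input line prefixed with the current time.

    `fmt` is a strftime format; `now` can be replaced with a fixed clock
    to get reproducible output (handy in tests).
    """
    for line in lines:
        yield f"{now().strftime(fmt)} {line}"
```

Passing sys.stdin as lines and writing the results to sys.stdout gives a crude timestamping filter; the relative-timestamp modes of the real tool are left out here.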

Yet another one is chronic. It is intended as a wrapper for commands run by cron. The idea is that if a command exits with status 0, chronic hides its stdout and stderr, but outputs them normally otherwise. This way you can e.g. run a command from cron in verbose mode and still get no output when it succeeds.
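The behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, with a hypothetical chronic function that takes the output streams as parameters; the real chronic is a separate program with its own options.

```python
import subprocess
import sys


def chronic(cmd, out=sys.stdout, err=sys.stderr):
    """Run `cmd`; show its captured output only if it fails.

    Returns the exit status of the command.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Failure: replay everything the command printed.
        out.write(result.stdout)
        err.write(result.stderr)
    return result.returncode
```

Note that, unlike a plain pipeline, the output has to be buffered until the command finishes, because only then is the exit status known.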

The last one I’m going to mention is ifne. It is very simple – it runs the given command only if the standard input is not empty, and pipes that input into the command in that case. (It also has an option to invert the default behavior.) It can be very useful when you have to deal with misbehaving command-line utilities which report something to stdout or stderr instead of exiting with a non-zero status.
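Again as a sketch only, the default mode of ifne could be modeled in Python like this. The function name and its signature are assumptions for this example, and the inverted mode is left out.

```python
import subprocess


def ifne(data, cmd):
    """Run `cmd` with `data` on its stdin, but only if `data` is non-empty.

    Returns the CompletedProcess, or None when the command was skipped.
    """
    if not data:
        return None  # empty input: do not run the command at all
    return subprocess.run(cmd, input=data, capture_output=True, text=True)
```

A typical use would be to feed the output of a chatty tool into a mail or notification command, so that a message is sent only when there was something to report.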

Go to the moreutils website to see more. I am sure you will find something useful.

CategoryEnglish, CategoryBlog

2019-09-08 PostgreSQL and computing the number of days between two dates

A few days ago I had a very interesting bug. (Frankly, it’s nothing to be proud of, but it’s interesting and also kind of funny, so I thought I’d share it.) I had a long PostgreSQL query which – as one of many conditions – checked if some date was more than two weeks in the future.

I wrote that beast (the query itself was more than one kilobyte – I’m sure larger ones exist, but it was one of the larger ones I’ve written) a few weeks ago. I knew I could easily botch it, so I carefully wrote more than a dozen tests covering various (even unlikely) scenarios. All tests passed (well, eventually). I left the script the query was part of and moved on.

A few days ago I needed to work on this script and related stuff. I changed a bunch of things and ran the test suite again. I was sure that some of the tests would fail because of the changes, but I was fairly confident that the tests of that particular query would pass.

I was wrong.

I started to investigate this, and after ten or twenty minutes I burst into laughter. I realized that the only reason the test passed in the first place was that I ran it in August, and that this test could not possibly pass in September. And the reason was that the two dates in the test were exactly 30 days apart.

Can you see where I’m going?

Here is the thing. My main mistake was that I only skimmed through the PostgreSQL manual on date- and time-related functions. What I wanted to achieve was to “calculate the number of days between two dates”, as an integer. Here is the correct way to do this:

select '2019-09-24'::date - '2019-08-25'::date;

(of course, I didn’t use constants in my code). What I did, though, was something unnecessarily complicated:

select extract(days from age('2019-09-24'::date, '2019-08-25'::date));

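It seems to work fine: here age returns an interval of 30 days, so the result is 30, the same as for the plain subtraction. (Most probably, I had a wrong mental model of how the interval type works in PostgreSQL – I somehow thought that extract days from ... would give me the total length of the interval in days, whereas it only returns the days field of the interval.)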

What happens if the test was run in September? (The earlier of the dates was always “today”.) Of course,

select age('2019-10-04'::date, '2019-09-04'::date);

gives an interval of 1 mon, so

select extract(days from age('2019-10-04'::date, '2019-09-04'::date));

yields 0!
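The difference between the two approaches can be reproduced outside the database. The Python sketch below contrasts plain date subtraction with a simplified model of how age splits a difference into months and days. The age_like function is an assumption made for illustration (it ignores years and only handles a later date minus an earlier one); it is not PostgreSQL’s actual implementation.

```python
import calendar
from datetime import date


def days_between(later, earlier):
    """Total number of days, like `later - earlier` on PostgreSQL dates."""
    return (later - earlier).days


def age_like(later, earlier):
    """Return (months, days), a simplified model of age(later, earlier).

    Assumes later >= earlier. When the day difference is negative, one
    month is borrowed, using the length of the earlier date's month.
    """
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    days = later.day - earlier.day
    if days < 0:
        months -= 1
        days += calendar.monthrange(earlier.year, earlier.month)[1]
    return months, days


# August: the "days" field happens to equal the total number of days.
print(days_between(date(2019, 9, 24), date(2019, 8, 25)))  # 30
print(age_like(date(2019, 9, 24), date(2019, 8, 25)))      # (0, 30)

# September: same total, but the "days" field is now 0.
print(days_between(date(2019, 10, 4), date(2019, 9, 4)))   # 30
print(age_like(date(2019, 10, 4), date(2019, 9, 4)))       # (1, 0)
```

Both pairs of dates are 30 days apart, but only in the first case does the days component carry that information; in the second it all ends up in the months component.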

What is the takeaway from that? First of all, read the friendly manual, especially since PostgreSQL’s manual is really good. A more subtle point is that using “today” in tests might not be the best idea. (To be honest, I think that time-related stuff is a terrible mess, both to code and to test. In the project I’m working on, there are things that are supposed to happen many days after the actions that trigger them. Testing such scenarios is tricky at best.)

Anyway, I hope you had some fun reading this, and maybe even learned something.

CategoryEnglish, CategoryBlog, CategoryPostgreSQL

2019-08-31 A simple tip with overlays and diffs

A few days ago I had an interesting problem. I had to resolve a particularly nasty Git merge. It involved several files with lines hundreds of characters long and a fair number of very small changes. Seeing those changes (in smerge-mode), even after refining the diffs, was tricky – there were many very small patches (sometimes two, sometimes four characters) of changed text and I was quite afraid that I would miss some of them. I searched for a command to “go to the next patch of changes”, but to no avail. Then I decided to write my own.

I started with going to one of these patches and pressing C-u C-x =. This way I learned how Emacs highlights them – using overlays (not text properties). (By the way, this explains the performance hit we experience when viewing very large diffs with these changes highlighted – overlays do not scale. In such cases, I often turn the diff hunk refinement off – D t in Magit.)

I vaguely remembered that Emacs can search for the next overlay or something like that. It turned out that I was right. There is a function next-overlay-change, which starts at a given point and finds the nearest boundary of an overlay. (By the way, there also exist similar functions for text properties and a version which looks for both. Consult the manual for the details.)

The only thing that was left was to wrap it up in some kind of UI. Here’s what I came up with – it is extremely bare-bones (in particular, it also stops at the end of the overlay, which was not useful for me – but I aimed for simplicity and speed of hacking up my solution, not for perfect elegance) but only took 1-2 minutes to write.

(defun goto-next-overlay-change ()
  "Move point to the position returned by `next-overlay-change'."
  (interactive)
  (goto-char (next-overlay-change (point))))

As usual, this shows where Emacs really shines – it’s extremely well self-documented (including C-u C-x =) and flexible (coding a solution required typing a few lines of Lisp and calling local-set-key with some unused key).

CategoryEnglish, CategoryBlog, CategoryEmacs
