2019-09-08 PostgreSQL and computing the number of days between two dates

A few days ago I had a very interesting bug. (Frankly, it’s nothing to be proud of, but it’s interesting and also kind of funny, so I thought I’d share it.) I had a long PostgreSQL query which – as one of many conditions – checked if some date was more than two weeks in the future.

I wrote that beast (the query itself was more than one kilobyte – I’m sure there exist larger ones, but it was one of the larger I’ve written) a few weeks ago. I knew I could easily botch it, so I carefully wrote more than a dozen tests, performing various (even unlikely) scenarios. All tests passed (well, eventually). I left the script the query was part of and moved on.

A few days ago I needed to work on this script and related stuff. I changed a bunch of things and ran the test suite again. I was sure that some of the tests would fail because of the changes, but I was fairly confident that the tests of that particular query will pass.

I was wrong.

I started to investigate this, and after ten or twenty minutes I burst into laughter. I realized that the only reason the test passed in the first place was that I ran it in August, and that this test could not possibly pass in September. And the reason was that the two dates in the test were exactly 30 days apart.

Can you see where I’m going?

Here is the thing. My main mistake was that I only skimmed through PostgreSQL manual on date- and time-related functions. What I wanted to achieve was to “calculate the number of days between two dates”, as an integer. Here is the correct way to do this:

select '2019-09-24'::date - '2019-08-25'::date;

(of course, I didn’t use constants in my code). What I did, though, was something unnecessarily complicated:

select extract(days from age('2019-09-24'::date, '2019-08-25'::date));

It seems to work fine. (Most probably, I had a wrong mental model of how the interval type works in PostgreSQL – I somehow thought that extract days from ... would just give me the length of the interval in days, which is obviously wrong.)

What happens if the test was run in September? (The earlier of the dates was always “today”.) Of course,

select age('2019-10-04'::date, '2019-09-04'::date);

gives an interval of 1 mon, so

select extract(days from age('2019-10-04'::date, '2019-09-04'::date));

yields 0!

What is the takeway from that? First of all, read the friendly manual. Especially that PostgreSQL’s manual is really good. A more subtle point is that using “today” in tests might not be the best idea. (To be honest, I think that time-related stuff is a terrible mess, both to code and to test. In the project I’m working on, there are things that are supposed to happen many days after the actions that trigger them. Testing such scenarios is tricky at best.)

Anyway, I hope you had some fun reading this, and maybe even you learned something.

CategoryEnglish, CategoryBlog, CategoryPostgreSQL