It’s been a few weeks since the last PostgreSQL post here, so let me share a neat trick I learned from one of my teammates recently.
Sometimes you need to compare strings case-insensitively. Assume that you want to find a user with an email bob@builder.com
in your database, but you don’t know if he didn’t type his email in upper case (BOB@BUILDER.COM
), or maybe capitalized (Bob@Builder.com
), etc. Well, easy:
select user_id from user where email ilike 'bob@builder.com'
A minor disadvantage of this method is that if you want to match against something containing a percent sign, an underscore or a backslash, you need to remember to escape them, but it’s not a big deal.
Now assume that you want to find the ids of all people with emails bob@builder.com
, pat@postman.com
and sam@fireman.com
, still case-insensitively. There is no where in ilike
operator in PostgreSQL, so what do we do?
Well, if you know about the citext
module and type, you may use it instead of text
for your email
column. But what if that column is of normal text
type? Well, nothing is lost!
select user_id from user where email::citext in ( 'bob@builder.com', 'pat@postman.com', 'sam@fireman.com' )
Of course, remember to create extension citext
before you do anything with it. By the way, note that in PostgreSQL an extension is “loaded” into the current database, so you need to do this separately for each database you need it in. It is, though, possible to have this extension loaded in each newly created database by loading citext into the template1 database.
I am not 100% sure if this method is always correct – the manual explicitly says that nondeterministic collations are potentially more correct than citext
, and anything related to case-sensivity becomes hard once you leave the comfort of ASCII. I wouldn’t be surprised if citext
were more efficient and suitable enough for email addresses, though.
That’s it for today – happy querying!