2024-08-05 Case-insensitive where in clause

It’s been a few weeks since the last PostgreSQL post here, so let me share a neat trick I learned from one of my teammates recently.

Sometimes you need to compare strings case-insensitively. Assume that you want to find a user with an email bob@builder.com in your database, but you don’t know if he didn’t type his email in upper case (BOB@BUILDER.COM), or maybe capitalized (Bob@Builder.com), etc. Well, easy:

select user_id from user where email ilike 'bob@builder.com'

A minor disadvantage of this method is that if you want to match against something containing a percent sign, an underscore or a backslash, you need to remember to escape them, but it’s not a big deal.

Now assume that you want to find the ids of all people with emails bob@builder.com, pat@postman.com and sam@fireman.com, still case-insensitively. There is no where in ilike operator in PostgreSQL, so what do we do?

Well, if you know about the citext module and type, you may use it instead of text for your email column. But what if that column is of normal text type? Well, nothing is lost!

select
  user_id
from
  user
where
  email::citext in (
    'bob@builder.com',
    'pat@postman.com',
    'sam@fireman.com'
  )

Of course, remember to create extension citext before you do anything with it. By the way, note that in PostgreSQL an extension is “loaded” into the current database, so you need to do this separately for each database you need it in. It is, though, possible to have this extension loaded in each newly created database by loading citext into the template1 database.

I am not 100% sure if this method is always correct – the manual explicitly says that nondeterministic collations are potentially more correct than citext, and anything related to case-sensivity becomes hard once you leave the comfort of ASCII. I wouldn’t be surprised if citext were more efficient and suitable enough for email addresses, though.

That’s it for today – happy querying!

CategoryEnglish, CategoryBlog, CategoryPostgreSQL