2024-08-26 Deleting table rows based on data from other tables

Today I have a short tip about PostgreSQL. Some time ago I needed to delete from a table all rows satisfying a certain condition involving another table. It turns out that it is not possible to use join in a delete statement – but there is a way around it (in fact, even two).

Let us assume that we have a table called person with columns person_id and city_id. We also have a table called city, with columns city_id and population. For the sake of a contrived example, we want to delete all people from cities with less than one million inhabitants from our database.

create table city (
  city_id integer primary key,
  name text,
  population integer
);

create table person (
  person_id integer primary key,
  email text,
  city_id integer references city
);

insert into
  city (city_id, name, population)
values
  (1, 'Warsaw', 1800000),
  (2, 'Cracow', 800000);

insert into
  person (person_id, email, city_id)
values
  (1, 'syrenka@warsaw.pl', 1),
  (2, 'krak@cracow.pl', 2),
  (3, 'smok.wawelski@jama.pl', 2);

The standard-compliant way to do this is simple:

delete from person
where
  city_id in (
    select
      city_id
    from
      city
    where
      population < 1000000
  );

It turns out that PostgreSQL has a nice extension which lets us do the same thing a bit more concisely:

delete from person using city
where
  person.city_id = city.city_id
  and population < 1000000;

That’s it for today, happy deleting!

CategoryEnglish, CategoryBlog, CategoryPostgreSQL