Soft deletes with Ecto and PostgreSQL

One of our clients at Dashbit has recently asked us the different ways to implement soft-deletes in their Phoenix + Ecto application.

The idea of a soft-delete is that, when you choose to “delete” a given resource, let’s call it “orders”, instead of effectively deleting it from the database, you will mark the order as deleted, and then you simply do not show such orders to the user.

There are several ways to implement such a feature. If you need it only for one resource or another, sometimes the best option is to handle it at the application level. The downsides of doing it at the application level is that you are responsible to make sure they are never deleted, which can be error prone, especially once you consider foreign keys cascade and deletion rules. Another common problem is that, on every query you perform, you must remember to filter it only to non-“soft-deleted” resources.

Alternatively, if soft deletion plays an important role in your application and applies to several resources, then enforcing it at the database level can be a robust option, as you guarantee no one can accidentally remove this data unintentionally.

In this post, we will discuss one approach to implement soft-deletion in the database, with rules and views. This article is an adaption of “Soft deletion with PostgreSQL: but with logic on the database!”, but focused on Ecto.

Custom deletions with PostgreSQL rules

In order to add soft-deletion to a resource, the first step is to add a deleted_at column to its table. In this case, we are choosing a timestamp column, but a boolean deleted column would suffice too:

add :deleted_at, :utc_datetime

Still in the migration, let’s add PostgreSQL Rule, that converts deletes into updates:

execute """
        CREATE OR REPLACE RULE soft_deletion AS ON DELETE TO orders
        DO INSTEAD UPDATE orders SET deleted_at = NOW() WHERE id = OLD.id AND deleted_at IS NULL RETURNING OLD.*;
        """,
        """
        DROP RULE IF EXISTS soft_deletion ON orders;
        """

When migrating, we create a soft_deletion rule on the orders table that effectively replaces the deletion by an update statement that sets the deleted_at column. Notice the statement also uses RETURNING OLD.*, this is necessary if you have any :read_after_writes field in your schema.

Custom selects with PostgreSQL views

By using a rule, we guarantee that no one can accidentally delete this resource, even if they are connected to the database terminal (we will see some explicit ways to bypass it though).

However, this still leaves one problem: whenever you ask Ecto to read all orders, such as Repo.all(Order), it will include deleted orders by default. Therefore you, the application developer, must remember to filter them.

Luckily, we can solve this problem by adding two lines to our migration:

execute "CREATE OR REPLACE VIEW visible_orders AS SELECT * FROM orders WHERE deleted_at IS NULL",
        "DROP VIEW IF EXISTS visible_orders"

The example above creates a new view, called “visible_orders”, which only contains the visible orders. Now you only need to change your Ecto schema to use this new view:

defmodule MyApp.Order do
  use Ecto.Schema

  schema "visible_orders" do
    ...
    field :deleted_at, :utc_datetime
    timestamps()
  end
end

This brings two benefits. First, Repo.all(MyApp.Order) now only returns visible orders. Furthermore, because Ecto treats tables (sources) and schemas as distinct entities, if you want to query all orders, including deleted ones, you can simply write: Repo.all({"orders", MyApp.Order}). Or in a query:

from o in {"orders", MyApp.Order},
  join: u in assoc(o, :users),
  where: o.total >= ^min_value,
  ...

Any query you execute in this style still has all of the type casting and SQL injection protections you are familiar with, but now applied to different views/tables. Furthermore, because our view is “updatable” (it only has simple WHERE clauses), we can insert, update, and delete entries directly from the view, without a need to specify the source table for those operations.

The caveats

There are two caveats in this approach. The first one is that, if you attempt to delete an order, Ecto won’t be happy with it:

Repo.delete(order)

This raises Ecto.StaleEntryError by default. This happens because, since the deletion does not happen, PostgreSQL returns the information that zero rows were affected. In turn, Ecto parses this result set to mean that the record no longer exists (and therefore it is stale). This is clearly spelled out in the error message and, from Ecto v3.12, you can opt-in and say that stale entries are expected:

Repo.delete(order, allow_stale: true)

Similarly, if you attempt to use Repo.delete_all(from o in Order), Ecto will always report that zero rows were affected. In case you do need to know the number of rows affected, you can use Repo.update_all and manually set its deleted_at column instead.

The other caveats are related to indexes, constraints, and foreign keys. If you are setting an index for performance, you may want to consider if the deleted_at should be included or not. Similarly, if you are adding constraints, such as unique indexes or check constraints, you may not want them to apply to deleted at. Luckily, PostgreSQL (as well as Ecto) supports partial indexes, and you can apply these constraints only when deleted_at IS NULL.

Cascading foreign keys also require attention when deleting resources and they may require you to adapt the database rule accordingly. This is covered in the original article, so check it out.

Making deletions possible

Before we wrap this up, there is one last question to answer: what happens when you want/need to effectively delete the data?

Luckily, PostgreSQL has a mechanism to enable and disable rules, which you can also use from within a transaction:

Repo.transaction(fn ->
  Repo.query!("ALTER TABLE orders DISABLE RULE soft_deletion")
  Repo.delete!(order)
  Repo.query!("ALTER TABLE orders ENABLE RULE soft_deletion")
end)

In fact, you could even encapsulate this by adding a function to your Ecto.Repo module:

defmodule MyApp.Repo do
  use Ecto.Repo, otp_app: :myapp

  def disable_soft_deletion(tables, fun) do
    transaction(fn ->
      for table <- tables do
        query!("ALTER TABLE #{quote_table(table)} DISABLE RULE soft_deletion")
      end

      try do
        fun.()
      after
        for table <- tables do
          query!("ALTER TABLE #{quote_table(table)} ENABLE RULE soft_deletion")
        end
      end
    end)
  end

  defp quote_table(name) when is_binary(name) do
    if String.contains?(name, "\"") do
      raise "invalid table name"
    end

    [?", name, ?"]
  end
end

And now:

Repo.disable_soft_deletion(["orders"], fn ->
  Repo.delete!(order)
end)

Wrapping up

This article showed how to achieve soft deletions in Ecto and PostgreSQL by using rules and views. Not only that, we learned how Ecto allows us to apply the same schema to different tables/views, and how it deals with stale records on delete (which also applies to update). Those ideas can be useful outside of soft deletion.

The soft-deletion technique discussed here is useful when you want to keep deleted in the same table (and therefore with the same constraints and foreign keys) as regular data. But your application may have different needs. For example, if you want to implement some sort of trash bin, that stores all types of deleted resources in your application or you simply want to keep deleted data for auditing purposes, a simpler solution is to use triggers to copy a JSON representation of the deleted data to another table. We hope this article provides some options for you to explore whenever this question arises.

Happy coding!