Soft deletes with Ecto and PostgreSQL
- José Valim
- August 13th, 2024
- ecto, postgres, soft deletes
One of our clients at Dashbit has recently asked us the different ways to implement soft-deletes in their Phoenix + Ecto application.
The idea of a soft-delete is that, when you choose to “delete” a given resource, let’s call it “orders”, instead of effectively deleting it from the database, you will mark the order as deleted, and then you simply do not show such orders to the user.
There are several ways to implement such a feature. If you need it only for one resource or another, sometimes the best option is to handle it at the application level. The downsides of doing it at the application level is that you are responsible to make sure they are never deleted, which can be error prone, especially once you consider foreign keys cascade and deletion rules. Another common problem is that, on every query you perform, you must remember to filter it only to non-“soft-deleted” resources.
Alternatively, if soft deletion plays an important role in your application and applies to several resources, then enforcing it at the database level can be a robust option, as you guarantee no one can accidentally remove this data unintentionally.
In this post, we will discuss one approach to implement soft-deletion in the database, with rules and views. This article is an adaption of “Soft deletion with PostgreSQL: but with logic on the database!”, but focused on Ecto.
Custom deletions with PostgreSQL rules
In order to add soft-deletion to a resource, the first step is to add a deleted_at
column to its table. In this case, we are choosing a timestamp column, but a boolean
deleted
column would suffice too:
add :deleted_at, :utc_datetime
Still in the migration, let’s add PostgreSQL Rule, that converts deletes into updates:
execute """
CREATE OR REPLACE RULE soft_deletion AS ON DELETE TO orders
DO INSTEAD UPDATE orders SET deleted_at = NOW() WHERE id = OLD.id AND deleted_at IS NULL RETURNING OLD.*;
""",
"""
DROP RULE IF EXISTS soft_deletion ON orders;
"""
When migrating, we create a soft_deletion
rule on the orders table that effectively replaces
the deletion by an update statement that sets the deleted_at
column. Notice the statement
also uses RETURNING OLD.*
, this is necessary if you have any :read_after_writes
field in
your schema.
Custom selects with PostgreSQL views
By using a rule, we guarantee that no one can accidentally delete this resource, even if they are connected to the database terminal (we will see some explicit ways to bypass it though).
However, this still leaves one problem: whenever you ask Ecto
to read all orders, such as
Repo.all(Order)
, it will include deleted orders by default. Therefore you, the application
developer, must remember to filter them.
Luckily, we can solve this problem by adding two lines to our migration:
execute "CREATE OR REPLACE VIEW visible_orders AS SELECT * FROM orders WHERE deleted_at IS NULL",
"DROP VIEW IF EXISTS visible_orders"
The example above creates a new view, called “visible_orders”, which only contains the visible
orders. Now you only need to change your Ecto
schema to use this new view:
defmodule MyApp.Order do
use Ecto.Schema
schema "visible_orders" do
...
field :deleted_at, :utc_datetime
timestamps()
end
end
This brings two benefits. First, Repo.all(MyApp.Order)
now only returns visible
orders. Furthermore, because Ecto treats tables (sources) and schemas as distinct
entities, if you want to query all orders, including deleted ones, you can simply
write: Repo.all({"orders", MyApp.Order})
. Or in a query:
from o in {"orders", MyApp.Order},
join: u in assoc(o, :users),
where: o.total >= ^min_value,
...
Any query you execute in this style still has all of the type casting and SQL injection protections you are familiar with, but now applied to different views/tables. Furthermore, because our view is “updatable” (it only has simple WHERE clauses), we can insert, update, and delete entries directly from the view, without a need to specify the source table for those operations.
The caveats
There are two caveats in this approach. The first one is that, if you attempt
to delete an order, Ecto
won’t be happy with it:
Repo.delete(order)
This raises Ecto.StaleEntryError
by default. This happens because, since the
deletion does not happen, PostgreSQL returns the information that zero rows
were affected. In turn, Ecto parses this result set to mean that the record
no longer exists (and therefore it is stale). This is clearly spelled out in
the error message and, from Ecto v3.12, you can opt-in and say that stale
entries are expected:
Repo.delete(order, allow_stale: true)
Similarly, if you attempt to use Repo.delete_all(from o in Order)
, Ecto will
always report that zero rows were affected. In case you do need to know the
number of rows affected, you can use Repo.update_all
and manually set its
deleted_at
column instead.
The other caveats are related to indexes, constraints, and foreign keys. If
you are setting an index for performance, you may want to consider if the
deleted_at
should be included or not. Similarly, if you are adding constraints,
such as unique indexes or check constraints, you may not want them to apply
to deleted at. Luckily, PostgreSQL (as well as Ecto) supports partial indexes,
and you can apply these constraints only when deleted_at IS NULL
.
Cascading foreign keys also require attention when deleting resources and they may require you to adapt the database rule accordingly. This is covered in the original article, so check it out.
Making deletions possible
Before we wrap this up, there is one last question to answer: what happens when you want/need to effectively delete the data?
Luckily, PostgreSQL has a mechanism to enable and disable rules, which you can also use from within a transaction:
Repo.transaction(fn ->
Repo.query!("ALTER TABLE orders DISABLE RULE soft_deletion")
Repo.delete!(order)
Repo.query!("ALTER TABLE orders ENABLE RULE soft_deletion")
end)
In fact, you could even encapsulate this by adding a function to your Ecto.Repo
module:
defmodule MyApp.Repo do
use Ecto.Repo, otp_app: :myapp
def disable_soft_deletion(tables, fun) do
transaction(fn ->
for table <- tables do
query!("ALTER TABLE #{quote_table(table)} DISABLE RULE soft_deletion")
end
try do
fun.()
after
for table <- tables do
query!("ALTER TABLE #{quote_table(table)} ENABLE RULE soft_deletion")
end
end
end)
end
defp quote_table(name) when is_binary(name) do
if String.contains?(name, "\"") do
raise "invalid table name"
end
[?", name, ?"]
end
end
And now:
Repo.disable_soft_deletion(["orders"], fn ->
Repo.delete!(order)
end)
Wrapping up
This article showed how to achieve soft deletions in Ecto and PostgreSQL
by using rules and views. Not only that, we learned how Ecto allows us to
apply the same schema to different tables/views, and how it deals
with stale records on delete
(which also applies to update
). Those
ideas can be useful outside of soft deletion.
The soft-deletion technique discussed here is useful when you want to keep deleted in the same table (and therefore with the same constraints and foreign keys) as regular data. But your application may have different needs. For example, if you want to implement some sort of trash bin, that stores all types of deleted resources in your application or you simply want to keep deleted data for auditing purposes, a simpler solution is to use triggers to copy a JSON representation of the deleted data to another table. We hope this article provides some options for you to explore whenever this question arises.
Happy coding!