Working with Ecto associations and embeds

This blog post aims to document how to work with associations in Ecto, covering how to read, insert, update and delete associations and embeds. At the end, we give a more complex example that uses Ecto associations to build nested forms in Phoenix.

This article expects basic knowledge of Ecto, particularly how repositories, schemas and the query syntax work. You can learn more about those in the Ecto docs.

Associations

Associations in Ecto are used when two different sources (tables) are linked via foreign keys.

A classic example of this setup is “Post has many comments”. First create the two tables in migrations:

create table(:posts) do
  add :title, :string
  add :body, :text
  timestamps()
end

create table(:comments) do
  add :post_id, references(:posts)
  add :body, :text
  timestamps()
end

Each comment contains a post_id column that by default points to a post id.

And now define the schemas:

defmodule MyApp.Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :title
    field :body
    has_many :comments, MyApp.Blog.Comment
    timestamps()
  end
end

defmodule MyApp.Blog.Comment do
  use Ecto.Schema

  schema "comments" do
    field :body
    belongs_to :post, MyApp.Blog.Post
    timestamps()
  end
end

All the schema definitions like field, has_many and others are defined in Ecto.Schema.

Similar to has_many/3, a schema can also invoke has_one/3 when the parent has at most one child entry. For example, you could think of a metadata association where “Post has one metadata” and the “Metadata belongs to post”.

The difference between has_one/3 and belongs_to/3 is that the foreign key is always defined in the schema that invokes belongs_to/3. You can think of the schema that calls has_* as the parent schema and the one that invokes belongs_to as the child one.
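
As a minimal sketch of that rule, the script below defines the "Post has one metadata" pair and asks each schema for its fields. The Metadata module, its "metadata" table and its :keywords field are hypothetical names chosen only for illustration, and the script assumes Elixir with Mix.install available plus network access to fetch Ecto. No database is involved.

```elixir
# Standalone script: fetches Ecto on first run; no database is needed.
Mix.install([{:ecto, "~> 3.10"}])

defmodule MyApp.Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
    # Parent side: declares the association but adds no field to "posts"
    has_one :metadata, MyApp.Blog.Metadata
  end
end

# Hypothetical child schema, named only for this illustration
defmodule MyApp.Blog.Metadata do
  use Ecto.Schema

  schema "metadata" do
    field :keywords, :string
    # Child side: belongs_to is what defines the :post_id foreign key field
    belongs_to :post, MyApp.Blog.Post
  end
end

IO.inspect(MyApp.Blog.Post.__schema__(:fields), label: "post fields")
IO.inspect(MyApp.Blog.Metadata.__schema__(:fields), label: "metadata fields")
```

Only the metadata fields should include :post_id, while the post schema lists :metadata among its associations rather than its fields.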

Querying associations

One of the benefits of defining associations is that they can be used in queries. For example:

Repo.all from p in Post,
           preload: [:comments]

Now all posts will be fetched from the database with their associated comments. The example above will perform two queries: one for loading all posts and another for loading all comments. This is often the most efficient way of loading associations from the database (even if two queries are performed) because we need to receive and parse only POSTS + COMMENTS results.

It is also possible to preload associations using joins while performing more complex queries. For example, imagine both posts and comments have votes and you want only comments with more votes than the post itself:

Repo.all from p in Post,
           join: c in assoc(p, :comments),
           where: c.votes > p.votes,
           preload: [comments: c]

The example above will now perform a single query, finding all posts and the respective comments that match the criteria. Because this query performs a JOIN, the number of results returned by the database is POSTS * COMMENTS, which Ecto then processes, placing each comment into the appropriate post.

Finally, Ecto also allows data to be preloaded into structs after they have been loaded via the Repo.preload/3 function:

Repo.preload posts, :comments

This is especially handy because Ecto does not support lazy loading. If you invoke post.comments and comments have not been preloaded, it will return an %Ecto.Association.NotLoaded{} struct. Lazy loading is often a source of confusion and performance issues, so Ecto pushes developers to load associations explicitly. Therefore Repo.preload/3 allows associations to be explicitly loaded anywhere, at any time.
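
The placeholder can be observed without a database. The sketch below reuses the Post and Comment schemas from this post in a standalone script (assuming Mix.install can fetch Ecto) and checks the association with Ecto.assoc_loaded?/1. The Demo module is a hypothetical helper, not part of any application.

```elixir
# Standalone script: fetches Ecto on first run; no database is needed.
Mix.install([{:ecto, "~> 3.10"}])

defmodule MyApp.Blog.Comment do
  use Ecto.Schema

  schema "comments" do
    field :body, :string
    belongs_to :post, MyApp.Blog.Post
  end
end

defmodule MyApp.Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
    has_many :comments, MyApp.Blog.Comment
  end
end

# Hypothetical helper module used only to build example structs
defmodule Demo do
  alias MyApp.Blog.Post

  # A struct that never went through a preload carries a placeholder
  def fresh, do: %Post{title: "Hello"}

  # Stand-in for a preloaded post that simply has no comments
  def preloaded, do: %Post{title: "Hello", comments: []}
end

IO.inspect(Demo.fresh().comments, label: "not preloaded")
IO.inspect(Ecto.assoc_loaded?(Demo.fresh().comments), label: "loaded?")
IO.inspect(Ecto.assoc_loaded?(Demo.preloaded().comments), label: "loaded?")
```

Checking with Ecto.assoc_loaded?/1 is one way for code that receives a post to decide whether it still needs to call Repo.preload/3.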

Manipulating associations

Ecto allows you to insert a post with multiple comments in one operation:

Repo.insert!(%Post{
  title: "Hello",
  body: "world",
  comments: [
    %Comment{body: "Excellent!"}
  ]
})

Many times you may want to break it into distinct steps so you have more flexibility in managing those entries. For example, you could use changesets to build your posts and comments along the way:

post = Ecto.Changeset.change(%Post{}, title: "Hello", body: "world")
comment = Ecto.Changeset.change(%Comment{}, body: "Excellent!")
post_with_comments = Ecto.Changeset.put_assoc(post, :comments, [comment])
Repo.insert!(post_with_comments)

Or by handling each entry individually inside a transaction:

Repo.transaction fn ->
  post = Repo.insert!(%Post{title: "Hello", body: "world"})

  # Build a comment from the post struct
  comment = Ecto.build_assoc(post, :comments, body: "Excellent!")

  Repo.insert!(comment)
end

Ecto.build_assoc/3 builds the comment using the id currently set in the post struct. It is equivalent to:

%Comment{post_id: post.id, body: "Excellent!"}
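
Since Ecto.build_assoc/3 only builds a struct, the equivalence can be checked without a repository. In this sketch the post id 42 is made up to stand in for an already inserted post, and Demo is a hypothetical helper module.

```elixir
# Standalone script: fetches Ecto on first run; no database is needed.
Mix.install([{:ecto, "~> 3.10"}])

defmodule MyApp.Blog.Comment do
  use Ecto.Schema

  schema "comments" do
    field :body, :string
    belongs_to :post, MyApp.Blog.Post
  end
end

defmodule MyApp.Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
    has_many :comments, MyApp.Blog.Comment
  end
end

# Hypothetical helper module for the demonstration
defmodule Demo do
  alias MyApp.Blog.Post

  def run do
    # Pretend this post was already inserted and received id 42
    post = %Post{id: 42, title: "Hello"}

    # Copies post.id into the comment's post_id and sets the given attributes
    Ecto.build_assoc(post, :comments, body: "Excellent!")
  end
end

comment = Demo.run()
IO.inspect({comment.post_id, comment.body})
```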

The Ecto.build_assoc/3 function is especially useful in Phoenix controllers. For example, when creating the post, one would do:

Ecto.build_assoc(current_user, :post)

This is because we likely want to associate the post with the user currently signed in to the application. In another controller, we could build a comment for an existing post with:

Ecto.build_assoc(post, :comments)

Ecto does not provide functions like post.comments << comment that allow mixing persisted data with non-persisted data. The only mechanism for changing both post and comments at the same time is via changesets, which we will explore when talking about embeds and nested associations.
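
As a small preview, the sketch below builds the same post and comment changesets used earlier with put_assoc and inspects the result instead of inserting it. It runs as a standalone script without a database; Demo is a hypothetical helper module.

```elixir
# Standalone script: fetches Ecto on first run; no database is needed.
Mix.install([{:ecto, "~> 3.10"}])

defmodule MyApp.Blog.Comment do
  use Ecto.Schema

  schema "comments" do
    field :body, :string
    belongs_to :post, MyApp.Blog.Post
  end
end

defmodule MyApp.Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
    field :body, :string
    has_many :comments, MyApp.Blog.Comment
  end
end

# Hypothetical helper module for the demonstration
defmodule Demo do
  import Ecto.Changeset
  alias MyApp.Blog.{Comment, Post}

  def run do
    comment = change(%Comment{}, body: "Excellent!")

    %Post{}
    |> change(title: "Hello", body: "world")
    |> put_assoc(:comments, [comment])
  end
end

changeset = Demo.run()

# The parent changeset stores one child changeset per comment
IO.inspect(Enum.map(changeset.changes.comments, & &1.action), label: "comment actions")
```

The parent changeset carries a nested changeset for each comment, so a single Repo.insert!/2 call has everything it needs to write both tables.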

Deleting associations

When defining has_many/3, has_one/3 and friends, you can also pass an :on_delete option that specifies which action should be performed on associations when the parent is deleted.

has_many :comments, MyApp.Blog.Comment, on_delete: :delete_all

Besides the value above, :nilify_all is also supported, with :nothing being the default. Check has_many/3 docs for more information.

Embeds

Besides associations, Ecto also supports embeds in some databases. With embeds, the child is embedded inside the parent, instead of being stored in another table.

Databases like PostgreSQL use a mixture of JSONB (embeds_one/3) and ARRAY columns to provide this functionality (both JSONB and ARRAY are supported by default and are first-class citizens in Ecto).

Working with embeds is mostly the same as working with another field in a schema, except when it comes to manipulating them. Let’s see an example:

defmodule MyApp.Blog.Permalink do
  use Ecto.Schema

  embedded_schema do
    field :url
    timestamps()
  end
end

defmodule MyApp.Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :title
    field :body
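    has_many :comments, MyApp.Blog.Comment
    # on_replace: :delete allows permalinks to be removed through changesets
    embeds_many :permalinks, MyApp.Blog.Permalink, on_replace: :delete
    timestamps()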
  end
end

It is possible to insert a post with multiple permalinks directly:

Repo.insert!(%Post{
  title: "Hello",
  permalinks: [
    %Permalink{url: "example.com/thebest"},
    %Permalink{url: "another.com/mostaccessed"}
  ]
})

Similar to associations, you may also manage those entries using changesets:

# Generate a changeset for a new post
changeset = Ecto.Changeset.change(%Post{title: "Hello"})

# Let's track the new permalinks
changeset = Ecto.Changeset.put_embed(changeset, :permalinks,
  [%Permalink{url: "example.com/thebest"},
   %Permalink{url: "another.com/mostaccessed"}]
)

# Now insert the post with permalinks at once
post = Repo.insert!(changeset)

Now if you want to replace or remove a particular permalink, you can work with permalinks as a collection and then just put it as a change again:

# Remove all permalinks from example.com
permalinks = Enum.reject post.permalinks, fn permalink ->
  permalink.url =~ "example.com"
end

# Let's create a new changeset
changeset =
  post
  |> Ecto.Changeset.change()
  |> Ecto.Changeset.put_embed(:permalinks, permalinks)

# And update the entry
post = Repo.update!(changeset)

The beauty of working with changesets is that they keep track of all changes that will be sent to the database and we can introspect them at any time. For example, if we called the following before Repo.update!/2:

IO.inspect(changeset.changes.permalinks)

We would see something like:

[%Ecto.Changeset{action: :delete, changes: %{},
                 data: %Permalink{url: "example.com/thebest"}},
 %Ecto.Changeset{action: :update, changes: %{},
                 data: %Permalink{url: "another.com/mostaccessed"}}]

If, by any chance, we were also inserting a permalink in this operation, we would see another changeset there with action :insert.

Changesets contain a complete view of what is changing and how it is changing, and you can manipulate them directly.
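
The following standalone sketch reproduces the removal step without a database so the nested changesets can be inspected directly. It assumes the embed is declared with on_replace: :delete (Ecto raises by default when an embed is replaced), the permalink ids are made up, and Demo is a hypothetical helper module. The exact action names shown for removed entries may differ between Ecto versions.

```elixir
# Standalone script: fetches Ecto on first run; no database is needed.
Mix.install([{:ecto, "~> 3.10"}])

defmodule MyApp.Blog.Permalink do
  use Ecto.Schema

  embedded_schema do
    field :url, :string
  end
end

defmodule MyApp.Blog.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
    # Assumption: on_replace: :delete so that entries may be removed
    embeds_many :permalinks, MyApp.Blog.Permalink, on_replace: :delete
  end
end

# Hypothetical helper module for the demonstration
defmodule Demo do
  import Ecto.Changeset
  alias MyApp.Blog.{Permalink, Post}

  def run do
    # Stand-in for a post loaded from the database; the ids are made up
    post = %Post{
      id: 1,
      title: "Hello",
      permalinks: [
        %Permalink{id: "a1", url: "example.com/thebest"},
        %Permalink{id: "b2", url: "another.com/mostaccessed"}
      ]
    }

    # Keep only the permalinks that are not from example.com
    kept = Enum.reject(post.permalinks, &(&1.url =~ "example.com"))

    post |> change() |> put_embed(:permalinks, kept)
  end
end

# One nested changeset per permalink, each tagged with its own action
for nested <- Demo.run().changes.permalinks do
  IO.inspect({nested.action, nested.data.url})
end
```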

Nested associations and embeds

This section was written for Phoenix v1.6 and earlier and therefore it does not use Phoenix.Component and its conveniences.

The same way we have used changesets to manipulate embeds, we can also use them to change child associations at the same time we are manipulating the parent.

One of the benefits of this feature is that we can use them to build nested forms in a Phoenix application. While nested forms in other languages and frameworks can be confusing and complex, Ecto uses changesets and explicit validations to provide a straightforward and simple way to manipulate multiple structs at once.

To finish this post, let’s see an example of how to use what we have seen so far to work with nested associations in Phoenix.

First, create a new Phoenix application if you haven’t yet. The Phoenix guides can help you get started with that if it is your first time using Phoenix.

The example we will build is a classic to do list, where a list has many items. Let’s generate the TodoList resource inside the Tasks namespace:

mix phx.gen.html Tasks TodoList todo_lists title

Follow the steps printed by the command above and afterwards let’s generate a TodoItem schema:

mix phx.gen.schema Tasks TodoItem todo_items body:text todo_list_id:references:todo_lists

Open up the MyApp.Tasks.TodoList module at “lib/my_app/tasks/todo_list.ex” and add the has_many definition inside the schema block:

has_many :todo_items, MyApp.Tasks.TodoItem

Next let’s also cast “todo_items” on the TodoList changeset function:

def changeset(todo_list, params \\ %{}) do
  todo_list
  |> cast(params, [:title])
  |> cast_assoc(:todo_items, required: true)
end

Note we are using cast_assoc instead of put_assoc in this example. Both functions are defined in Ecto.Changeset. cast_assoc (or cast_embed) is used when you want to manage associations or embeds based on external parameters, such as the data received through Phoenix forms. In such cases, Ecto will compare the data existing in the struct with the data sent through the form and generate the proper operations. On the other hand, we use put_assoc (or put_embed) when we already have the associations (or embeds) as structs and changesets loaded in memory, and we simply want to tell Ecto to take those entries as is.
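
To make the cast_assoc side concrete, here is a standalone sketch that feeds form-shaped params to the TodoList changeset without a database. The schemas are trimmed-down versions of what the generators would produce, the param values are invented, and Demo is a hypothetical helper module.

```elixir
# Standalone script: fetches Ecto on first run; no database is needed.
Mix.install([{:ecto, "~> 3.10"}])

defmodule MyApp.Tasks.TodoItem do
  use Ecto.Schema
  import Ecto.Changeset

  schema "todo_items" do
    field :body, :string
    field :todo_list_id, :id
  end

  def changeset(todo_item, params \\ %{}) do
    cast(todo_item, params, [:body])
  end
end

defmodule MyApp.Tasks.TodoList do
  use Ecto.Schema
  import Ecto.Changeset

  schema "todo_lists" do
    field :title, :string
    has_many :todo_items, MyApp.Tasks.TodoItem
  end

  def changeset(todo_list, params \\ %{}) do
    todo_list
    |> cast(params, [:title])
    # Invokes MyApp.Tasks.TodoItem.changeset/2 for every submitted item
    |> cast_assoc(:todo_items, required: true)
  end
end

# Hypothetical helper module for the demonstration
defmodule Demo do
  alias MyApp.Tasks.TodoList

  # Params shaped like what a Phoenix form would submit
  def with_items do
    params = %{
      "title" => "Groceries",
      "todo_items" => [%{"body" => "Milk"}, %{"body" => "Eggs"}]
    }

    TodoList.changeset(%TodoList{}, params)
  end

  # No "todo_items" key at all, which violates required: true
  def without_items do
    TodoList.changeset(%TodoList{}, %{"title" => "Empty"})
  end
end

IO.inspect(Enum.map(Demo.with_items().changes.todo_items, & &1.action), label: "item actions")
IO.inspect(Demo.without_items().errors, label: "errors without items")
```

The first changeset holds one nested changeset per submitted item, while the second is invalid because the required association is missing.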

Because we have added todo_items as a required field, we are ready to submit them through the form. So let’s change our template to submit todo items too. Open up “lib/my_app_web/templates/todo_list/form.html.eex” and add the following between the title input and the submit button:

<%= inputs_for f, :todo_items, fn i -> %>
  <div class="form-group">
    <%= label i, :body, "Task ##{i.index + 1}", class: "control-label" %>
    <%= text_input i, :body, class: "form-control" %>
    <%= if message = i.errors[:body] do %>
      <span class="help-block"><%= message %></span>
    <% end %>
  </div>
<% end %>

The inputs_for/4 function comes from Phoenix.HTML.Form and it allows us to generate fields for an association or an embed, emitting a new form struct (represented by the variable i in the example above) for us to work with. Inside the inputs_for/4 function, we generate a text input for each item.

Now that we have changed the template, the final step is to change the new action in the controller to include two empty todo items by default in the todo list:

changeset = TodoList.changeset(%TodoList{todo_items: [%MyApp.Tasks.TodoItem{}, %MyApp.Tasks.TodoItem{}]})

Head to “http://localhost:4000/todo_lists” and you can now create a todo list with both items! However, if you try to edit the newly created todo list, you should get an error:

attempting to cast or change association :todo_items for MyApp.Tasks.TodoList that was not loaded.
Please preload your associations before casting or changing the schema.

As the error message says, we need to preload the todo items for both edit and update actions in MyAppWeb.TodoListController. Open up your controller and change the following line on both actions:

    todo_list = Repo.get!(TodoList, id)

to

    todo_list = Repo.get!(TodoList, id) |> Repo.preload(:todo_items)

Now it should also be possible to update the todo items alongside the todo list.

Both insert and update operations are ultimately powered by changesets, as we can see in our controller actions:

changeset = TodoList.changeset(todo_list, todo_list_params)

All the benefits we have discussed regarding changesets in the previous section still apply here. By inspecting the changeset before calling Repo.insert or Repo.update, it is possible to see a snapshot of all the changes that are going to happen in the database.

Not only that, the validation process behind changesets is explicit. Since we added todo_items as a required field in the todo list schema, every time we call MyApp.Tasks.TodoList.changeset/2, MyApp.Tasks.TodoItem.changeset/2 will be called for every todo item sent through the form. The changeset returned for each todo item is then stored in the main todo list changeset (it is effectively a tree of changes).

To help us build our intuition regarding changesets a bit more, let’s add some validations to todo items and also allow them to be deleted.

Deleting todo items

Ecto v3.10 and later support two options on cast_assoc, :sort_param and :drop_param, which allow you to specify a list of indexes to be dropped from the association as well as a custom sorting order. With these features, you no longer need the virtual field for deletion shown below. Instead you define a checkbox which submits the current item's index for deletion once checked.
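
A minimal sketch of the :drop_param variant, run as a standalone script without a database, could look as follows. The parameter names todo_items_sort and todo_items_drop are assumptions made for this example, as are the param values; Demo is a hypothetical helper module. On edit pages the has_many would likely also need an :on_replace option such as :delete so that dropped items can be removed.

```elixir
# Standalone script: fetches Ecto on first run; no database is needed.
Mix.install([{:ecto, "~> 3.10"}])

defmodule MyApp.Tasks.TodoItem do
  use Ecto.Schema
  import Ecto.Changeset

  schema "todo_items" do
    field :body, :string
    field :todo_list_id, :id
  end

  def changeset(todo_item, params \\ %{}) do
    cast(todo_item, params, [:body])
  end
end

defmodule MyApp.Tasks.TodoList do
  use Ecto.Schema
  import Ecto.Changeset

  schema "todo_lists" do
    field :title, :string
    has_many :todo_items, MyApp.Tasks.TodoItem
  end

  def changeset(todo_list, params \\ %{}) do
    todo_list
    |> cast(params, [:title])
    # Assumed parameter names; any names may be chosen
    |> cast_assoc(:todo_items,
      sort_param: :todo_items_sort,
      drop_param: :todo_items_drop
    )
  end
end

# Hypothetical helper module for the demonstration
defmodule Demo do
  alias MyApp.Tasks.TodoList

  def run do
    params = %{
      "title" => "Groceries",
      # Items keyed by index, as nested form inputs submit them
      "todo_items" => %{"0" => %{"body" => "Milk"}, "1" => %{"body" => "Eggs"}},
      # A checked "delete" checkbox would submit the index of its item
      "todo_items_drop" => ["0"]
    }

    TodoList.changeset(%TodoList{}, params)
  end
end

bodies = Enum.map(Demo.run().changes.todo_items, &Ecto.Changeset.get_field(&1, :body))
IO.inspect(bodies, label: "items kept")
```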

Open up MyApp.Tasks.TodoItem at “lib/my_app/tasks/todo_item.ex” and add a virtual field named :delete to the schema:

field :delete, :boolean, virtual: true

As we know the MyApp.Tasks.TodoItem.changeset/2 function is the one invoked by default when manipulating todo items through todo lists. So let’s change it to the following:

def changeset(todo_item, params \\ %{}) do
  todo_item
  |> cast(params, [:body, :delete])
  |> validate_required([:body])
  |> validate_length(:body, min: 3)
  # Switch the changeset action to :delete when the checkbox was ticked
  |> mark_for_deletion()
end

defp mark_for_deletion(changeset) do
  # If delete was set and it is true, let's change the action
  if get_change(changeset, :delete) do
    %{changeset | action: :delete}
  else
    changeset
  end
end

We have added a call to validate_length as well as a private function that checks if the :delete field changed and, if so, we mark the changeset action to be :delete.
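
The changeset function can be exercised on its own, without Phoenix or a database. The sketch below is a standalone script; the item id and body are made up to stand in for a stored row, and Demo is a hypothetical helper module.

```elixir
# Standalone script: fetches Ecto on first run; no database is needed.
Mix.install([{:ecto, "~> 3.10"}])

defmodule MyApp.Tasks.TodoItem do
  use Ecto.Schema
  import Ecto.Changeset

  schema "todo_items" do
    field :body, :string
    field :delete, :boolean, virtual: true
  end

  def changeset(todo_item, params \\ %{}) do
    todo_item
    |> cast(params, [:body, :delete])
    |> validate_required([:body])
    |> validate_length(:body, min: 3)
    |> mark_for_deletion()
  end

  defp mark_for_deletion(changeset) do
    # get_change returns nil or false unless the checkbox was ticked
    if get_change(changeset, :delete) do
      %{changeset | action: :delete}
    else
      changeset
    end
  end
end

# Hypothetical helper module for the demonstration
defmodule Demo do
  alias MyApp.Tasks.TodoItem

  # Stand-in for an item already stored in the database; the id is made up
  def existing, do: %TodoItem{id: 1, body: "Buy milk"}

  # A ticked checkbox submits "true", an unticked one submits "false"
  def checked, do: TodoItem.changeset(existing(), %{"delete" => "true"})
  def unchecked, do: TodoItem.changeset(existing(), %{"delete" => "false"})

  # A body shorter than three characters fails validate_length
  def too_short, do: TodoItem.changeset(existing(), %{"body" => "ab"})
end

IO.inspect(Demo.checked().action, label: "checked")
IO.inspect(Demo.unchecked().action, label: "unchecked")
IO.inspect(Demo.too_short().errors, label: "too short")
```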

The functions cast, validate_length, get_change and more are all part of the Ecto.Changeset module, which the Phoenix generators import into schema modules with import Ecto.Changeset.

Let’s now change our view to include the delete field. Add the following somewhere inside the inputs_for/4 call in “lib/my_app_web/templates/todo_list/form.html.eex”:

<%= if i.data.id do %>
  <span class="pull-right">
    <%= label i, :delete, "Delete?", class: "control-label" %>
    <%= checkbox i, :delete %>
  </span>
<% end %>

And that’s all. Our todo items should now validate the body as well as allow deletion on update pages!

Notice we had control over the changeset and validations at all times. There are no special fields for deletion or implicit validation. Still, we were able to wire everything up with very few lines of code.

And while the default is to call MyApp.Tasks.TodoItem.changeset/2, it is possible to customize the function to be invoked when casting todo items from the todo list changeset via the :with option:

|> cast_assoc(:todo_items, required: true, with: &custom_changeset/2)

Therefore, if an association has different validation rules depending on whether it is sent as part of a nested association or managed directly, we can easily keep those business rules apart by providing two different changeset functions. And because we just use functions all the way down, they are easy to compose and test.

Summing up

In this blog post we have learned the foundations for working with associations and embeds, up to a more complex example using nested associations. If you want to further customize their behavior, read the docs for declaring the associations/embeds in Ecto.Schema or how to further manipulate changesets via Ecto.Changeset.

When it comes to the view, you can find more information in the Phoenix.HTML project, especially under the Phoenix.HTML.Form module, where the inputs_for/4 function is defined.

P.S.: This post was originally published on Plataformatec’s blog.