Embedding Python in Elixir, it's Fine
In recent years, Elixir has been expanding its capabilities in Machine Learning and Data through the Nx (Numerical Elixir) effort. A number of projects emerged (Nx, Explorer, Axon, Bumblebee, Scholar, and more), drawing learnings from decades of work in ecosystems such as Python and R, often standing on the shoulders of C++ and Rust codebases.
When we started, we made the explicit choice to not depend on Python libraries directly. We wanted to design and develop our ecosystem with full control over making the best decisions for Elixir, which would not necessarily match the decisions made for Python. We also wished to avoid bringing to our ecosystem the complexities of getting a Python environment up and running. While young, the Nx ecosystem already enabled running pre-trained ML models, simplifying production systems with a unified AI stack, and managing GPU cluster workflows from a notebook, to name a few.
A key component driving the adoption of Elixir in these areas is Livebook, a computational notebook platform that builds on the strengths of Elixir and Erlang, bringing reproducibility, distributed execution, and app development to the forefront. With Livebook, we have seen a growing interest from teams and companies in dipping their toes into the Elixir ecosystem for the first time.
All of this builds a good case to go all in with Elixir, but some hurdles remain. As one would expect, most companies interested in bringing Elixir and Livebook into their infrastructure have existing workflows, packages, and repositories that they already rely on. The choices we have made so far imply that they either have to find an equivalent package in Elixir or write one from scratch, increasing the risk and costs of adding Elixir to their data stack.
To address these concerns, today we announce Pythonx, which embeds the Python interpreter within the Erlang VM, bringing automatic data conversion between Elixir and Python, code evaluation, and automatic virtual environment management. We compare Pythonx with other options for interoperability and outline future work.
Enter Pythonx
Imagine we have an image and want to read the text on that image. We need to do what is known as Optical Character Recognition (OCR). Sure enough, there are a few Python packages doing just that, one of them being pytesseract. For the sake of this example, we will download the image using Req:
Mix.install([
  {:pythonx, "~> 0.4.0"},
  {:req, "~> 0.5.8"}
])

url = "https://unsplash.com/photos/95t94hZTESw/download?ixid=M3wxMjA3fDB8MXxhbGx8fHx8fHx8fHwxNzQwMDYwMjg4fA&force=true&w=640"
binary = Req.get!(url).body
Now, let’s bring in Python.
Pythonx.uv_init("""
[project]
name = "project"
version = "0.0.0"
requires-python = "==3.13.*"
dependencies = [
"pytesseract==0.3.13",
"pillow==11.1.0"
]
""")
Calling Pythonx.uv_init/1 downloads Python and the listed dependencies using the excellent uv package manager. It also immediately initializes the Python interpreter for evaluation. Note the dependencies section where we list pytesseract for OCR and pillow for image handling.
Next, let’s write some Python.
{result, _globals} =
  Pythonx.eval(
    """
    import pytesseract
    import io
    import PIL.Image

    image = PIL.Image.open(io.BytesIO(binary))
    pytesseract.image_to_string(image)
    """,
    %{"binary" => binary}
  )

Pythonx.decode(result)
#=> "The Journey\nof a thousand\nmiles begins\nwith a single\n\nstep.\n\n-Lao Tzu\n\n"
Above we call Pythonx.eval/2, which accepts Python code and a map with variables for the evaluation. Note how we pass the Elixir binary and it is automatically converted to a bytes object on the Python side. The evaluation returns result, which is a %Pythonx.Object{}, and also an updated map with variables. In this case we only care about the result and we use Pythonx.decode/1 to convert it to an Elixir string right away.
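To make the evaluation contract described above a bit more tangible, here is a rough pure-Python sketch of the same idea: take source code plus a dictionary of variables, and hand back the value of the trailing expression together with the updated variables. This is only an illustration of the semantics; Pythonx itself works through the embedded interpreter rather than anything like this, and the helper name eval_with_globals is made up for this example.

```python
import ast


def eval_with_globals(source, variables):
    """Hypothetical helper mimicking the eval contract described above.

    Returns (value of the trailing expression or None, updated variables).
    """
    scope = dict(variables)  # copy, so the caller's dict is left untouched
    tree = ast.parse(source)
    result = None

    # If the code ends with an expression, split it off so its value can be returned.
    last_expr = None
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        last_expr = ast.Expression(tree.body.pop().value)

    # Run all statements first, then evaluate the trailing expression (if any).
    exec(compile(tree, "<sketch>", "exec"), scope)
    if last_expr is not None:
        result = eval(compile(last_expr, "<sketch>", "eval"), scope)

    scope.pop("__builtins__", None)  # injected by exec, not a user variable
    return result, scope


result, new_globals = eval_with_globals(
    """
import io
size = len(io.BytesIO(binary).read())
size * 2
""",
    {"binary": b"\x89PNG"},
)
print(result)               # 8
print(new_globals["size"])  # 4
```

As in the Elixir example, the input variable (binary) is visible to the evaluated code, and anything the code defines (size, as well as the imported io module) shows up in the returned variables.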
There we go! To learn more about Pythonx, see the documentation. And if you are struggling to write Python for your task, consult your AI specialist; it went to school for that.
Under the hood
If you are raising your eyebrow, thinking that this just calls python, bear with me!
So Python, or more specifically its CPython reference implementation, has the interesting capability of being embedded into other applications. What this means is that the core functionality of the Python interpreter is available as a C library, so a C/C++ application can link that library and use its APIs to run code and interact with objects. In fact, you can think of the python executable as one such application.
Elixir provides C/C++ interoperability via Erlang NIFs and that’s exactly what Pythonx uses to embed Python, which means that the Python interpreter operates in the same OS process as Elixir itself. Because both live in the same memory space, passing data between Elixir and Python is cheap. Pythonx ties Python and Erlang garbage collection together, so that objects can be safely kept between evaluations. It also conveniently handles conversion between Elixir and Python data structures, bubbles up Python exceptions, and captures standard output.
Livebook goes multilingual
To enable even more powerful workflows, we started working on Python support in Livebook, building on Pythonx. The idea, though, is not to support Python separately, but rather to allow Elixir and Python to interact in the same notebook! To give you a better picture, below you can see the same example using Python cells in Livebook nightly.
[Image: the OCR example above, written with Python cells in Livebook nightly]
Livebook automatically installs Python and its dependencies, just as it does for Elixir, ensuring a reproducible environment. It also tracks which Elixir variables are used by Python, and vice versa, and automatically converts them between cells. While there is still work ahead of us, including code completion, documentation, and a few surprises, we are open to feedback. You can download Livebook nightly to give it a try. Once we add all the bells and whistles, we will make an official announcement on news.livebook.dev.
At this point I want to thank Cocoa Xu for starting off the work on Embedded Python, and Christopher Grainger for the initial push to run Python in Livebook.
Usage considerations and alternatives
The primary goal of Pythonx is to better integrate Python workflows within Livebook and scripts. Using Pythonx in actual projects must be done with care due to Python’s global interpreter lock (GIL). The GIL prevents multiple threads from executing Python code at the same time, so calling Pythonx from multiple Elixir processes does not provide the concurrency you might expect and can therefore become a bottleneck. However, this limitation concerns regular Python code. Packages with CPU-intensive functionality, such as numpy, have native implementations of many functions, and invoking those releases the GIL. The GIL is also released when waiting on I/O operations. In other words, if you are using this library to integrate with Python, make sure it happens in a single Elixir process or that its underlying libraries can deal with concurrent invocation.
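The point about the GIL being released while waiting can be observed from plain Python threads. The sketch below uses time.sleep as a stand-in for a blocking I/O call (an assumption made to keep the example self-contained); the helper names are made up for illustration.

```python
import threading
import time


def blocking_wait(seconds):
    # time.sleep releases the GIL, much like a blocking I/O call would.
    time.sleep(seconds)


def run_in_threads(worker, arg, count):
    """Run worker(arg) in `count` threads and return the elapsed wall-clock seconds."""
    threads = [threading.Thread(target=worker, args=(arg,)) for _ in range(count)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


elapsed = run_in_threads(blocking_wait, 0.2, 4)
# The four 0.2s waits overlap, so this takes far less than the 0.8s a serial run needs.
print(f"4 threads waiting 0.2s each took {elapsed:.2f}s")
```

Swapping blocking_wait for a pure-Python loop would typically show no such overlap on a standard CPython build, which is the bottleneck described above.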
If the above is a dealbreaker, remember that interoperability already exists at a few levels. For example, you could write a Python script and then invoke it with System.cmd/3 or open a Port. In those cases, you could start several Python processes, or even a pool of them, and manage them yourself.
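As a rough idea of what the Python half of such a setup could look like, here is a minimal worker that answers newline-delimited JSON requests. Everything about the protocol is an assumption made for this sketch: the JSON framing, the "op" and "text" fields, and the function names are hypothetical, not something Elixir or Pythonx prescribes.

```python
import io
import json


def handle_request(request):
    """Dispatch one decoded request; the "op" names are made up for this sketch."""
    if not isinstance(request, dict):
        return {"error": "expected a JSON object"}
    if request.get("op") == "upcase":
        return {"ok": request["text"].upper()}
    return {"error": "unknown op"}


def serve(input_stream, output_stream):
    """Answer newline-delimited JSON requests until the input is closed."""
    for line in input_stream:
        line = line.strip()
        if not line:
            continue
        try:
            response = handle_request(json.loads(line))
        except (ValueError, KeyError) as error:
            # Malformed JSON (ValueError) or a missing field (KeyError).
            response = {"error": str(error)}
        output_stream.write(json.dumps(response) + "\n")
        # Flush after every reply so the calling side is not left waiting on a buffer.
        output_stream.flush()


# In-memory demonstration; as a standalone script you would instead
# import sys and call serve(sys.stdin, sys.stdout).
demo_out = io.StringIO()
serve(io.StringIO('{"op": "upcase", "text": "hi"}\n{"op": "nope"}\n'), demo_out)
print(demo_out.getvalue(), end="")
```

Each worker handles one request at a time, so running several of them side by side is what gives you parallelism, at the cost of serializing data across the process boundary.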
Furthermore, depending on your needs, you may also be able to interoperate through higher-level abstractions. For example, for AI workflows, you can run pre-trained models directly, some via Bumblebee, others via Ortex. When using an LLM, you often end up talking to a third-party provider, or perhaps you run a drop-in llama.cpp Docker container on-premise, optimised for inference. In such cases the interface is HTTP and Elixir has high-level tools for interacting with LLMs too, namely Instructor and LangChain.
That said, if you do decide that Pythonx fits into your application, you can configure it to download all Python dependencies at compile time and include them as part of the Elixir release. For more details, refer to this section in the doc.
You could also use Pythonx to give you immediate access to more tools to unblock you. Once your idea pays off, you can invest more time to arrive at an Elixir-centric solution, if you so desire.
It’s Fine
Speaking of interoperability, I mentioned that Pythonx uses NIFs. NIFs are Elixir functions with the implementation living in C. We reach for NIFs either when we want to write native code with mutability for something performance-critical or when we integrate with third-party libraries via a C API (often both).
To give an example, below is a NIF implementation that adds two numbers.
#include <erl_nif.h>

ERL_NIF_TERM add(ErlNifEnv* env, int argc, const ERL_NIF_TERM argv[]) {
  int x, y;

  if (argc != 2 || !enif_get_int(env, argv[0], &x) || !enif_get_int(env, argv[1], &y)) {
    return enif_make_badarg(env);
  }

  int result = x + y;
  return enif_make_int(env, result);
}

ErlNifFunc nif_funcs[] = {
  {"add", 2, add}
};

ERL_NIF_INIT(Elixir.MyLib.NIF, nif_funcs, NULL, NULL, NULL, NULL)
Looking at the signature, you can see that the function receives a C-array of Erlang terms and returns an Erlang term. We are responsible for converting between terms and C data structures using the enif_* APIs. The example may look pretty straightforward, though it is a fair amount of boilerplate code to end up adding two numbers. From there the ceremony escalates quickly once we need to deal with nested data structures and return errors more specific than :badarg. A natural progression is to extract some of the logic to helper functions, but this doesn’t fully alleviate the boilerplate and it results in reinventing the wheel a lot.
Additionally, Pythonx (and other NIF-extensive projects) actually use C++, while the enif_* APIs are (rightfully so) C. Since C++ brings more powerful constructs, theoretically there is a possibility of a more expressive API; however, it is also easy to get into the weeds with C++ metaprogramming. The main question I asked myself is how far we can go in inferring the conversion from types. With Rustler and Zigler, NIFs are written as regular functions and the data structure conversion is handled automatically based on the signature types.
This brings us to Fine, a C++ library enabling more ergonomic NIFs, tailored to Elixir. Let’s see an updated example:
#include <fine.hpp>

int64_t add(ErlNifEnv *env, int64_t x, int64_t y) {
  return x + y;
}

FINE_NIF(add, 0);
FINE_INIT("Elixir.MyLib.NIF");
Besides extensible encoding/decoding, Fine provides smart pointers to safely manage resource objects and support for raising exceptions anywhere in the NIF. I’ve refactored EXLA NIFs to use Fine and it removed over 1k LOC, so it may be worth considering next time you have to write some NIFs.
Summing up
When we started Numerical Elixir, our goal was for Elixir to develop and have its own identity within the data and machine learning ecosystem. Now we are ready to make interoperability a key focus of our efforts too.
Pythonx embeds Python into Elixir, bringing a new class of interoperability with a third-party language not seen before within the Erlang VM. It is more than just integrating the Python interpreter; it is about transparently translating idioms from one language to the other.
The Fine project also consolidates and streamlines our collective experiences in integrating C++ and Elixir, tracing back to Sean Moriarity’s work on Nx four years ago.
There is more to come.
Stay interoperable!