The fallacies of web application performance
Web application performance has always been a hot topic, especially in regards to the role frameworks play in it. It is common to run into fallacies when those discussions arise and the goal of this article is to highlight some of those.
While I am obviously biased towards Elixir and the role it plays in the performance of web applications, I will do my best to explore fallacies that overplay and underplay the role of performance in web applications. I will also focus exclusively on the server-side of things (which, in many cases, is a fallacy in itself).
Fallacy 1: Performance is only a production concern
In my opinion, the most worrisome aspect of performance discussions is that they tend to focus exclusively on production numbers. However, performance drastically affects development and can have a large impact on developers. The most obvious examples I give in my presentations are compilation times and/or application boot times: an application that takes 2 seconds to boot compared to one that takes 10 seconds has very different effects on the developer experience.
Even response times have direct impact on developers. Imagine web application A takes 10ms on average per request. Web application B takes 50ms. If you have 100 tests that exercise your application, which is not a large number by any measure, the test suite in one application will take 1s, the other will take 5s. Add more tests and you can easily see how this difference grows. A slow feedback cycle during development hurts your team’s productivity and affects their morale. With Elixir and Phoenix, it is common to get sub-millisecond response times and the benefits are noticeable.
When discussing performance, it is also worth talking about concurrency. Everything you do in your computer should be using all cores. Booting your application, compiling code, fetching dependencies, running tests, etc. Even your wrist watch has 2 cores. Concurrency is no longer the special case.
However, you don’t even need multiple cores to start reaping the benefits of concurrency. Imagine that in the test suite above, 30% of the test time is spent on the database. <strong>While one test is waiting on the database, another test should be running</strong>. There is no reason to block your test suite while a single test waits on the database and this change in itself already improves build times.
If multiple cores are available, you should demand even more gains in terms of performance throughout your development and test experiences. The Elixir compiler and built-in tools will use multiple cores whenever possible. The next time a library, tool or framework is taking too long to do something, ask how many cores it is using and what you can do about it.
Fallacy 2: Threads are enough for multi-core concurrency
Once we start to venture into concurrency, a common fallacy is that “if a programming language has threads, it will be equally good at concurrency as any other language”. To understand why this is not true, let’s look at Amdahl’s law.
To quote Wikipedia, Amdahl’s law is a formula which gives the theoretical speedup in latency of the execution of a task at fixed workload that can be expected of a system whose resources are improved:
Amdahl's law applied to number of processors. [From Wikipedia, CC BY-SA 3.0.](https://en.wikipedia.org/wiki/Amdahl%27s_law#/media/File:AmdahlsLaw.svg)
The graph above shows that the speedup of a program is limited by its serial part. If only 50% of the software is parallelizable, the theoretical maximum speedup is 2 times, regardless of how many cores you have in your system.
If 50% of your software is parallelizable, going from 4 to 8 cores gives you only a 11% speed up. If 75% of the software is parallelizable, going from 4 to 8 cores gives you a 27% increase.
In other words, threads are not enough for most web application developers if they still have to explicitly reach out for them. Instead we need abstractions that are used as building blocks. We need good programming models, efficient data structures, and tools. If only a limited part of the software is parallelizable, you will be quickly constrained by Amdahl’s law. Threads are necessary but not sufficient. Writing maintainable and effective concurrent software takes much more.
Fallacy 3: Conclusions drawn from average response times
Another common fallacy in such discussions is when conclusions are drawn based on average data: “Company X handles Y req/second with an average of Zms, therefore you should be fine”.
Here is why conclusions on this data is not enough. First of all, most page loads will experience the 99% server response (also see Everything you know about latency is wrong for more discussion). Whenever you measure averages, also measure the 90%, 95% and 99% percentiles.
Furthermore, in our experience, clients rarely have performance issues during average load, but rather when there are spikes in traffic. It is easy to plan for your average load. The challenge is in measuring how your system behaves when there is a surge in access. When discussing and comparing response times, also ask for the high percentiles, delays and error rates in case of overloads.
Finally, the server response time as a metric is inherently limited. For instance, a fast server means nothing if the client-side is a mess and takes seconds to load. Instead of measuring a single request, consider also measuring how the user interacts with the website within certain goals. Let’s see an example.
Imagine that your application requires the user to confirm their account in order to access part of its functionality (or all of it). Now, preparing for a spike in traffic, you cached your home page as well as your sign-up form. Requests start to pour in and you can see your website is responding fairly well, with low averages and even low 95% percentiles. You consider it a success.
The next day, you are measuring how users interacted with your application and you could notice a unusually high bounce rate when the servers were on high load. Further analysis reveal that, even though the response times were excellent, the messaging system was clogged and instead of waiting 30 seconds to receive a message with instructions to confirm their account, users had to wait 10 minutes. It is safe to say many of those users left and never came back.
Response times are not enough. For queues/jobs, you want to at least measure arrival rates, departure rates and sojourn time. For this particular sign-up feature, you should measure the user engagement: from signing up, to scheduling the message, to delivering the message and the final user interaction with it.
Fallacy 4: Cost-free solutions
This is probably the most common fallacy of all.
If you complain a certain library or framework takes a long time to boot, someone may quickly point out that there is a tool that solves the booting problem by having a runtime always running on the background.
If your web application takes long to render certain views, you will be told to cache it.
The trouble is that those solutions are not cost-free and their cost are often left unsaid. When the tool that runs your application in the background fails, your developers will be the ones debugging it. Between having a solution that addresses a certain problem and not having the problem at all, I prefer the second.
Sometimes the lack of performance or proper tools will affect how your team designs and implements a feature. Time spent on caching and cache expiration is time that could be spent developing features. It is often joked that “cache invalidation and naming things are the two hard things in computer science”. We have discussed similar trade-offs in a previous article about stateful and stateless applications - where a performant solution leads to benefits from development to deployment.
This fallacy also happens when arguing in favor of technologies that are seen as performance centric. For example, if you want to use Elixir or Go, you will have to learn the underlying abstractions for concurrency, namely processes and goroutines, which is a time investment. If you want your tests to run concurrently when talking to the database in your Phoenix applications, you need to learn the pros, cons and pitfalls of doing so, a topic we covered in depth in Ecto.SQL documentation.
It is important to make those costs explicit and part of the discussions.
Fallacy 5: Performance is all that matters
For the majority of companies and teams, that’s simply not the case. Therefore, if you are planning to move to another technology exclusively because of performance, you should have numbers that back up your decision.
Similarly, we often see new languages being dismissed exclusively as “performance fallbacks”, while in many of those languages performance is typically a side-effect. For example, Elixir builds on the Erlang VM and focuses on developer productivity and code maintenance. If you are looking for reducing costs, choosing a language that focuses on productivity and maintainibility will likely be more cost efficient than picking the fastest one. And if you can get extra performance, that’s a nice bonus.
At the end of the day, the discussion about performance is quite nuanced. It is important to know what to measure and how to interpret the data collected. We have learned that performance matter way beyond your production environment and have a large impact in development and testing. And there are no cost-free solutions, be it adding and maintaining a caching layer or picking up a new programming language.
P.S.: This post was originally published on Plataformatec’s blog.