Application performance isn't the most important factor in application development
NOTE: This post is only my personal view, formed over the course of my work in application development and developer operations roles across multiple companies and side projects. This may sound like random rambling to a software developer working in the industry, but it gets pretty irritating when people insist that certain decisions should be made for “performance” and provide only vague reasons for it.
When one creates an application, there are various concerns to focus on in order to safely get it into production. The main concern is definitely the development of the business logic - this is what helps the company make money or save money. This should be priority one; all other concerns are secondary to that initial goal. Some of the other concerns that we need to take into account would be:
- Application security
- Operability (ease of operating the application in its hosting environment)
- Application performance
All of the factors listed above are important, but I feel that “performance” is not as important as others such as security and operability. I’ll expand on this further down in the post.
Security is definitely one of the more important factors, after development of business logic. (Some would even argue that one should focus on security more than business logic concerns.) Security is especially important nowadays, since applications are usually exposed to the world wide web, and any hack can potentially result in a disastrous loss of data and of trust in the applications being run by the company. That would inevitably affect the bottom line - thereby making this a priority-one concern to tackle. Many common security issues can be avoided by following the usual best practices for deployment and application development. One example would be SQL injection. This issue is potentially dangerous because of the information that can be leaked from the application when the right query is put into a text box on the frontend. An even worse scenario would be the user inserting a SQL command that drops tables. I don’t think there is much to argue about regarding the importance of security compared to performance concerns; I would be hard-pressed to find anyone who would happily okay changes that make an application perform better but introduce additional security risk to users or make the application less useful for them.
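To make the SQL injection point concrete, here is a minimal sketch in Python using the standard library’s `sqlite3` module. The `users` table, the `find_user_*` functions, and the malicious input are all hypothetical, invented for illustration; the contrast between string-spliced and parameterized queries is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

def find_user_unsafe(name):
    # Vulnerable: user input is spliced directly into the SQL string,
    # so input like "' OR '1'='1" changes the meaning of the query.
    query = "SELECT id, name FROM users WHERE name = '%s'" % name
    return conn.execute(query).fetchall()

def find_user_safe(name):
    # Parameterized: the driver treats the input strictly as data.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()

malicious = "' OR '1'='1"
print(find_user_unsafe(malicious))  # leaks every row in the table
print(find_user_safe(malicious))    # returns no rows
```

Parameterized queries are one of those “usual best practices”: cheap to adopt, and they close off an entire class of attacks.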
Operability is the next point I intend to cover, and I generally believe it should take higher priority than performance as well. What I mean by operability is the ease (or difficulty) of running and managing the application in production settings. Some of the operational aspects one would need to care about would be how easy it is to upgrade or roll back the application as and when needed, and the procedures for scaling its various aspects - especially when it comes to data storage (e.g. reliance on various data storage mechanisms such as databases).
So why is operability important? One primary reason is that running an application takes effort, time, and resources. If you have an application that is “highly performant” but comes at the expense of requiring a large number of support hours - you had best make sure that the “highly performant” application is an extremely valuable one for the company. If such an application requires a lot of hand-holding as well as eyeballs to ensure that it operates well, then it will be very, very expensive to run. Headcount needs to be spent on it - hence the previous statement: the application has to be very “valuable” in order to justify such headcount usage. And this is on the big assumption that we can properly attribute which of the applications a company produces is its main money maker.
Money and expenditure are not the only reasons. If an application requires a large amount of support to keep it running in production, that most likely means the support engineers/technicians (SREs?) coming in to support it are required to run through a large number of manual steps to keep the application alive. I’ve never seen a case where having a human run a series of manual steps to run or debug an application goes error-free - especially if those same manual steps must be repeated across multiple regions/zones. There is an extremely high likelihood that an error will appear, and when it does, the application developer is definitely going to be dragged in to try to fix said errors/issues.
Some of the steps taken to make an application more “operable” would be to ensure that the application is simple to run and in line with the common processes of the other applications the company already has running in production. E.g. let’s say a company has 20+ web applications connected to MySQL databases running in production; adding another web application that connects to a MySQL database would be far simpler than running a web application that connects to a database the company is not used to operating (e.g. Cassandra), where new processes/automations would need to be built.
Another way is to invest the time and effort to write automation scripts using CI/CD tools and technologies such as Ansible/Terraform/Jenkins. The investment in such tools pays off almost immediately - and it will definitely save a great deal of time and effort in maintaining such applications in production.
And now we are at the main part of this blog post, which is to explain why performance should not be prioritized as highly as other factors. I’m definitely not saying that application performance is not important - rather that it is “less” important compared to factors such as the operability and security of the application. It is more important to make sure that the applications we build are “secure” and “easy to operate”.
The thing about performance is that there is no “end goal” for it. There is usually always something more one can do to make an application more “performant”, but what we need to identify and understand is that performance matters because it allows companies to be more efficient with the resources they have. It’s all about the money. The application should be optimized so that the cost of running it is as low as possible, without resulting in constant issues that require support to keep the application alive.
This kind of makes one wonder - how does one decide if the application is “performant” enough? And that is where monitoring as well as SRE principles come in. Let’s take an example: say we have an API endpoint for which it is especially important that latency is low. But how low should it go? Maybe a latency of 1s for 99% of requests is good enough? Or 1s for 99.9% of requests? Once this is defined, we can alter our application architecture to just meet this goal as simply as possible. The simpler the codebase that meets that “performance” goal, the easier it is for application developers to support the application, the easier it is for the SRE team to support it in production, and so on. Essentially, the tl;dr is: “Do just enough engineering to meet our business requirements” - and never do more.
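A “1s for 99% of requests” goal like the one above can be checked mechanically. Here is a minimal sketch, with made-up latency samples standing in for what a real monitoring system would collect; the `percentile` helper and the SLO threshold are illustrative, not any particular vendor’s API.

```python
import math
import random

def percentile(latencies, pct):
    # Nearest-rank percentile: the smallest sample at or above
    # which pct% of all samples fall.
    ordered = sorted(latencies)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]

# Hypothetical per-request latencies in seconds for one endpoint.
random.seed(0)
samples = [random.uniform(0.05, 0.9) for _ in range(1000)]

slo_target = 1.0  # the agreed goal: 1s for 99% of requests
p99 = percentile(samples, 99)
print(f"p99 = {p99:.3f}s, SLO met: {p99 <= slo_target}")
```

Once the check is this explicit, the architecture discussion changes shape: either the number is under the target and you stop optimizing, or it isn’t and you have a concrete gap to close.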
I’ve seen my fair share of stories of people making certain decisions in order to make an application more performant; e.g. putting the application near its databases, deciding never to use queue systems and instead having the application handle queues itself, or deciding not to rely on Redis and instead embedding the API’s cache into the application. There are definitely reasons for such technical decisions, but those decisions should become learning points about the “pain” of supporting such applications. Those decisions aren’t “wrong” per se; they just come with trade-offs. An example could be an application deciding never to rely on an external queue system such as Kafka/Redis/NATS and instead implementing the queue system itself. One drawback is that application developers now need to support functionality that is generically available in the market; it’s a tech burden that the team has to carry - the self-implemented queue system had better be worth it. Another is that if the queue system lives within the application, resources need to be spent on that same application, and the team needs to ensure the app’s availability is higher than it could otherwise afford. This is a consequence of having such a feature.
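To make the trade-off tangible, here is a bare-bones sketch of an in-process queue using Python’s standard `queue` and `threading` modules. The worker, the sentinel shutdown, and the doubling “job” are all invented for illustration - the point is what the team now owns that an external broker would otherwise provide.

```python
import queue
import threading

# A minimal in-process work queue. Everything a broker like
# Kafka/Redis/NATS would normally handle -- durability across restarts,
# retries, delivery to other nodes -- now lives (or doesn't) in here.
jobs = queue.Queue()
results = []

def worker():
    while True:
        job = jobs.get()
        if job is None:  # sentinel value used to shut the worker down
            break
        results.append(job * 2)  # stand-in for real processing
        jobs.task_done()

t = threading.Thread(target=worker)
t.start()
for n in range(5):
    jobs.put(n)
jobs.put(None)
t.join()
print(results)
```

Note that if this process dies mid-run, every queued item is simply gone - which is exactly the kind of consequence the team signs up for by keeping the queue inside the application.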
A common quote here is “premature optimization is the root of all evil”. It is better to first build up the application with “minimal” effort so that it runs our business logic before we begin the whole optimization process. Maybe for the initial version of the application we can try running without a cache, and if we find that the performance of the APIs we’re providing is too horrible, we can then consider the cache idea. Or we can relook at some of the SQL queries being run; maybe a query is selecting too many records in its first pass, etc. It is better to do this than to worry for months over whether a cache is needed, or to have constant arguments over whether to keep the cache within the application or offload it to a central Redis server or a dedicated Redis server for the application.
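The “add the cache only once measurements demand it” approach can be sketched in a few lines. This assumes a hypothetical expensive call (`fetch_report`, standing in for a slow SQL query or downstream API) and uses Python’s standard `functools.lru_cache` as the simplest possible bolt-on memoization - not a recommendation for any particular caching architecture.

```python
import functools

call_count = {"n": 0}

def fetch_report(region):
    # Stand-in for an expensive SQL query or downstream API call.
    call_count["n"] += 1
    return f"report for {region}"

# Version 1 of the app ships without this decorator. Only after
# monitoring shows the endpoint misses its latency goal do we add
# a simple in-process cache as the cheapest first optimization.
@functools.lru_cache(maxsize=128)
def fetch_report_cached(region):
    return fetch_report(region)

for _ in range(3):
    fetch_report_cached("eu-west")
print(call_count["n"])  # the expensive call ran only once
```

If even this turns out to be insufficient, that is the point at which the central-vs-dedicated Redis debate becomes worth having - backed by numbers instead of speculation.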
As an afterthought, I guess this kind of explains a bit of the whole tech sector’s move towards containers. Applications within containers definitely take a performance hit (we’re traversing another layer of abstraction); however, it becomes easier to understand what’s running in production if you’re able to isolate and encapsulate the runtimes of the applications being shipped. The encapsulated container can be tested in various testing environments, which makes it easier to understand what version of the application is in production and to ensure that the application’s dependencies are brought along with it. But of course, this is probably just one aspect of such a decision; there is definitely a variety of reasons that would lead a company to adopt such technologies.