Deciding between horizontal and vertical scaling is an important infrastructure consideration when building out applications because it determines how your application will increase its computing resources to handle growth.
In simple terms, horizontal and vertical scaling are two strategies for adding computing resources to run your app as demand increases.
The term “horizontal scaling” means that you add more machines as needed; you had one server running your app, now you have several running in parallel. The term “vertical scaling” describes adding power to your existing machine; you have one server, and you add more RAM and CPU resources.
Making poor decisions around server scaling can add tens of thousands of dollars to your monthly server costs or, worse, cause your app to crash or slow down under heavy load.
As a manager, understanding the difference between horizontal and vertical scaling strategies will help you collaborate with your engineering team members. Making the right choice can be critical when making decisions around:
- Microservices vs Monolithic system design
- Cloud platforms and cost
- Database structure and developer preferences
Horizontal scaling and hybrid approaches are more popular, but there are still situations where vertical scaling makes more sense, particularly for internal applications or small and low-cost projects.
Horizontal vs Vertical Scaling: a Quick Primer
A common misconception among non-technical contributors is that scalability decisions impact what framework or language you choose for a project.
In fact, what allows an application to scale lives mostly in the back-end architecture: the servers used to run the app and the database layer behind it.
You can visualize it like this: once you build a piece of software, you need a piece of hardware to run it. Any user action, such as logging in or updating an account, will require a server that receives the user request and returns the data needed to complete the request.
As you add users, the requests increase, and at a certain point, the server will struggle to process them quickly enough.
At that point, the server needs to be scaled, either horizontally by adding more servers or vertically by adding power to the existing server.
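As a rough sketch of that decision, the snippet below compares the two options for a given peak load. The function names, the machine sizes, and the requests-per-second capacities are hypothetical figures chosen for illustration, not benchmarks.

```python
import math
from typing import Dict, Optional


def servers_needed(peak_rps: float, rps_per_server: float) -> int:
    """Horizontal scaling: how many identical servers cover the peak load."""
    return max(1, math.ceil(peak_rps / rps_per_server))


def smallest_machine(peak_rps: float, machine_sizes: Dict[str, float]) -> Optional[str]:
    """Vertical scaling: the smallest single machine that covers the peak load.

    Returns None when even the largest machine is too small, which is the
    hardware ceiling of a vertical-only approach.
    """
    for name, capacity in sorted(machine_sizes.items(), key=lambda kv: kv[1]):
        if capacity >= peak_rps:
            return name
    return None


# Hypothetical capacities in requests per second (assumed, not measured).
SIZES = {"small": 500, "medium": 2_000, "large": 8_000}

print(servers_needed(3_000, 500))       # 6 small servers
print(smallest_machine(3_000, SIZES))   # large
print(smallest_machine(20_000, SIZES))  # None: no single machine is big enough
print(servers_needed(20_000, 500))      # 40 small servers
```

In this toy model the vertical option simply runs out at some point, while the horizontal option keeps working as long as you can keep adding servers.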
The framework used on the front end of the app (ReactJS, for example) usually does not matter. Scalability is almost entirely a back-end capacity question, and the front-end framework is largely decoupled from the infrastructure that drives the app’s ability to scale.
High-quality code in your application is critical to running efficiently, but even a perfectly tuned application requires increased server power when it scales from a few hundred concurrent users up to millions.
The choice between horizontal and vertical scaling frequently comes up when choosing a database. Some databases like MongoDB are better designed to run on distributed architecture (horizontal scaling), while others like MySQL work better on a vertical scaling model.
Common misconceptions about application scalability
To recap, here are the most common misconceptions non-technical contributors have when encountering scalability in team discussions:
- Misconception: choosing the right app platform matters for supporting a million users. In most cases, Ruby on Rails vs Flask is not a scalability question; it’s a developer comfort question. Any app will struggle to perform if it doesn’t have enough server power to support all its users. You may get more “bang for your buck” out of a Flask app in terms of optimizing around server costs, but both frameworks are perfectly capable of scaling to a million users and beyond — when paired with the right infrastructure on the back end.
- Misconception: choosing a bad cloud platform will limit your scalability. Cloud providers like AWS and Microsoft Azure are just a way to purchase server power virtually, rather than having physical servers in your closet. They are equally capable of handling high-scale applications in the vast majority of situations.
- Misconception: a million user accounts requires a million dollars in AWS fees. A million user accounts can be handled by a small server if they are not using the app at the same time, or if the app has very simple, light request loads. An app with many concurrent users requires more server power than an app whose million users only open it once in a while.
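To make the last point concrete, here is a back-of-the-envelope sketch of why concurrency, not the number of accounts, drives server load. The function name and every input figure below are assumptions made up for illustration.

```python
def peak_requests_per_second(accounts: int,
                             share_online_at_peak: float,
                             requests_per_user_per_minute: float) -> float:
    """Estimate peak load from the users who are online at the same time."""
    concurrent_users = accounts * share_online_at_peak
    return concurrent_users * requests_per_user_per_minute / 60


# Two hypothetical apps, each with one million accounts.
# App A: opened once in a while, 0.1% of users online at peak, light usage.
occasional = peak_requests_per_second(1_000_000, 0.001, 2)
# App B: heavily used, 20% of users online at peak, busy sessions.
heavy = peak_requests_per_second(1_000_000, 0.20, 30)

print(round(occasional))  # about 33 requests per second
print(round(heavy))       # about 100,000 requests per second
```

Both apps have the same number of accounts, yet under these assumed figures the second one has to handle roughly three thousand times the load.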
Overview of Horizontal vs Vertical scaling
|Architecture Consideration||Horizontal Scaling||Vertical Scaling|
|Traffic management||Load balancer||Manual upgrades (downtime)|
|Resilience to outages||High: multiple servers, with automatic replacement if one goes down||Low: single point of failure and required downtime to make changes|
|Data consistency||Low: can have issues with data consistency between servers (sharding considerations)||High: all data in one place|
|Userbase limits||Practically speaking, only limited by financial ability to buy more servers||Hardware limits of a single machine|
|Database programs||MongoDB, Cassandra||Amazon RDS, MySQL|
Horizontal vs Vertical Scaling: which is more popular?
Horizontal scaling is almost always more desirable than vertical scaling because it has more elasticity.
Building software that can scale horizontally is often a bit more complex. So, sometimes, for small applications, it’s cheaper to build monolithic designs (which are less likely to scale horizontally) and scale vertically as needed. After a certain threshold, vertical scaling costs skyrocket.
Also, you will often find that even with horizontal scaling, each server is hitting some limits (memory, CPU, etc.). That’s the time to upgrade its processing power through vertical scaling.
Horizontal vs Vertical Scaling: Pros and Cons
Before we get into the pros and cons of scaling horizontally vs vertically, note that horizontal scaling is by far the most common choice for fast-growth and enterprise applications.
Horizontal server architecture is more complex overall, because it introduces load balancers to distribute requests and “sharding” of data between multiple servers. The payoff is that it can automatically scale if the user base exceeds expectations. It’s also more resilient to random hardware failure, meaning that if a server dies, another one can automatically be provisioned to pick up the slack.
There are exceptions, but in most cases, the ability to rapidly scale up computing resources based on demand without downtime makes horizontal scale most sensible, particularly for consumer-facing applications.
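The load balancer and failover behavior described above can be sketched in a few lines. This is a minimal round-robin illustration, not how any particular cloud load balancer is implemented; the class name, method names, and server names are hypothetical.

```python
class RoundRobinBalancer:
    """Hands requests to servers in turn, skipping any marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # position of the next server to try

    def mark_down(self, server):
        """Record that a server has failed so it stops receiving traffic."""
        self.healthy.discard(server)

    def add_server(self, server):
        """Scale out: add a freshly provisioned server to the pool."""
        self.servers.append(server)
        self.healthy.add(server)

    def route(self):
        """Return the next healthy server, trying each server at most once."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([lb.route() for _ in range(4)])  # ['app-1', 'app-2', 'app-3', 'app-1']

lb.mark_down("app-2")    # a server dies
lb.add_server("app-4")   # a replacement is provisioned
print([lb.route() for _ in range(3)])  # ['app-1', 'app-3', 'app-4']
```

Users never see the failed server, because the balancer simply stops sending requests to it while the rest of the pool keeps serving traffic.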
Pros and cons of horizontal scaling
The main pros of horizontal scaling are:
- Automated server increase to match usage
- Little or no downtime, since servers can be upgraded or replaced one at a time
- Resilient to random hardware failures
The main cons of horizontal scaling are:
- Data consistency can be challenging across multiple machines (joins require cross-server communication)
- Cost may be higher, and more code may be required
- Servers may still encounter hardware limit issues if machines are too small
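The data consistency point is easier to see with a small sketch of sharding. The example below assumes a simple hash-based scheme, which is only one of several ways to split data across servers; the function names and user IDs are hypothetical.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a record key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def shards_touched(keys, shard_count: int) -> set:
    """Return the set of shards a query must contact to read all the keys."""
    return {shard_for(key, shard_count) for key in keys}


SHARDS = 4  # assumed number of database servers

# A single user's data always lives on one predictable shard.
print(shard_for("user-1001", SHARDS))

# A query that combines several users may have to visit several servers,
# which is where cross-server joins and consistency issues come from.
print(shards_touched(["user-1001", "user-1002", "user-1003"], SHARDS))
```

On a single vertically scaled machine the same query reads everything from one place, which is why the consistency column looks simpler for vertical scaling.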
Pros and cons of vertical scaling
The main pros of vertical scaling are:
- Simplicity, since everything is running on one machine (assuming you’re not vertically scaling machines in a horizontal architecture)
- Lower server costs
- Fewer data consistency issues, since data is all on one machine
- Relatively little code change needed to scale to larger machine
The main cons of vertical scaling are:
- Manual work usually needed to upgrade to larger machine
- Downtime from server changes
- Vulnerability to downtime if hardware fails
Summary: Pros and Cons of vertical vs horizontal scaling
|Consideration||Vertical Scaling||Horizontal Scaling|
|Complexity||Relatively simple since all data is on one machine||Requires load balancer and code for managing data consistency|
|Cost||Tends to be lower||Tends to be higher|
|Rapid growth handling||Manual, inflexible||Automatic, flexible|
|Data consistency||Not an issue||Can be an issue|
|Downtime||Downtime for server changes||Highly resilient, low downtime|
|Vulnerability||Vulnerable to random hardware failure||Less vulnerable to hardware failure|
Horizontal scaling hybrids
In many cases, the question of whether to scale horizontally or vertically isn’t a black and white choice.
“Hybrid” scaling is increasingly popular, particularly since cloud platforms make provisioning machines simple and cost-effective.
In simple terms, hybrid scaling means using larger servers in a horizontal architecture; this makes sense for fast-growth consumer startups and SaaS companies, who can take advantage of the benefits of large machines without sacrificing the flexibility of a horizontal approach.
Horizontal vs Vertical scaling for Microservice architecture
Microservice application architecture is increasingly popular in 2021. There are two primary reasons for its popularity:
- Microservices allow developers to build the most resource-intensive or high-traffic parts of an application independently, which makes them straightforward to scale relative to scaling the entire application.
- Microservices allow the functions of an app to be abstracted, such that they can be built independently using different languages or even outsourced to third parties.
Microservice architectures are generally seen with large, high-demand horizontally-scaled systems like Netflix, Uber, and Amazon. However, they actually enable much more choice about how to scale each part of an application.
To give an example, imagine you have an app with chat functionality, and you find that usage of the chat feature is exploding — far outpacing your assumptions. At the same time, other parts of your app, such as the in-app store, are not receiving heavy usage.
A microservices architecture built to scale horizontally would allow you to rapidly increase servers powering the chat component of the app. Meanwhile, the store component could be left as-is, or even down-scaled if desired to reduce server costs.
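The chat and store example can be sketched as a per-service scaling plan. The proportional rule, the 60% target utilization, and the replica and CPU figures below are all assumptions for illustration, not settings from any specific platform.

```python
import math


def desired_replicas(current: int, cpu_utilization: float,
                     target_utilization: float = 0.6, minimum: int = 1) -> int:
    """Scale the server count in proportion to how busy the servers are."""
    wanted = math.ceil(current * cpu_utilization / target_utilization)
    return max(minimum, wanted)


# Hypothetical snapshot: chat is overloaded, the store is nearly idle.
services = {
    "chat": {"replicas": 4, "cpu": 0.95},
    "store": {"replicas": 4, "cpu": 0.10},
}

plan = {name: desired_replicas(s["replicas"], s["cpu"])
        for name, s in services.items()}
print(plan)  # {'chat': 7, 'store': 1}
```

Because each service is scaled on its own numbers, the chat component grows while the store shrinks, which is not possible when the whole application runs as a single unit.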
In the context of Facebook Ad buying, scaling horizontally means expanding a campaign outward through lookalike audiences or new geographies. Scaling vertically refers to increasing spend on an individual campaign budget. These terms have no relation to horizontal vs vertical scaling in software architecture and system design.
Distributed systems may incorporate horizontal scaling for some components, but they are not the same thing. Nodes in a distributed system communicate with each other directly over a network, while horizontal scaling refers to multiple cloned machine instances behind a load balancer.