“How can we prepare our app to handle a million users?”
This is one of the most common questions prospective clients ask us during discovery calls.
Making the right technical decisions around scalability can enable millions in recurring revenue — or cost millions in churn when the app breaks down after a TechCrunch feature.
Here at Rootstrap, we’ve worked on dozens of fast-growth products. In 2019 we helped Globalization Partners raise revenue 300%, in 2020 we helped Ownable handle a 4X Black Friday surge, and as of 2021 we’ve helped Masterclass go from 200,000 to well over 4 million visitors and 1 million concurrent users.
We’ve learned a lot of lessons about scaling complex applications along the way.
In this article, I’ll share 14 critical principles for teams and founders planning for scalability, including:
- Core scalability principles business managers need to know.
- The most common misconceptions about scaling apps.
- Technical staffing considerations for fast-growth teams.
- Two specific techniques for managing rapid user or traffic increases.
I’ll also cover some of the unexpected tactics used by apps like Superhuman and Clubhouse to handle user surges from international press coverage.
Core principles: Three keys to app performance at scale
There are three factors that actually move the needle for high-scale applications going from zero to millions of users:
- Code quality: Code is written in a way that is efficient, useful, and maintainable over the long term.
- High-level architecture: application components must be arranged optimally for the business case.
- Infrastructure planning: application must have adequate server capacity and the ability to react to increased usage.
Get these three areas right, and your application will handle all the traffic your marketing team can throw at it.
Code Quality: what it is and why it matters
“Code quality” can be a confusing term when you first hear it mentioned by your engineering team. What does it actually mean?
Code quality means that regardless of the language or frameworks being used in your application, the code follows best practices to run as efficiently as possible.
High-quality code has these features:
- Reusable: good code can easily be used again, and has characteristics like modularity and loose coupling.
- Extensible: good code is easy to change. Engineers call this quality extensibility – extensible code will allow you to add or remove features quickly without introducing bugs.
- Readable: good code is easily, quickly, and clearly understandable by someone new to the project. It should maintain a consistent style so that contributors can easily spot issues.
- Documented: good code is documented through comments and external wikis so that the project doesn’t experience “brain drain” if certain key contributors disappear.
- Robust: good code should be structured with testing in mind, so that it can be deployed with high confidence that it won’t break other features. Error handling has to be implemented for both common and uncommon conditions.
If you’ve worked on an inherited codebase where the structure of the repository was confusing and engineers “didn’t know what they didn’t know,” then you’ve experienced code quality issues first-hand.
Poor code quality ultimately results in a high level of technical debt, which can sink a company even faster than financial debt.
Needless to say, apps with poor code quality cannot scale to millions of users, or at the very least are unnecessarily expensive to grow and maintain.
High-level architecture: what it is and why it matters
Deciding what framework to use may have a low impact on scalability, but landing on the right high-level architecture absolutely moves the needle.
The most common high-level application architectures currently are:
- Microservices: apps broken into multiple independent parts to increase maintainability and scalability.
- Monoliths: traditional system design framework that sacrifices scalability for simplicity.
- Serverless: a newer paradigm in which application functions run on demand on infrastructure fully managed by a cloud vendor, using services like AWS Lambda.
There are hundreds of smaller decisions to make once your engineering team decides on an application architecture, but the high-level architecture will have the most impact on your ability to scale quickly.
As of 2021, the majority of high-scale applications you’re familiar with, like Instagram and Postmates, are built using a microservices architecture.
Microservices are favored because they allow dynamic scaling of individual parts of an app, as well as reducing downtime when pushing feature updates.
Microservices architecture also allows developers to use different frameworks for different functions. For example, a machine learning recommendation engine could be added to an existing e-commerce application using whatever framework or language has the best libraries and support for that particular algorithm — rather than shoehorning it into an existing language/framework.
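To make the idea of scaling individual parts of an app more concrete, here is a minimal sketch of how a team might size each service separately. The service names, traffic figures, and per-replica capacity are illustrative assumptions, not measurements from any real product, and desired_replicas is a hypothetical helper rather than part of any orchestration tool.

```python
import math


def desired_replicas(requests_per_second, capacity_per_replica, minimum=1):
    """Return how many copies of one service are needed for its current load."""
    needed = math.ceil(requests_per_second / capacity_per_replica)
    return max(minimum, needed)


# Assumed traffic per service, in requests per second (illustrative only).
load = {"checkout": 120, "search": 900, "recommendations": 40}

# Assumed throughput of a single replica (illustrative only).
CAPACITY_PER_REPLICA = 100

# Each service is sized on its own, so a busy search feature
# does not force the whole application to scale with it.
plan = {name: desired_replicas(rps, CAPACITY_PER_REPLICA) for name, rps in load.items()}
print(plan)
```

In a monolith, the same traffic mix would mean scaling the entire application to meet the demand of its busiest feature.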
Application Infrastructure: What it is and why it matters
Finally, it’s critical to consider infrastructure — meaning the servers, CDNs, and other physical infrastructure used to deliver your app’s functionality to users.
No matter how clean your code is, it can’t run properly on insufficient hardware.
There is a common misconception that your biggest infrastructure consideration is which cloud vendor you choose.
In actuality, the choice between cloud infrastructure vendors like AWS, Google Cloud Platform, and Azure does not materially impact your ability to scale. Furthermore, the costs associated with each vendor are broadly comparable and have been dropping rapidly over the past decade.
What matters most is how your servers interact with your application architecture: can the app allocate more servers to meet a traffic surge? What happens if a particular feature gets more usage than others? What happens if a server goes down?
At a high level, this comes down to deciding whether an application (or service within your app) will scale horizontally or vertically.
- Horizontal scale: applications designed to run on multiple server instances running in parallel.
- Vertical scale: applications designed to run on a single server that adds CPU, RAM, and other resources as needed to meet demand.
In almost all cases, horizontal scalability is more desirable for applications that expect rapid, sustained growth. The ceiling on how many users a horizontal system can handle is much higher, since capacity is no longer bounded by the largest single machine you can buy, although it is more complex to architect and can lead to challenges with data consistency between instances.
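As a rough illustration of horizontal scaling, the sketch below models a pool of interchangeable instances behind a simple round-robin load balancer. Instance, RoundRobinBalancer, and the instance names are hypothetical and exist only for this example; a real deployment would rely on a managed load balancer rather than hand-written routing.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """One server instance behind the load balancer (toy model)."""
    name: str
    healthy: bool = True


class RoundRobinBalancer:
    """Spreads requests across healthy instances in rotation."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._next = 0

    def add_instance(self, instance):
        # Horizontal scaling: capacity grows by adding another instance.
        self.instances.append(instance)

    def route(self):
        # Try each instance at most once, starting where we left off.
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next % len(self.instances)]
            self._next += 1
            if candidate.healthy:
                return candidate.name
        raise RuntimeError("no healthy instances available")


pool = RoundRobinBalancer([Instance("app-1"), Instance("app-2")])
first_four = [pool.route() for _ in range(4)]      # alternates between both

pool.instances[0].healthy = False                  # simulate app-1 going down
after_failure = [pool.route() for _ in range(2)]   # traffic shifts to app-2

pool.add_instance(Instance("app-3"))               # scale out under load
after_scale_out = [pool.route() for _ in range(4)]
```

The sketch touches on the questions above: a failed server is skipped rather than taking the app down, and a traffic surge is met by adding instances instead of buying a bigger machine.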
In terms of vendors, the main choice that affects scalability is whether to use cloud infrastructure vendors (AWS, GCP, Azure) or run your own servers on-premise in your office or rented data center space:
- IaaS (Infrastructure as a Service): vendors like AWS and Azure that have large data center holdings and rent out server space to companies. IaaS setups can be either “bare metal” meaning you rent dedicated servers, or virtualized such that multiple tenants can share hardware.
- On-premise: term used when a company manages their own physical servers in their office or at a data center. This is usually only done for security purposes, as cloud services are almost always much cheaper and less prone to downtime.
In almost all cases, cloud Infrastructure as a Service (IaaS) vendors are the way to go, since they can hold content in data centers closer to your users and are less prone to issues with servers going down when compared with self-managed setups.
Myth Busting: Three key misconceptions about scaling applications
The benefit of being an agency is that we get to work with lots of different companies at different levels of scale.
While most of our clients are established mid-market companies already serving thousands or millions of users, from time to time we work with new startups having their first experience with scaling tech products.
Here are the top three misconceptions we see when consulting with first-time entrepreneurs or managers planning high-scale applications:
1. “We need to prepare for a million users.”
Application load depends on how many users are active at any one time, not on the total number of registered users.
Keep in mind that a million sign-ups almost never result in a million active (or more specifically, “concurrent”) users, and most apps take longer to ramp up than their founders expect.
This is why we recommend a “test before you invest” approach: companies should have proof that they can grow to a million users before investing in the (expensive) development needed to support that userbase.
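A quick back-of-the-envelope calculation shows how far apart registered and concurrent users can be. The percentages below are purely illustrative assumptions, not industry benchmarks, and estimate_peak_concurrent is a hypothetical helper; substitute figures from your own analytics.

```python
def estimate_peak_concurrent(registered_users, daily_active_pct, peak_concurrent_pct):
    """Estimate peak concurrent users from total sign-ups.

    daily_active_pct: share of registered users active on a given day (assumed).
    peak_concurrent_pct: share of daily active users online at the busiest moment (assumed).
    """
    daily_active = registered_users * daily_active_pct // 100
    return daily_active * peak_concurrent_pct // 100


# Illustrative assumptions: 10% of sign-ups are active daily,
# and 5% of those are online at the same time at peak.
peak = estimate_peak_concurrent(1_000_000, daily_active_pct=10, peak_concurrent_pct=5)
print(peak)  # 5000
```

Under these assumptions, a million sign-ups translate into a few thousand simultaneous users, which is a very different engineering problem from a million concurrent sessions.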
2. “What is the most scalable platform, Django or Ruby on Rails?”
Unless you’re in a space like high-frequency trading where microseconds matter, framework choice is unlikely to be the bottleneck. This choice is more about developer comfort than performance, provided that high-quality code is deployed to the framework in use.
3. “We won’t need developers once the app is built.”
Even with large corporate clients, we’ve found that technical maintenance and ongoing feature improvements are often an afterthought, especially when planning a budget.
Modern teams use DevOps frameworks to ensure continuous deployment of valuable, tested features. If your app is a success, your development team will be expanding after launch to support product maintenance and growth, not shrinking.
Core Principles summary: what is the best tech choice for a high-scale application?
To summarize, virtually all modern high-scale applications are:
- Developed using DevOps pipelines to ensure high code quality that performs effectively under heavy load.
- Designed using a microservices architecture so that individual services can be scaled independently as demand grows.
- Deployed on horizontally scaled servers behind a load balancer to ensure low downtime and high ceiling on server capacity.
Scaling to millions of users: technical staffing considerations
One of the biggest challenges of scaling to millions of users is bringing on experienced engineers fast enough to meet application demand.
It’s very common for startup founders and even experienced corporate teams to try and shave costs by using an unproven offshore team in a remote time zone like India or Pakistan… only to have their product wind up crashing when they start onboarding users.
We call these “rescue projects” when they inevitably wind up on the doorstep of highly-ranked agencies like Rootstrap.
A little bit of staff planning goes a long way when it comes to supporting scalability. The most common patterns we see are:
1. In-house teams
Hiring in-house teams is a common approach for companies experiencing rapid scale.
- Benefit for scalability: Keeps technical knowledge fully housed within the company.
- Drawback for scalability: Onboarding staff can take months and comes with high overhead costs, especially when growing rapidly. Developer talent shortages in the US are also a key issue here.
2. Managed Services
In the managed services model, a company completely outsources the technical development and maintenance of their application. This is the most expensive approach, but is commonly used by large corporations for the benefit of avoiding in-house staff and the ability to switch vendors as needed.
- Benefit for scalability: Peace of mind from fully outsourcing to a proven vendor rather than trying to hire and build in-house.
- Drawback for scalability: Lack of control over the technical aspects of your product, and a tendency to be less agile in the market.
3. Freelance hiring
The benefit of hiring one-off freelancers is that you can access worldwide talent and select exactly the skills you need. However, this option comes with major liability concerns for mid-large companies. On the whole, only small startups should use individual freelancer contracts to build an IT team.
- Benefit for scalability: Low costs and flexibility on hiring/firing.
- Drawback for scalability: Legal liability from operating individual contracts and difficulty sourcing/managing talent.
4. IT Staff Augmentation
Staff augmentation is exploding in popularity post-pandemic; in this model, a company maintains a small “core” team locally in their office, and contracts out the “heavy lifting” to an IT staffing company.
This is one of our most popular services here at Rootstrap, where we support high-scale companies like Masterclass using a staff augmentation model.
- Benefit for scalability: Rapid on-boarding of staff, flexibility to scale the team up or down as needed.
- Drawback for scalability: Reliance on a third-party team for building and maintaining critical business functions.
Staff augmentation is particularly well-suited when companies need to rapidly hire several developers with specific low-supply skills such as React Native.
How to handle rapid user growth: two specific techniques
Before we close out, let’s take a look at two specific examples of how high-scale companies achieve and manage scalability:
- Preparing for server outages under heavy load
- Avoiding sudden unsustainable spikes in traffic
Chaos Monkey: how Netflix prepares for server outages
Rather than waiting for servers to go down and fixing them as it happens, Netflix actually built an internal program that makes server instances go down on purpose.
The application, Chaos Monkey, is designed to help Netflix engineers build systems that react well to random outages.
This approach allows them to experience a few small, controllable issues in the short term to avoid large, uncontrolled outages in the future.
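The idea can be illustrated with a small simulation. This is a toy sketch of the general technique, not Netflix’s actual Chaos Monkey implementation: Service, run_chaos_round, and the instance names are hypothetical and stand in for real infrastructure.

```python
import random


class Service:
    """A toy service backed by several interchangeable instances."""

    def __init__(self, instance_names):
        self.running = set(instance_names)

    def terminate(self, name):
        self.running.discard(name)

    def handle_request(self):
        # The service stays up as long as at least one instance is running.
        if not self.running:
            raise RuntimeError("service outage: no instances running")
        return "ok"


def run_chaos_round(service, rng):
    """Terminate one randomly chosen running instance and return its name."""
    victim = rng.choice(sorted(service.running))
    service.terminate(victim)
    return victim


rng = random.Random(42)  # fixed seed so the experiment is repeatable
service = Service(["app-1", "app-2", "app-3"])

victim = run_chaos_round(service, rng)
response = service.handle_request()  # still succeeds with two instances left
print(f"terminated {victim}, service responded: {response}")
```

If a request fails after a single instance is terminated, the team has found a weak point on their own schedule rather than during a real outage.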
Artificial Scarcity: how apps like Clubhouse build hype while buying time
When Elon Musk joined Clubhouse, thousands of listeners found themselves locked out of the room and unable to attend.
Millions more hadn’t even been able to secure an invite to log into Clubhouse at all.
Clubhouse executive Sriram Krishnan took to Twitter to assure users that they had it under control; but what’s really surprising is that an app that was in the news every day for months during Covid lockdowns hadn’t suffered more outages.
The secret is that limiting growth with an exclusive invite strategy isn’t a bug; it’s a feature.
Companies like Clubhouse know that if they admit everyone who wants to use their app all at once, they’ll be overloaded with scalability issues and see a giant spike followed by a large dip in active users, which is as disappointing for investors as it is for users.
Clubhouse isn’t the only trending app to use this approach. The elite email client Superhuman is another example of a product using controlled hype and exclusive access to make sure growth stays exponential, or at least linear, rather than spiky and inconsistent with large downtrends. Gmail also used this technique when they launched back in 2004.
Facebook also famously used this method to control growth, iterating on their product as they moved from campus to campus until the product was ready for “prime time” scaled release to the general public in the mid-2000s.
- Good: controlling user signups with invites to ensure the app performs well and growth compounds up and to the right.
- Bad: letting everyone sign up at once, resulting in laggy app performance and fewer product iterations before the app goes mainstream.
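Here is a minimal sketch of how invite-gated growth might be modeled in code. It is an illustration under simple assumptions and does not describe how Clubhouse, Superhuman, Gmail, or Facebook actually implemented their waitlists; InviteGate, the capacity figures, and the user names are all hypothetical.

```python
from collections import deque


class InviteGate:
    """Admits waitlisted users in batches, never exceeding current capacity."""

    def __init__(self, capacity):
        self.capacity = capacity      # users the system can comfortably serve
        self.active = []              # users who have been admitted
        self.waitlist = deque()       # users waiting for an invite

    def request_access(self, user):
        self.waitlist.append(user)

    def raise_capacity(self, new_capacity):
        # Called once infrastructure has been scaled up to handle more users.
        self.capacity = new_capacity

    def send_invites(self, batch_size):
        """Admit up to batch_size users, first come first served."""
        admitted = []
        while (
            self.waitlist
            and len(admitted) < batch_size
            and len(self.active) < self.capacity
        ):
            user = self.waitlist.popleft()
            self.active.append(user)
            admitted.append(user)
        return admitted


gate = InviteGate(capacity=3)
for user in ["ana", "ben", "cy", "dee", "eli"]:
    gate.request_access(user)

first_wave = gate.send_invites(batch_size=2)    # two users admitted
second_wave = gate.send_invites(batch_size=2)   # only one: capacity reached
gate.raise_capacity(5)                          # infrastructure catches up
third_wave = gate.send_invites(batch_size=2)    # remaining users admitted
```

Sign-ups never outrun what the system can serve, and each increase in capacity becomes a deliberate decision rather than an emergency.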
Closing Advice for fast-growth teams
In closing, companies that successfully scale to a million users and beyond have these key characteristics:
- Strong DevOps pipeline and high code quality.
- Well-architected system design on robust infrastructure.
- An experienced technical team and the ability to rapidly onboard new developers as needed.
- A plan for controlling growth and avoiding unsustainable traffic spikes.
For first-time founders, the most important thing to remember is to “test before you invest.” It’s better to quickly rebuild a scrappy app that is seeing rapid growth than to overbuild a beautiful scaled system that no one actually signs up for.
You don’t build a Boeing to fly 15 miles. First, you build a car. Once you’ve gone as far as it can go, then you bring on engineers and start to scale.