Scaling is the ability to handle more traffic (more page views, more simultaneous users) without service degradation. You need it when your service works well and attracts new users, when your growth is viral, when your app appears on a TV show, or just after you send an email to your 250k prospects. To be able to scale, you and your app must be prepared. While many factors can hinder your ability to scale, in this article we’ll cover the 12 most common ones and how to fix them.
Many cloud platforms, among them Platforms as a Service like Scalingo, enforce good behaviors. Those behaviors make your app “cloud ready” and help make it scalable. While most of the 12 reasons are not tied to a specific language or technology, we’ll try to give you language-specific clues when appropriate. Use this article as your checklist to really enjoy your journey on a Platform as a Service (or any cloud platform). If you correct your app’s behaviors, it will be able to scale and, at the same time, it will be runnable pretty much anywhere.
While there are two main types of scaling, vertical scaling (using a more powerful box than the one currently running your app) and horizontal scaling (using more instances running your app), the following reasons apply to both.
First things first. It should be obvious but it’s always better to say it: if you want your app to scale, you have to measure its resource consumption in order to know when you should scale and what the “scaling pattern” of your app is. By scaling pattern, I mean that every app has its own way of scaling. You cannot just take one pattern you discovered in an app and apply it blindly to another app. Resource consumption should include the CPU, RAM and swap usage of your application code and databases, as well as the network traffic in and out of your app, to your end users or to external resources.
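To make this concrete, here is a minimal sketch of measuring a single piece of work from inside the app, using only the Python standard library. The `measure` helper and the `build_report` workload are hypothetical names made up for this illustration, and `tracemalloc` only sees memory allocated by Python itself. A real setup would rely on your platform's metrics or a monitoring agent instead.

```python
import time
import tracemalloc


def measure(task, *args):
    """Run task(*args) and report wall time, CPU time and peak Python memory."""
    tracemalloc.start()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = task(*args)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, {
        "wall_seconds": wall_seconds,
        "cpu_seconds": cpu_seconds,
        "peak_bytes": peak_bytes,
    }


def build_report(n):
    # Hypothetical workload standing in for a request handler.
    return sum(len(str(i)) for i in range(n))


if __name__ == "__main__":
    result, metrics = measure(build_report, 100_000)
    print(result, metrics)
```

Collecting numbers like these over time, per endpoint or per job, is what lets you see which resource runs out first when traffic grows.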
The easiest way to scale is not to scale at all (!), by saving web requests. One way to do this is to minify and combine your static assets. Static assets include the CSS stylesheets, JavaScript files and images used by your User Interface (UI). Minifying strips unnecessary characters (whitespace, comments) from each file, and it usually goes hand in hand with combining several files into one, so instead of one web request per file, you end up with just one web request in total. It’s very easy for CSS and JavaScript files. Tools like Rails’s asset pipeline or Grunt can help you do this.
Combining several images into one CSS sprite is a little bit more complicated and may not be feasible at all. However, when your traffic is in the millions of web requests, it’s the little addition that can count a lot!
As a corollary to the previous point, once your assets are minified they’d better be served by a CDN (Content Delivery Network). CDNs are built to scale, cheap and easy to use. A CDN will send your assets more quickly to your users by caching them all over the world and detecting which cache is the closest to each of your end users. CloudFront and Cloudflare are two very good CDNs.
Sending fewer files, with fewer bytes per file, and sending them more quickly will also improve the experience of your users.
Of course you think that you’ve done it in your app. But don’t be so sure.
But first, let’s describe the problem. Many databases include the concept of indexes. An index is a way to help the database reach the specific data you’re looking for with your (SQL) queries. Creating one is a manual step a developer has to take to tell the database which data has to be indexed.
Although the problem is generally well understood, you still see the “index problem” in the wild because it can take very different forms. The first one, and the most obvious, is that you forget to set up some indexes and, as soon as your app handles a little more traffic than usual, it becomes extremely slow. The problem of missing indexes arises when you cross a certain threshold in terms of the number of elements in a collection (or table). Other cases involve operations that you designed long after the indexes were created, and only some of those operations are slow. Again, if only a few operations run in parallel, you won’t notice that the indexes are not in place or that you could design more complex indexes.
And do you know that some databases have geospatial indexes? Stop reinventing the wheel :)
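If you’re not sure whether a query uses an index, ask the database. The sketch below uses Python’s built-in `sqlite3` module purely for illustration, because it needs no server. The `users` table, the `idx_users_email` index and the `query_plan` helper are made-up names, and the exact wording of the plan depends on the SQLite version. Most databases offer an equivalent `EXPLAIN` command.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's textual query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup = "SELECT name FROM users WHERE email = ?"
args = ("user500@example.com",)

plan_before = query_plan(conn, lookup, args)   # no index yet: table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, args)    # the new index is used

print("before:", plan_before)
print("after: ", plan_after)
```

Running the same check on the slow queries of your own app is a cheap way to find the indexes you forgot.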
External resources are all the external things your app needs to work properly apart from your own code. It could mean a database, a full-text search engine, an external storage facility (AWS S3), APIs (Facebook, Twitter), etc. In an era where everything is an API, your app certainly depends on external resources. Even if your code can scale very well, maybe your external resources can’t. Maybe you depend on an internal microservice that cannot handle the load you will send to it.
Or maybe it’s not the external resource itself that cannot scale but how you access it. This problem is especially true for languages or technologies that were not built with high concurrency in mind, like PHP, Ruby, etc. It’s the primary reason for the adoption of Node.js, Go or Elixir/Phoenix.
To improve the concurrency of your app, the most common answer is to take the blocking parts of your code and make them asynchronous. To illustrate this point, let’s take two things your app surely does and that should certainly be done asynchronously: sending emails and creating image thumbnails. Indeed, if you do it synchronously, every time an action triggered by a user sends an email, your app server has to wait for your email server to respond and process the email. In the Ruby world, Sidekiq is the most famous library to handle things asynchronously.
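The idea can be sketched in a few lines of Python: the request handler only puts a job on a queue and returns, while a background worker does the slow part. This is an in-process illustration using the standard library, not how Sidekiq works internally. In production the queue would typically live in an external store and the workers in separate processes. `send_email` and `handle_signup` are hypothetical names, and the `time.sleep` call stands in for a slow mail server.

```python
import queue
import threading
import time

jobs = queue.Queue()
sent = []


def send_email(recipient):
    # Stand-in for a slow call to a mail server (hypothetical).
    time.sleep(0.05)
    sent.append(recipient)


def worker():
    # Runs in the background and processes jobs one by one.
    while True:
        recipient = jobs.get()
        if recipient is None:      # sentinel: stop the worker
            jobs.task_done()
            break
        send_email(recipient)
        jobs.task_done()


def handle_signup(recipient):
    # The request handler only enqueues the job and returns immediately.
    jobs.put(recipient)
    return "signup accepted"


thread = threading.Thread(target=worker, daemon=True)
thread.start()

responses = [handle_signup(f"user{i}@example.com") for i in range(5)]

jobs.join()          # wait for the background work (for this demo only)
jobs.put(None)       # ask the worker to stop
thread.join()
print(responses, sent)
```

The user gets a response as soon as the job is queued. The email goes out a moment later, without keeping a web worker busy.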
Imagine your app receives some requests: for example, a user browsing your website. At some point, this user authenticates to your app. On each page they visit, you want them to stay logged in. Your app puts some information in memory and exchanges a unique cookie with this user, effectively creating a user session. Every time this user visits a page, their browser sends back this cookie and you can retrieve their personal information from the session memory.
Now you want to scale: instead of one server, you need two. If each instance has its own session memory, depending on which instance receives the traffic, the user won’t be authenticated. User sessions must be shared among all servers (or instances). The easiest way to achieve this is to save them in a database. Even though you can technically use any database, you will usually choose a very fast in-memory key/value database like Redis which works really well for this use case.
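Here is a minimal sketch of the difference, with a hypothetical `AppInstance` class standing in for one running copy of your app. A plain Python dict plays the role of the shared store, which is an assumption made to keep the example self-contained. In a real deployment it would be a client talking to something like Redis.

```python
import uuid


class AppInstance:
    """One running copy of the app; `sessions` is where it keeps session data."""

    def __init__(self, sessions):
        self.sessions = sessions

    def login(self, username):
        session_id = uuid.uuid4().hex          # value sent back as a cookie
        self.sessions[session_id] = {"user": username}
        return session_id

    def current_user(self, session_id):
        session = self.sessions.get(session_id)
        return session["user"] if session else None


# Each instance keeps sessions in its own memory: the second one
# does not know the session created by the first.
a, b = AppInstance({}), AppInstance({})
cookie = a.login("alice")
local_result = b.current_user(cookie)

# Both instances use the same store (a plain dict standing in for Redis).
shared_store = {}
c, d = AppInstance(shared_store), AppInstance(shared_store)
cookie = c.login("alice")
shared_result = d.current_user(cookie)

print(local_result, shared_result)   # prints: None alice
```

With the shared store, it no longer matters which instance receives the next request.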
Writing to a filesystem, i.e. accessing a disk drive, is one of the slowest things an application can do. Indeed, in a cloud environment you won’t have access to an ultra-fast SSD directly attached to your machine. That’s why you want to avoid it as much as possible. For example, don’t store your assets (images, PDFs, Excel files) on a disk drive. Send them to an external service specifically built for this task, like Amazon S3. Also, don’t use SQLite (!). Don’t store or read configuration files; use environment variables or databases. Likewise, don’t let your app fill its RAM and swap. It can be a nightmare.
While it helps you scale, having fewer moving pieces will also help you move your application code from one place to another.
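Reading configuration from environment variables instead of files can be sketched as follows. The variable names (`DATABASE_URL`, `PORT`, `DEBUG`) and the default values are assumptions chosen for this example, not requirements of any particular platform.

```python
import os


def load_config(environ=os.environ):
    """Build the app configuration from environment variables."""
    return {
        # Hypothetical variable names; defaults are for local development.
        "database_url": environ.get("DATABASE_URL", "postgres://localhost/dev"),
        "port": int(environ.get("PORT", "3000")),
        "debug": environ.get("DEBUG", "false").lower() == "true",
    }


if __name__ == "__main__":
    print(load_config())
```

Because nothing is read from disk, the same code can run unchanged on your laptop, on a second server or on a PaaS. Only the environment differs.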
Scaling should be painless and nearly instant. During the build stage, your app is built and all its dependencies (libraries, assets) are fetched, prepared and packaged. The run stage is when you actually run your application code. Those two stages are often one and the same when you start working on your app.
The problem arises when you add one or more servers to the mix. All the dependencies of your app must then be fetched on each server of your cluster. The same goes for the asset building phase. This increases the time needed to scale. While tools like Capistrano will help you deploy faster, they won’t help you separate the build and run stages.
That’s why your build and run stages should be independent from each other. Docker helps you separate the build and run stages.
Scaling implies booting new servers or instances. Once they have booted, you have to provision them, i.e. install all the software and libraries your app needs to run correctly on them. Manually executing each step to provision a new box is too cumbersome, slow and error-prone. You have to automate provisioning, either via shell scripts or, better, via an automation tool like Chef or Puppet.
Inspecting logs is not really a mundane task. But sometimes you will have to do it: to trace back bad behaviors of your code or malicious usage of your app. At first things are simple: you have just one instance running your application code. To inspect the logs, you SSH to your server and access the file directly. As soon as you scale, things get more complicated. How do you find the one trace you need on many servers? You must have a centralized way to aggregate your logs.
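One common way to make aggregation possible is to have every instance write structured log lines to its standard output, tagged with the instance that produced them, and let the platform or a log collector gather them in one place. The sketch below shows the application side only. The `INSTANCE_ID` variable, the `web-1` default and the JSON field names are assumptions made for the example.

```python
import json
import logging
import os
import sys


class JsonFormatter(logging.Formatter):
    """Format each record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            # Hypothetical variable identifying the instance that logged.
            "instance": os.environ.get("INSTANCE_ID", "web-1"),
            "message": record.getMessage(),
        })


def build_logger(stream=sys.stdout):
    logger = logging.getLogger("app")
    logger.handlers.clear()            # avoid duplicate handlers on re-setup
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    build_logger().info("user %s logged in", "alice")
```

Once all instances log this way, finding one trace becomes a search on the aggregated stream rather than an SSH session per server.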
Database migration (or schema migration) refers to the management of incremental, reversible changes to relational database schemas (thanks Wikipedia). A migration is a one-time task performed just after a new release of your app. It’s a series of SQL commands, like CREATE or DROP, usually wrapped in an SQL transaction. While it takes only a few milliseconds on a small database, it can lock entire tables for a very long time once your database has reached a few dozen GB of data. Specific tools are then needed to achieve zero-downtime schema migrations. The most recent one is gh-ost from the fine GitHub folks.
We’ve seen a bunch of stuff that should be fixed in your app and infrastructure before thinking of scaling. I hope that you’re not too gloomy: you don’t have to do all this stuff at once! As further reading, you should take a look at the Twelve-Factor App manifesto.
Furthermore, hosting your app on a Platform as a Service like Scalingo will help you a lot with scaling. Points #1 (Metrics view), #9 (Base Docker Image), #10 and #11 are all included in the platform, so there is no need to worry about them. However, for your app to behave correctly on a PaaS, you must first fix points #7 and #8.
And because you have read this entire blog post to the end, you can save 20€ on your next bills with the ATTENTIVEREADER voucher code. Sign up now!