Introducing High Availability for Scalingo Elasticsearch

Elasticsearch is a search engine capable of indexing and searching complex, schema-free JSON documents. It has been available on Scalingo for more than 4 years now. Its true magic resides in its flexible, transparent and powerful clustering capabilities. As of today, Elasticsearch clusters are generally available on Scalingo, adding High Availability (HA) and better performance to our current single-node offering.

What Is an Elasticsearch Cluster?

Elasticsearch is a special type of database, optimized for full-text search. At its core, Elasticsearch embeds the concepts of clustering and sharding:

  • An Elasticsearch cluster is composed of one or many nodes.
  • When adding an index to Elasticsearch, it is split into multiple shards.
  • Each of these shards can be replicated amongst the nodes.

Each node in the cluster is aware of the cluster topology and is able to redirect a request about an index to the correct node.
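To make sharding and replication more concrete, here is a minimal Python sketch. It is a toy model, not Elasticsearch's actual implementation: the function names, the CRC32 hash (Elasticsearch uses its own hash function on the routing value) and the round-robin placement are all assumptions chosen for illustration.

```python
import zlib
from typing import Dict, List


def shard_for(doc_id: str, primary_shards: int) -> int:
    """Pick a shard for a document: hash of the routing key modulo the shard count."""
    # CRC32 is only a stand-in hash for this sketch.
    return zlib.crc32(doc_id.encode("utf-8")) % primary_shards


def place_shards(nodes: List[str], primary_shards: int, replicas: int) -> Dict[int, List[str]]:
    """Assign every copy of each shard to a distinct node, round-robin.

    The first node of each list holds the primary, the others hold replicas.
    """
    copies = replicas + 1
    if copies > len(nodes):
        raise ValueError("not enough nodes to keep every copy on a distinct node")
    return {
        shard: [nodes[(shard + i) % len(nodes)] for i in range(copies)]
        for shard in range(primary_shards)
    }


def nodes_for(doc_id: str, placement: Dict[int, List[str]]) -> List[str]:
    """With the shared placement map, any node can tell where a document lives."""
    return placement[shard_for(doc_id, len(placement))]


def surviving_copies(placement: Dict[int, List[str]], failed_node: str) -> Dict[int, List[str]]:
    """Remove a failed node and list which nodes still hold each shard."""
    return {
        shard: [node for node in holders if node != failed_node]
        for shard, holders in placement.items()
    }


placement = place_shards(["node-1", "node-2", "node-3"], primary_shards=5, replicas=1)
after_failure = surviving_copies(placement, "node-2")
```

In this model, with three nodes and one replica per shard, every shard lives on two distinct nodes, so losing any single node still leaves at least one copy of every shard.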

Using a multinode cluster provides redundancy and improved availability.

Elasticsearch Cluster on Scalingo

When you provision an Elasticsearch cluster on Scalingo, we start three Elasticsearch nodes to spread the indices on two of them. Thanks to this replication setup, if a node fails for whatever reason, your database will still be accessible without any downtime. From the client’s point of view (usually your application), there’s a single entry point to the cluster, which takes care of load balancing between all the nodes of the cluster. Thus, the environment variable injected into the context of your application still contains only one IP address.
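From the application's side, the single entry point means the connection code only ever deals with one URL. The sketch below reads that URL from the environment and splits it into its parts. The variable name `SCALINGO_ELASTICSEARCH_URL` and the example URL are assumptions for illustration; check your application's environment for the actual name and value.

```python
import os
from urllib.parse import urlsplit


def read_es_endpoint(env=os.environ, var_name="SCALINGO_ELASTICSEARCH_URL"):
    """Return the connection parts of the single Elasticsearch entry point."""
    raw = env.get(var_name)
    if not raw:
        raise RuntimeError(f"{var_name} is not set")
    parts = urlsplit(raw)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "username": parts.username,
        "password": parts.password,
    }


# Made-up URL for the example; the real one is injected by the platform.
example_env = {"SCALINGO_ELASTICSEARCH_URL": "http://user:secret@10.0.0.12:30000"}
endpoint = read_es_endpoint(example_env)
```

Because the cluster sits behind one address, this code is the same whether the database runs on one node or three.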

In order to get that running magically, a hell of a lot of work needed to be done! Here is the target cluster setup we provide:

Scalingo Elasticsearch network setup

When your application queries the Elasticsearch database, it enters the private network through a unique entry point, internally called the Gateway, which forwards the requests to the Elasticsearch nodes and handles their authentication. To prevent this gateway from being a single point of failure, a second one is started as a failover in case the first one goes down. We ensure the failover mechanism with LinK, a virtual IP manager. Currently, those entry points are HAProxy servers.
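The failover idea can be pictured with a toy model. This is not how LinK is implemented (its internals are not described here); the gateway names and the priority-order rule are assumptions that only illustrate why clients keep a single address while the gateway behind it changes.

```python
from typing import Dict, List, Optional


def holder_of_virtual_ip(gateways: List[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the gateway that should hold the virtual IP.

    Toy rule: the first healthy gateway in priority order wins.
    Returns None when no gateway is healthy.
    """
    for gateway in gateways:
        if healthy.get(gateway, False):
            return gateway
    return None


gateways = ["gateway-1", "gateway-2"]  # hypothetical names
normal = holder_of_virtual_ip(gateways, {"gateway-1": True, "gateway-2": True})
failover = holder_of_virtual_ip(gateways, {"gateway-1": False, "gateway-2": True})
```

In normal operation the first gateway answers on the virtual IP; when it fails, the second one takes the address over and the application keeps using the same endpoint.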

To increase the security of your cluster, nodes are booted in a dedicated private network based on VXLAN with the help of SAND, an autonomous service managing overlay networks.

SAND and LinK have been developed in-house to enhance the security and availability of the platform. Both projects have been open-sourced under the MIT license. We’ll dig deeper into these two projects in the coming weeks.

TLS Everywhere

As usual, the security of the communication to the database and between the cluster members is a top priority for us. To ensure that communication between the Elasticsearch nodes, and between the Elasticsearch nodes and the gateway, is encrypted, we use the community edition of the Search Guard extension. It implements transport layer encryption and REST layer encryption (TLS).

As usual, by default your database is not reachable from outside Scalingo’s network. As announced last year, to make it reachable from anywhere around the world, you first need to force TLS communication to connect to the Elasticsearch cluster.
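On the client side, talking to the cluster over TLS can be sketched with Python's standard library alone. The host name and port below are placeholders, not real Scalingo endpoints, and the helper names are hypothetical.

```python
import http.client
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a TLS context that verifies the server certificate and host name."""
    return ssl.create_default_context()


def open_tls_connection(host: str, port: int, timeout: float = 10.0) -> http.client.HTTPSConnection:
    """Prepare an HTTPS connection; no network traffic happens until a request is sent."""
    return http.client.HTTPSConnection(host, port, timeout=timeout, context=make_tls_context())


context = make_tls_context()
conn = open_tls_connection("my-db.example.invalid", 443)
# conn.request("GET", "/_cluster/health") would perform the TLS handshake and
# send the request; it is left out so that the sketch runs offline.
```

Using the default context keeps certificate and host name verification switched on, which is what makes the encrypted channel trustworthy and not merely encrypted.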

How to Take Advantage of Elasticsearch Clusters?

This feature is available as part of new plans for your database. If you start a new database, you will be prompted to choose a plan. All the Business plans include a cluster with three nodes holding the data, and two gateway nodes. The first cluster-ready version is 5.5.3-7. It is the 7th Scalingo revision of version 5.5.3 of Elasticsearch.

For an existing database, getting a cluster takes a few steps. First, upgrade to the latest 5.5.3-7 version. It will start your database as a mono-node cluster (i.e. an Elasticsearch node in its own private network with a single gateway). Then you can change the plan in the “Addons” section of your application to choose a Business plan. Your database will migrate seamlessly from a mono-node cluster to a three-node cluster.

After upgrading to a Business plan, your database is highly available and Scalingo ensures 99.99% availability of your data. Moreover, updating to a more recent version is achieved with zero downtime!
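To put the 99.99% figure in perspective, the following sketch converts an availability percentage into a downtime budget. The measurement windows (a 365-day year and a 30-day month) are assumptions; the period over which availability is measured is not stated here.

```python
from decimal import Decimal


def downtime_budget_minutes(availability_percent: str, window_days: int) -> Decimal:
    """Maximum downtime, in minutes, allowed by an availability target over a window."""
    unavailable_fraction = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    window_minutes = Decimal(window_days) * 24 * 60
    return unavailable_fraction * window_minutes


per_year = downtime_budget_minutes("99.99", 365)   # 52.56 minutes
per_month = downtime_budget_minutes("99.99", 30)   # 4.32 minutes
```

Under those assumptions, 99.99% leaves a budget of roughly 52 and a half minutes of unavailability per year, or a little over 4 minutes per month.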

These changes also affect the way automated backups are done. Indeed, Elasticsearch’s backup API requires storage shared between all members of the cluster. We chose to upload all backups to an external object storage thanks to the S3 Repository Plugin. These backups flow through an internal proxy to be encrypted. Our Data Processing Agreement has been updated accordingly. Unfortunately, with this setup, it is no longer possible for you to download your backup. However, we added a restore feature so that you can restore the database from a previous backup. You don’t have to worry about data portability: your Elasticsearch database is completely standard (the base Docker images are open source) and you can retrieve all your data by querying it.

The dashboard for your database has been updated to help you grasp your Elasticsearch cluster’s vital statistics, topology and health:
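Retrieving all your data by querying can be done, for instance, with Elasticsearch's scroll API, sketched below. The `post` callable is a hypothetical stand-in for whatever HTTP client you use: it takes a path and a JSON body and returns the decoded JSON response. The index name, the page size and the helper name `export_all` are illustrative.

```python
from typing import Any, Callable, Dict, Iterator

Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def export_all(post: Post, index: str, page_size: int = 500,
               keep_alive: str = "1m") -> Iterator[Dict[str, Any]]:
    """Yield every document of an index, page by page, using the scroll API."""
    # The first request opens a scroll context and returns the first page.
    page = post(
        f"/{index}/_search?scroll={keep_alive}",
        {"size": page_size, "query": {"match_all": {}}},
    )
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            yield hit["_source"]
        # Ask for the next page with the scroll id returned by the previous one.
        page = post("/_search/scroll", {"scroll": keep_alive, "scroll_id": page["_scroll_id"]})
```

A typical use would be to loop over `export_all(post, "articles")` and write each document to a file as one JSON line. The scroll context expires on its own once `keep_alive` has elapsed without a new request.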

Database dashboard revamped

Necessary pricing change

Since not everybody needs a cluster or can afford a high availability setup, Scalingo Elasticsearch pricing has been completely overhauled.

There are 3 pricing categories:

  • Sandbox contains only 1 plan and should only be used for development purposes. No SLA is offered on this plan.
  • Starter plans offer mono-node cluster deployment (no high availability) but all of them include automatic daily backups.
  • Business plans are the ones you’re looking for if you’re doing production stuff: cluster all the way and improved availability.

You’ll find the new pricing grid below:

Scalingo Elasticsearch new pricing plans

As of today, all old plans are obsolete and are no longer available for provisioning new add-ons. In addition, all Scalingo Elasticsearch add-ons running the old free tier plan will be discontinued on March 31st and automatically converted to the new Sandbox plan.

If you have questions or remarks regarding the new pricing plans or the new cluster feature, don’t hesitate to contact us. We’d love to hear your feedback!

What’s coming up next?

We have now developed all the necessary plumbing to seal private networks with SAND and to provide highly available floating IP addresses with LinK. This will enable us to provide more high-value services in the future. Among them, Redis and PostgreSQL high availability are knocking at Scalingo’s door. Stay tuned over the next couple of months!

Photo by Gabriel Sollmann on Unsplash