Encryption at Rest by default for all databases

Encryption at rest has been available on demand through support for a few months now. We recently decided to enable it by default for all new databases. In this article, you will learn what encryption at rest is and why it’s important for your business. We will also cover how we implemented it on Scalingo and why we decided to enable it by default for all our customers.

The security of your database’s data is a top priority for us. Database access is always encrypted with TLS when reached from outside Scalingo’s datacenter, and TLS can be enforced even inside the infrastructure. Authentication credentials are always generated randomly with a strong password. Communication between members of a database cluster is also encrypted by default. Of course, database backups are encrypted and stored externally. Moreover, our infrastructure provider contractually guarantees that when a disk goes to maintenance, all its data is erased. Encryption at rest is one more safeguard we can add on top of these measures.

What is Encryption at Rest?

Encryption at rest is the encryption of database data when it’s persisted on disk. When encryption at rest is enabled, the data stored on disk is unreadable to someone who has physical access to the disk but not to the encryption keys.

The purpose of encryption at rest is to protect the data stored on disk against an attacker who obtains physical access to the hardware. In such a scenario, a server’s hard drive may be compromised during maintenance, allowing an attacker to read the stored bytes. With encryption at rest enabled, data is encrypted before being written to disk. If an attacker obtains a hard drive containing encrypted data but has no access to the encryption keys, recovering useful data is nearly impossible.

For this reason, we first developed encryption at rest for customers requiring top-notch security for their data.

Scalingo’s Implementation

Some databases support encryption at rest natively (e.g. PostgreSQL or MongoDB with the WiredTiger storage engine). However, these solutions are specific to each technology, and our goal was to enable this feature for all the databases supported on the platform. We chose to leverage a feature of the Linux kernel that encrypts data at the disk level: dm-crypt.

This subsystem has been available since version 2.6 of Linux (released in 2003). It uses the device mapper infrastructure to present disk-like devices to the user, where all read and write operations transparently go through decryption and encryption functions. Such encrypted partitions follow the LUKS (Linux Unified Key Setup) format and are commonly managed with the command line tool cryptsetup.

In addition, the CPUs used in our infrastructure are Intel Xeon E5 processors, which provide dedicated instructions for AES encryption (AES-NI). Linux uses these instructions automatically to get the best possible performance.
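To check whether a given Linux host exposes these instructions, one option is to look for the `aes` flag in `/proc/cpuinfo`. The following is a minimal sketch, not part of Scalingo's tooling; the function name `cpu_has_aes` is hypothetical and the check assumes the x86 `/proc/cpuinfo` layout.

```python
def cpu_has_aes(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line of /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only inspect the CPU feature list, not e.g. the model name.
        if sep and key.strip() == "flags" and "aes" in value.split():
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AES instructions available:", cpu_has_aes(f.read()))
    except OSError:
        # Assumption: the file only exists on Linux hosts.
        print("/proc/cpuinfo is not readable on this host")
```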

The cipher used is aes-xts-plain64 with a 256-bit key, and SHA-256 as the hash function. This setup is considered secure and is standard in the industry. To reduce the attack surface, each instance of each database has its own cryptographic key to protect its data, so getting access to one key wouldn’t allow an attacker to read the plain data of another database. The keys are stored in a database which is itself encrypted and protected by authentication.
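As an illustration of how these parameters map onto cryptsetup, here is a hedged sketch that only builds the command lines and never runs them. It is not Scalingo's actual provisioning code: the helper names, the device path, the key file path and the mapping name are hypothetical, and using a random key file per volume is an assumption about how a "one key per database instance" policy could be applied.

```python
import secrets
from typing import List


def new_key_material(num_bytes: int = 64) -> bytes:
    """Generate random key-file content for one database instance (assumed size)."""
    return secrets.token_bytes(num_bytes)


def luks_format_command(device: str, key_file: str) -> List[str]:
    """Build a cryptsetup call that formats `device` as a LUKS volume."""
    return [
        "cryptsetup", "luksFormat",
        "--batch-mode",                 # do not ask for interactive confirmation
        "--cipher", "aes-xts-plain64",  # cipher named in the article
        "--key-size", "256",            # key size in bits
        "--hash", "sha256",             # hash named in the article
        "--key-file", key_file,         # unlocks the volume instead of a passphrase
        device,
    ]


def luks_open_command(device: str, name: str, key_file: str) -> List[str]:
    """Build a cryptsetup call that maps `device` to /dev/mapper/<name>."""
    return ["cryptsetup", "open", "--type", "luks",
            "--key-file", key_file, device, name]


if __name__ == "__main__":
    # Hypothetical paths, printed only; running them would require root.
    print(" ".join(luks_format_command("/dev/sdX1", "/run/keys/db-1234.key")))
    print(" ".join(luks_open_command("/dev/sdX1", "db-1234", "/run/keys/db-1234.key")))
```

Once opened, the mapped device can be formatted and mounted like any other block device, which is what makes the approach independent of the database engine.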

Impact on the Performance

Before enabling it by default for all databases, we needed to ensure that the impact on application performance and on the infrastructure was not too significant. We ran a series of experiments to build confidence about the impact of encryption at rest on performance.

Low-level Experiments

We first had a look at the impact of encryption at rest on raw operations, like reads and writes, without the overhead of a database. We used the tool dd to read and write bytes on an encrypted volume and an unencrypted volume. This tool outputs the number of bytes per second that were read or written. We used this information to compare the performance on both encrypted and unencrypted volumes.
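For readers who want to reproduce this kind of measurement without dd, the sketch below is a rough Python stand-in: it writes a fixed amount of data to a temporary file in a given directory and reports bytes per second. It is not the procedure we ran; the function name and the default sizes are assumptions, and the result depends heavily on caching, so the file is flushed and synced before the timer stops.

```python
import os
import tempfile
import time


def write_throughput(directory: str,
                     total_bytes: int = 64 * 1024 * 1024,
                     block_size: int = 1024 * 1024) -> float:
    """Write `total_bytes` to a temp file in `directory`; return bytes per second."""
    block = os.urandom(block_size)
    written = 0
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        start = time.perf_counter()
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        # Force the data out of Python's and the kernel's buffers.
        f.flush()
        os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    return written / elapsed


if __name__ == "__main__":
    # Point this at a directory on the volume to measure.
    rate = write_throughput(tempfile.gettempdir())
    print(f"{rate / 1e6:.1f} MB/s")
```

Running it once on a mount point backed by an encrypted volume and once on an unencrypted one gives two figures that can be compared.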

We executed each experiment 15 times to gain confidence in the reproducibility of the results. We found that the difference between writing to an encrypted and an unencrypted volume ranges from 0% to 2.8%. We even had a few runs where the writes to an encrypted volume were slightly faster than the same writes to an unencrypted volume!
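One simple way to express such a difference is the relative slowdown of the mean throughput. The sketch below assumes this definition, which may differ from the exact computation we used, and the numbers in the usage example are made up for illustration, not taken from our experiments.

```python
from statistics import mean
from typing import Sequence


def relative_slowdown(unencrypted: Sequence[float],
                      encrypted: Sequence[float]) -> float:
    """Percentage by which mean encrypted throughput is below the unencrypted mean.

    A negative value means the encrypted runs were faster on average.
    """
    base = mean(unencrypted)
    return (base - mean(encrypted)) / base * 100.0


if __name__ == "__main__":
    # Illustrative throughput samples in MB/s (hypothetical values).
    plain_runs = [200.0, 198.0, 202.0]
    encrypted_runs = [196.0, 195.0, 197.0]
    print(f"slowdown: {relative_slowdown(plain_runs, encrypted_runs):.1f}%")
```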

High-level Experiments

The low-level overhead is reasonably low, but we wondered whether adding the database layer would change the impact on read or write operations. We ran experiments on both MongoDB and PostgreSQL using YCSB (Yahoo! Cloud Serving Benchmark) and pgbench, respectively. These tools let us run read-only and read-write benchmarks on the databases and output the number of operations per second achieved.
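To give an idea of what a pgbench run involves, here is a minimal sketch that builds a pgbench command line and extracts the transactions-per-second figure from its output. The helper names, the database name and the sample output are hypothetical, this is not our benchmark harness, and the target database is assumed to have been initialized beforehand with `pgbench -i`.

```python
import re
from typing import List, Optional

# pgbench reports lines such as "tps = 845.5 (...)"; some versions print two
# such lines, in which case this sketch keeps the first one.
_TPS_RE = re.compile(r"^tps = ([0-9]+(?:\.[0-9]+)?)", re.MULTILINE)


def pgbench_command(dbname: str, duration_s: int,
                    read_only: bool, clients: int = 1) -> List[str]:
    """Build a pgbench call running for `duration_s` seconds."""
    cmd = ["pgbench", "-T", str(duration_s), "-c", str(clients)]
    if read_only:
        cmd.append("-S")  # built-in select-only script
    cmd.append(dbname)
    return cmd


def parse_tps(output: str) -> Optional[float]:
    """Return the first tps value found in pgbench output, or None."""
    match = _TPS_RE.search(output)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    print(" ".join(pgbench_command("benchdb", 60, read_only=True)))
    sample = "latency average = 1.2 ms\ntps = 845.5 (without initial connection time)\n"
    print(parse_tps(sample))
```

The command can be passed to `subprocess.run(..., capture_output=True, text=True)` and its standard output handed to `parse_tps`.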

We executed each experiment 10 times. Each execution lasted about 1 hour. The results with this setup are in line with the low-level experiments: the performance difference between operations on an encrypted and an unencrypted volume is acceptable, from 4% to 9%.

We also looked at the impact on memory and CPU consumption. Only CPU usage seems to be affected: it is 3.70% higher when reading or writing on an encrypted volume than for the same operations on an unencrypted volume.

Final Words: Activation by Default

The impact of encryption at rest on performance is rather low compared to the benefits for our customers’ privacy. Hence, we decided to enable it for all newly created databases. This change took effect on the 6th of January 2019. For databases created before this date, encryption at rest is available on demand through the embedded support chat of the web dashboard, or by email to support@scalingo.com. Activating encryption at rest requires a short downtime for your database, from seconds to minutes depending on the amount of data.

Photo by Chris Barbalis on Unsplash