Redis or Memcached for caching

Memcached or Redis? It’s a question that nearly always arises in any discussion about squeezing more performance out of a modern, database-driven Web application. When performance needs to be improved, caching is often the first step taken, and Memcached or Redis are typically the first places to turn.

These renowned cache engines share a number of similarities, but they also have important differences. Redis, the newer and more versatile of the two, is almost always the superior choice.

The similarities

Let’s start with the similarities. Both Memcached and Redis are in-memory, key-value data stores, although Redis is more accurately described as a data structure store. Both belong to the NoSQL family of data management solutions, and both keep all data in RAM, which of course makes them supremely useful as a caching layer. In terms of performance, the two data stores are also remarkably similar, exhibiting almost identical throughput and latency characteristics.

Both Memcached and Redis are mature and hugely popular open source projects. Memcached was originally developed by Brad Fitzpatrick in 2003 for the LiveJournal website. Since then, Memcached has been rewritten in C (the original implementation was in Perl) and put in the public domain, where it has become a cornerstone of modern Web applications. Current development of Memcached is focused on stability and optimizations rather than adding new features.

Redis was created by Salvatore Sanfilippo in 2009, and Sanfilippo remains the lead developer of the project today. Redis is sometimes described as “Memcached on steroids,” which is hardly surprising considering that parts of Redis were built in response to lessons learned from using Memcached. Redis has more features than Memcached and is, thus, more powerful and flexible.

Used by many companies and in countless mission-critical production environments, both Memcached and Redis are supported by client libraries in every conceivable programming language, and they’re included in a multitude of packages for developers. In fact, it’s a rare Web stack that does not include built-in support for either Memcached or Redis.

Why are Memcached and Redis so popular? Not only are they extremely effective, they’re also relatively simple. Getting started with either Memcached or Redis is considered easy work for a developer. It takes only a few minutes to set up and get them working with an application. Thus, a small investment of time and effort can have an immediate, dramatic impact on performance — usually by orders of magnitude. A simple solution with a huge benefit; that’s as close to magic as you can get.
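
To make that concrete, here is a minimal cache-aside sketch in Python, assuming a local Redis server on the default port and the redis-py client; load_user_from_db is a hypothetical stand-in for a slow database query.

    import json
    import redis  # redis-py client, assumed installed

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)

    def get_user(user_id):
        """Cache-aside lookup: try Redis first, fall back to the database."""
        key = f"user:{user_id}"
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)           # cache hit
        user = load_user_from_db(user_id)       # hypothetical slow database call
        r.set(key, json.dumps(user), ex=300)    # cache the result for five minutes
        return user

The same pattern works with Memcached by swapping in a Memcached client; the caching layer itself is only a handful of lines.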

When to use Memcached

Because Redis is newer and has more features than Memcached, Redis is almost always the better choice. However, Memcached can be preferable when caching relatively small and static data, such as HTML code fragments. Memcached’s internal memory management, while not as sophisticated as that of Redis, is more efficient in the simplest use cases because it consumes comparatively less memory for metadata. Strings (the only data type supported by Memcached) are ideal for storing data that’s only read, because strings require no further processing.

That said, Memcached’s memory management efficiency diminishes quickly when data size is dynamic, at which point Memcached’s memory can become fragmented. Also, large data sets often involve serialized data, which always requires more space to store. While Memcached is effectively limited to storing data in its serialized form, the data structures in Redis can store any aspect of the data natively, thus reducing serialization overhead.
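
As an illustration of the simple-string case where Memcached shines, the following sketch caches a rendered HTML fragment, assuming a local Memcached instance on port 11211 and the pymemcache client; render_sidebar is a hypothetical template-rendering function.

    from pymemcache.client.base import Client  # pymemcache, assumed installed

    mc = Client(("localhost", 11211))

    def get_sidebar_html():
        """Cache a rendered HTML fragment as a plain string."""
        html = mc.get("fragment:sidebar")
        if html is not None:
            return html.decode("utf-8")                    # pymemcache returns bytes
        html = render_sidebar()                            # hypothetical expensive render
        mc.set("fragment:sidebar", html.encode("utf-8"), expire=600)
        return html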

The second scenario in which Memcached has an advantage over Redis is scaling. Because Memcached is multithreaded, you can easily scale it up by giving it more computational resources, but scaling it out across more nodes means remapping keys, so you will lose part or all of the cached data (depending on whether you use consistent hashing). Redis, which is mostly single-threaded, can scale horizontally via clustering without loss of data. Clustering is an effective scaling solution, but it is comparatively more complex to set up and operate.
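
To see why scaling out a Memcached pool can invalidate cached data, consider this hypothetical sketch of naive modulo sharding, which is exactly the problem consistent hashing is designed to mitigate:

    import hashlib

    def node_for(key, nodes):
        """Naive modulo sharding: adding a node remaps most keys."""
        digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return nodes[digest % len(nodes)]

    nodes_before = ["cache1", "cache2", "cache3"]
    nodes_after = nodes_before + ["cache4"]   # scale out by one node

    keys = [f"user:{i}" for i in range(10_000)]
    moved = sum(node_for(k, nodes_before) != node_for(k, nodes_after) for k in keys)
    print(f"{moved / len(keys):.0%} of keys now map to a different node")

Roughly three-quarters of the keys land on a different node and their cached values are effectively lost; a consistent-hashing client keeps that fraction close to 1/n (here about a quarter).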

When to use Redis

You’ll almost always want to use Redis because of its data structures. With Redis as a cache, you gain a lot of power (such as the ability to fine-tune cache contents and durability) and greater efficiency overall. Once you start using its data structures, the efficiency boost for specific application scenarios becomes tremendous.

Redis’ superiority is evident in almost every aspect of cache management. Caches employ a mechanism called data eviction to make room for new data by deleting old data from memory. Memcached’s data eviction mechanism employs a Least Recently Used algorithm and somewhat arbitrarily evicts data that’s similar in size to the new data.

Redis, by contrast, allows fine-grained control over eviction, letting you choose from six different eviction policies. Redis also employs more sophisticated approaches to memory management and eviction candidate selection. It supports both lazy and active eviction: data can be evicted only when more space is needed, or proactively, ahead of demand. Memcached, on the other hand, provides lazy eviction only.
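
For instance, a cache-oriented deployment might cap memory and pick an eviction policy as in the minimal sketch below, assuming redis-py and a server that allows CONFIG SET at runtime (the same settings normally live in redis.conf):

    import redis

    r = redis.Redis()  # local Redis on the default port, assumed

    # Cap memory use and choose what to evict once the cap is reached.
    r.config_set("maxmemory", "256mb")
    r.config_set("maxmemory-policy", "allkeys-lru")   # evict least recently used keys

    print(r.config_get("maxmemory-policy"))           # confirm the active policy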

Redis gives you much greater flexibility regarding the objects you can cache. While Memcached limits key names to 250 bytes and works with plain strings only, Redis allows key names and values to be as large as 512MB each, and they are binary safe. Plus, Redis has five primary data structures to choose from, opening up a world of possibilities to the application developer through intelligent caching and manipulation of cached data.

Beyond caching

Using Redis data structures can simplify and optimize several tasks, not only while caching, but even when you want the data to be persistent and always available. For example, instead of storing objects as serialized strings, developers can use a Redis Hash to store an object’s fields and values and manage them under a single key. A Redis Hash spares developers the need to fetch the entire string, deserialize it, update a value, reserialize the object, and replace the entire string in the cache for every trivial update, which means lower resource consumption and increased performance.
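
A minimal sketch of that difference, assuming redis-py; the key name and fields are illustrative:

    import redis

    r = redis.Redis(decode_responses=True)

    # Store the object's fields natively in a hash under a single key...
    r.hset("user:42", mapping={"name": "Ada", "plan": "pro", "visits": 0})

    # ...then read or update individual fields without fetching, deserializing,
    # and rewriting the whole object.
    r.hincrby("user:42", "visits", 1)    # atomic in-place update of one field
    print(r.hget("user:42", "plan"))     # fetch just the field you need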

Other data structures offered by Redis (such as lists, sets, sorted sets, hyperloglogs, bitmaps, and geospatial indexes) can be used to implement even more complex scenarios. Using sorted sets for time-series data ingestion and analysis is another example where a Redis data structure offers enormously reduced complexity and lower bandwidth consumption.
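
As a sketch of the sorted-set approach (redis-py assumed, key and events made up), scores hold timestamps so range queries over time windows run entirely on the server:

    import time
    import redis

    r = redis.Redis(decode_responses=True)
    now = time.time()

    # Ingest events into a sorted set, using the timestamp as the score.
    r.zadd("pageviews", {"event:1001": now - 90, "event:1002": now - 30, "event:1003": now})

    # Query a time window directly in Redis: everything from the last minute.
    print(r.zrangebyscore("pageviews", now - 60, now, withscores=True))

    # Trim entries older than an hour to bound memory use.
    r.zremrangebyscore("pageviews", "-inf", now - 3600)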

Another important advantage of Redis is that the data it stores isn’t opaque, so the server can manipulate it directly. A considerable share of the 180-plus commands available in Redis are devoted to data processing operations and embedding logic in the data store itself via server-side Lua scripting. These built-in commands and user scripts give you the flexibility of handling data processing tasks directly in Redis without having to ship data across the network to another system for processing.
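
As one illustration (not a pattern from the article), the hypothetical rate limiter below runs its check-and-increment logic as a Lua script inside Redis, so no intermediate data crosses the network; redis-py assumed.

    import redis

    r = redis.Redis(decode_responses=True)

    # Increment a per-client counter and set its expiry atomically on the server.
    RATE_LIMIT_LUA = """
    local current = redis.call('INCR', KEYS[1])
    if current == 1 then
        redis.call('EXPIRE', KEYS[1], ARGV[1])
    end
    return current
    """

    rate_limit = r.register_script(RATE_LIMIT_LUA)
    hits = rate_limit(keys=["rate:api:client-7"], args=[60])   # 60-second window
    print(hits)   # number of calls seen in the current window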

Redis offers optional and tunable data persistence designed to bootstrap the cache after a planned shutdown or an unplanned failure. While we tend to regard the data in caches as volatile and transient, persisting data to disk can be quite valuable in caching scenarios. Having the cache’s data available for loading immediately after restart allows for much shorter cache warm-up and removes the load involved in repopulating and recalculating cache contents from the primary data store.
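
A minimal sketch of turning on persistence at runtime, assuming redis-py and a server that permits CONFIG SET (these settings normally belong in redis.conf):

    import redis

    r = redis.Redis()

    # Append-only-file persistence lets the cache reload its contents after a restart.
    r.config_set("appendonly", "yes")
    r.config_set("appendfsync", "everysec")   # fsync once per second, a common durability/throughput trade-off

    # RDB snapshots can run alongside (or instead of) the AOF.
    r.bgsave()                                # request a background snapshot now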

Data replication too

Redis can also replicate the data that it manages. Replication can be used for implementing a highly available cache setup that can withstand failures and provide uninterrupted service to the application. A cache failure falls only slightly short of application failure in terms of the impact on user experience and application performance, so having a proven solution that guarantees the cache’s contents and service availability is a major advantage in most cases.
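
A minimal sketch of wiring up a replica with redis-py; the hostnames are hypothetical, and production setups typically add Redis Sentinel or Redis Cluster on top for automatic failover.

    import redis

    # Point a replica at the primary cache node (hypothetical hosts).
    replica = redis.Redis(host="cache-replica.internal", port=6379)
    replica.slaveof("cache-primary.internal", 6379)    # start replicating from the primary

    # Reads can be served from the replica, and it can take over if the primary fails.
    print(replica.info("replication")["role"])         # reports "slave" once replication starts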

Last but not least, in terms of operational visibility, Redis provides a slew of metrics and a wealth of introspective commands with which to monitor and track usage and abnormal behavior. Real-time statistics about every aspect of the database, the display of all commands being executed, the listing and managing of client connections — Redis has all that and more.
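
For example, a few of those introspective commands through redis-py (the MONITOR command, which streams every executed command, is usually run from redis-cli instead):

    import redis

    r = redis.Redis(decode_responses=True)

    stats = r.info()   # real-time server statistics
    print(stats["used_memory_human"], stats["keyspace_hits"], stats["keyspace_misses"])

    for client in r.client_list():         # list (and, if needed, manage) client connections
        print(client["addr"], client["cmd"])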

When developers realize the effectiveness of Redis’ persistence and in-memory replication capabilities, they often use it as a first-responder database, usually to analyze and process high-velocity data and provide responses to the user while a secondary (often slower) database maintains a historical record of what happened. When used in this manner, Redis can also be ideal for analytics use cases.

Redis for analytics

Three analytics scenarios come immediately to mind. In the first scenario, when using something like Apache Spark to iteratively process large data sets, you can use Redis as a serving layer for data previously calculated by Spark. In the second scenario, using Redis as your shared, in-memory, distributed data store can accelerate Spark processing speeds by a factor of 45 to 100. Finally, an all too common scenario is one in which reports and analytics need to be customizable by the user, but retrieving data from inherently batch data stores (like Hadoop or an RDBMS) takes too long. In this case, an in-memory data structure store such as Redis is the only practical way of getting submillisecond paging and response times.
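
A sketch of the serving-layer idea under stated assumptions: the sorted set below stands in for a result precomputed by a batch job (say, a Spark run), and pages of it are served straight from memory; the key name and figures are made up.

    import redis

    r = redis.Redis(decode_responses=True)

    # A precomputed, report-ready result set written by an upstream batch job.
    r.zadd("report:revenue-by-account", {"acct:a": 1200, "acct:b": 875, "acct:c": 4310})

    def report_page(page, per_page=50):
        """Serve one page of a user-customizable report directly from memory."""
        start = page * per_page
        return r.zrevrange("report:revenue-by-account", start, start + per_page - 1,
                           withscores=True)

    print(report_page(0))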

When using extremely large operational data sets or analytics workloads, running everything in-memory might not be cost effective. To achieve submillisecond performance at lower cost, Redis Labs created a version of Redis that runs on a combination of RAM and flash, with the option to configure RAM-to-flash ratios. While this opens up several new avenues to accelerate workload processing, it also gives developers the option to simply run their “cache on flash.”

Open source software continues to provide some of the best technologies available today. When it comes to boosting application performance through caching, Redis and Memcached are the most established and production-proven candidates. However, given its richer functionality, more advanced design, many potential uses, and greater cost efficiency at scale, Redis should be your first choice in nearly every case.

Top Linux Server Distributions

You know that Linux is hot in the data center. You know it can save you money in licensing and maintenance costs. But that still leaves the question of what your best options are for Linux as a server operating system.

We have listed the top Linux Server distributions based on the following characteristics:

  1. Ease of installation and use
  2. Cost
  3. Available commercial support
  4. Data center reliability

Ubuntu LTS

At the top of almost every Linux-related list, the Debian-based Ubuntu is in a class by itself. Canonical’s Ubuntu surpasses all other Linux server distributions — from its simple installation to its excellent hardware discovery to its world-class commercial support, Ubuntu sets a strong standard that is hard to match.

The latest release of Ubuntu, Ubuntu 16.04 LTS “Xenial Xerus,” debuted in April 2016 and ups the ante with OpenStack Mitaka support, the LXD pure-container hypervisor, and Snappy, an optimized packaging system developed specifically for working with newer trends and technologies such as containers, mobile and the Internet of Things (IoT).

The LTS in Ubuntu 16.04 LTS stands for Long Term Support. The LTS versions are released every two years and include five years of commercial support for the Ubuntu Server edition.

Red Hat Enterprise Linux

While Red Hat started out as the “little Linux company that could,” its Red Hat Enterprise Linux (RHEL) server operating system is now a major force in the quest for data center rackspace. The Linux darling of large companies throughout the world, Red Hat delivers innovations and non-stop support, including ten years of support for major releases, that will keep you coming back for more.

RHEL is based on the community-driven Fedora, which Red Hat sponsors. Fedora is updated more frequently than RHEL and serves as more of a bleeding-edge Linux distro in terms of features and technology, but it doesn’t offer the stability or the length and quality of commercial support that RHEL is renowned for.

In development since 2010, Red Hat Enterprise Linux 7 (RHEL 7) made its official debut in June 2014, and the major update offers scalability improvements for enterprises, including a new filesystem that can scale to 500 terabytes, as well as support for Docker container virtualization technology. The most recent release of RHEL, version 7.2, arrived in November 2015.

SUSE Linux Enterprise Server

The Micro Focus-owned (but independently operated) SUSE Linux Enterprise Server (SLES) is stable, easy to maintain and offers 24×7 rapid-response support for those who don’t have the time or patience for lengthy troubleshooting calls. And the SUSE consulting teams will have you meeting your SLAs and making your accountants happy to boot.

Similar to how Red Hat’s RHEL is based on the open-source Fedora distribution, SLES is based on the open-source openSUSE Linux distro, with SLES focusing on stability and support over leading-edge features and technologies.

The most recent major release, SUSE Linux Enterprise Server 12 (SLES 12), debuted in late October 2014 and introduced new features such as a framework for Docker, full system rollback, live kernel patching enablement and software modules for “increasing data center uptime, improving operational efficiency and accelerating the adoption of open source innovation,” according to SUSE.

SLES 12 SP1 (Service Pack 1) followed the initial SLES 12 release in December 2015, adding support for Docker, Network Teaming, Shibboleth and JeOS images.

CentOS

If you operate a website through a web hosting company, there’s a very good chance your web server is powered by CentOS Linux. This low-cost clone of Red Hat Enterprise Linux isn’t strictly commercial, but since it’s based on RHEL, you can leverage commercial support for it.

Short for Community Enterprise Operating System, CentOS has largely operated as a community-driven project that used the RHEL code, removed all of Red Hat’s trademarks, and made the Linux server OS available for free use and distribution. In 2014 the focus shifted after Red Hat and CentOS announced they would collaborate going forward, with CentOS serving to address the gap between the community-innovation-focused Fedora platform and the enterprise-grade, commercially deployed Red Hat Enterprise Linux platform.

CentOS will continue to deliver a community-oriented operating system with a mission of helping users develop and adopt open source technologies on a Linux server distribution that is more consistent and conservative than the more innovative Fedora. At the same time, CentOS will remain free, with support provided by the community-led CentOS project rather than through Red Hat. CentOS released CentOS 7.2, which is derived from Red Hat Enterprise Linux 7.2, in December 2015.

Debian

If you’re confused by Debian’s inclusion here, don’t be. Debian doesn’t have formal commercial support, but you can connect with Debian-savvy consultants around the world via their Consultants page. Debian originated in 1993 and has spawned more child distributions than any other parent Linux distribution, including Ubuntu, Linux Mint and Vyatta.

Debian remains a popular option for those who value stability over the latest features. The latest major stable version of Debian, Debian 8 “jessie,” was released in April 2015, and it will be supported for five years.

Debian 8 marks the switch from the old SysVinit init system to systemd, and includes the latest releases of the Linux kernel, Apache, LibreOffice, Perl, Python, Xen Hypervisor, GNU Compiler Collection and the GNOME and Xfce desktop environments. The latest update for Debian 8, version 8.4, debuted on April 2, 2016.

Oracle Linux

If you didn’t know that Oracle produces its own Linux distribution, you’re not alone. Oracle Linux (formerly Oracle Enterprise Linux) is Red Hat Enterprise Linux fortified with Oracle’s own special Kool-Aid as well as various Oracle logos and art added in.

Oracle’s Linux competes directly with Red Hat’s Linux server distributions, and does so quite effectively since purchased support through Oracle is half the price of Red Hat’s equivalent model.

Optimized for Oracle’s database services, Oracle Linux is a heavy contender in the enterprise Linux market. If you run Oracle databases and want to run them on Linux, you know the drill: Call Oracle.

The latest release of Oracle Linux, version 7.2, arrived in November 2015 and is based on RHEL 7.2.

Mageia / Mandriva

Mageia is an open-source-based fork of Mandriva Linux that made its debut in 2011. The most recent release, Mageia 5, became available in June 2015, and Mageia 6 is expected to debut in late June 2016.

For U.S.-based executive or technical folks, Mageia and its predecessor Mandriva might be a bit foreign. The incredibly well-constructed Mandriva Linux distribution hails from France and enjoys wide acceptance in Europe and South America. The Mandriva name and its construction derive from the Mandrake Linux and Conectiva Linux distributions.

Mageia maintains the strengths of Mandriva while continuing its development with new features and capabilities, as well as support from the community organization Mageia.Org. Mageia updates are typically released on a 9-month release cycle, with each release supported for two cycles (18 months).

As for Mandriva Linux, the Mandriva SA company continues its business Linux server projects, which are now based on Mageia code.