An Introduction to Cloud Hosting

Introduction

Cloud hosting is a method of using online virtual servers that can be created, modified, and destroyed on demand. Cloud servers are allocated resources like CPU cores and memory by the physical servers that host them and can be configured with a developer’s choice of operating system and accompanying software. Cloud hosting can be used for hosting websites, sending and storing email, and distributing web-based applications and other services.

In this guide, we will go over some of the basic concepts involved in cloud hosting, including how virtualization works, the components in a virtual environment, and comparisons with other common hosting methods.

What is “the Cloud”?

“The Cloud” is a common term that refers to servers connected to the Internet that are available for public use, either through paid leasing or as part of a software or platform service. A cloud-based service can take many forms, including web hosting, file hosting and sharing, and software distribution. “The Cloud” can also be used to refer to cloud computing, which is the practice of using several servers linked together to share the workload of a task. Instead of running a complex process on a single powerful machine, cloud computing distributes the task across many smaller computers.

Other Hosting Methods

Cloud hosting is just one of many different types of hosting available to customers and developers today, though there are some key differences between them. Traditionally, sites and apps with low budgets and low traffic would use shared hosting, while more demanding workloads would be hosted on dedicated servers.

Shared hosting is the most common and most affordable way to get a small and simple site up and running. In this scenario, hundreds or thousands of sites share a common pool of server resources, like memory and CPU. Shared hosting tends to offer the most basic and inflexible feature and pricing structures, as access to the site’s underlying software is very limited due to the shared nature of the server.

Dedicated hosting is when a physical server machine is sold or leased to a single client. This is more flexible than shared hosting, as a developer has full control over the server’s hardware, operating system, and software configuration. Dedicated servers are common among more demanding applications, such as enterprise software and commercial services like social media, online games, and development platforms.

How Virtualization Works

Cloud hosting environments are broken down into two main parts: the virtual servers that apps and websites can be hosted on and the physical hosts that manage the virtual servers. This virtualization is what is behind the features and advantages of cloud hosting: the relationship between host and virtual server provides flexibility and scaling that are not available through other hosting methods.

Virtual Servers

The most common form of cloud hosting today is the use of a virtual private server, or VPS. A VPS is a virtual server that acts like a real computer with its own operating system. While virtual servers share resources that are allocated to them by the host, their software is well isolated, so operations on one VPS won’t affect the others.

Virtual servers are deployed and managed by the hypervisor of a physical host. Each virtual server has an operating system installed by the hypervisor and available to the user to add software on top of. For many practical purposes, a virtual server is identical in use to a dedicated physical server, though performance may be lower in some cases due to the virtual server sharing physical hardware resources with other servers on the same host.

Hosts

Resources are allocated to a virtual server by the physical server that it is hosted on. This host uses a software layer called a hypervisor to deploy, manage, and grant resources to the virtual servers that are under its control. The term “hypervisor” is often used to refer to the physical hosts that hypervisors (and their virtual servers) are installed on.

The host is in charge of allocating memory, CPU cores, and a network connection to a virtual server when one is launched. An ongoing duty of the hypervisor is to schedule processes between the virtual CPU cores and the physical ones, since multiple virtual servers may be utilizing the same physical cores. The method of choice for process scheduling is one of the key differences between different hypervisors.

Hypervisors

There are a few common hypervisors available for cloud hosts today. These virtualization methods have some key differences, but they all provide the tools that a host needs to deploy, maintain, move, and destroy virtual servers as needed.

KVM, short for “Kernel-based Virtual Machine”, is a virtualization infrastructure that is built into the Linux kernel. When activated, this kernel module turns the Linux machine into a hypervisor, allowing it to begin hosting virtual servers. This approach contrasts with how other hypervisors usually work, as KVM does not need to create or emulate kernel components that are used for virtual hosting.
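
Before relying on KVM, you can check whether a given Linux host supports it. A minimal sketch (the commands are standard, though output varies by distribution):

egrep -c '(vmx|svm)' /proc/cpuinfo   # a non-zero count means the CPU exposes virtualization extensions
lsmod | grep kvm                     # lists kvm_intel or kvm_amd when the KVM modules are loaded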

Xen is one of the most common hypervisors in use today. Unlike KVM, Xen uses a microkernel, which provides the tools needed to support virtual servers without modifying the host’s kernel. Xen supports two distinct methods of virtualization: paravirtualization, which skips the need to emulate hardware but requires special modifications made to the virtual servers’ operating system, and hardware-assisted virtualization, which uses special hardware features to efficiently emulate a virtual server so that it can run an unmodified operating system.

ESXi is an enterprise-level hypervisor offered by VMware. ESXi is unique in that it doesn’t require the host to have an underlying operating system. This is referred to as a “type 1” hypervisor and is extremely efficient due to the lack of a “middleman” between the hardware and the virtual servers. With type 1 hypervisors like ESXi, no operating system needs to be loaded on the host because the hypervisor itself acts as the operating system.

Hyper-V is one of the most popular methods of virtualizing Windows servers and is available as a system service in Windows Server. This makes Hyper-V a common choice for developers working within a Windows software environment. Hyper-V is included in Windows Server 2008 and 2012 and is also available as a stand-alone server without an existing installation of Windows Server.

Why Cloud Hosting?

The features offered by virtualization lend themselves well to a cloud hosting environment. Virtual servers can be configured with a wide range of hardware resource allocations, and can often have resources added or removed as needs change over time. Some cloud hosts can move a virtual server from one hypervisor to another with little or no downtime or duplicate the server for redundancy in case of a node failure.

Customization

Developers often prefer to work in a VPS due to the control that they have over the virtual environment. Most virtual servers running Linux offer access to the root (administrator) account or sudo privileges by default, giving a developer the ability to install and modify whatever software they need.

This freedom of choice begins with the operating system. Most hypervisors are capable of hosting nearly any guest operating system, from open source software like Linux and BSD to proprietary systems like Windows. From there, developers can begin installing and configuring the building blocks needed for whatever they are working on. A cloud server’s configurations might involve a web server, database, email service, or an app that has been developed and is ready for distribution.
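
As a brief illustration, installing those building blocks on a Debian-based virtual server takes only a couple of package commands (nginx and mysql-server here are just example choices for a web server and database):

sudo apt-get update                       # refresh the package index
sudo apt-get install nginx mysql-server   # install the example web server and database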

Scalability

Cloud servers are very flexible in their ability to scale. Scaling methods fall into two broad categories: horizontal scaling and vertical scaling. Most hosting methods can scale one way or the other, but cloud hosting is unique in its ability to scale both horizontally and vertically. This is due to the virtual environment that a cloud server is built on: since its resources are an allocated portion of a larger physical pool, it’s easy to adjust these resources or duplicate the virtual image to other hypervisors.

Horizontal scaling, often referred to as “scaling out”, is the process of adding more nodes to a clustered system. This might involve adding more web servers to better manage traffic, adding new servers to a region to reduce latency, or adding more database workers to increase data transfer speed. Many newer web utilities, like CoreOS, Docker, and Couchbase, are built around efficient horizontal scaling.

Vertical scaling, or “scaling up”, is when a single server is upgraded with additional resources. This might be an expansion of available memory, an allocation of more CPU cores, or some other upgrade that increases that server’s capacity. These upgrades usually pave the way for additional software instances, like database workers, to operate on that server. Before horizontal scaling became cost-effective, vertical scaling was the method of choice to respond to increasing demand.

With cloud hosting, developers can scale depending on their application’s needs — they can scale out by deploying additional VPS nodes, scale up by upgrading existing servers, or do both when server needs have dramatically increased.

Conclusion

By now, you should have a decent understanding of how cloud hosting works, including the relationship between hypervisors and the virtual servers that they are responsible for, as well as how cloud hosting compares to other common hosting methods. With this information in mind, you can choose the best hosting for your needs.

Redis or Memcached for caching

Memcached or Redis? It’s a question that nearly always arises in any discussion about squeezing more performance out of a modern, database-driven Web application. When performance needs to be improved, caching is often the first step taken, and Memcached or Redis are typically the first places to turn.

These renowned cache engines share a number of similarities, but they also have important differences. Redis, the newer and more versatile of the two, is almost always the superior choice.

The similarities

Let’s start with the similarities. Both Memcached and Redis serve as in-memory, key-value data stores, although Redis is more accurately described as a data structure store. Both Memcached and Redis belong to the NoSQL family of data management solutions, and both are based on a key-value data model. They both keep all data in RAM, which of course makes them supremely useful as a caching layer. In terms of performance, the two data stores are also remarkably similar, exhibiting almost identical characteristics (and metrics) with respect to throughput and latency.

Both Memcached and Redis are mature and hugely popular open source projects. Memcached was originally developed by Brad Fitzpatrick in 2003 for the LiveJournal website. Since then, Memcached has been rewritten in C (the original implementation was in Perl) and released as open source, where it has become a cornerstone of modern Web applications. Current development of Memcached is focused on stability and optimizations rather than adding new features.

Redis was created by Salvatore Sanfilippo in 2009, and Sanfilippo remains the lead developer of the project today. Redis is sometimes described as “Memcached on steroids,” which is hardly surprising considering that parts of Redis were built in response to lessons learned from using Memcached. Redis has more features than Memcached and is, thus, more powerful and flexible.

Used by many companies and in countless mission-critical production environments, both Memcached and Redis are supported by client libraries in every conceivable programming language and are included in a multitude of packages for developers. In fact, it’s a rare Web stack that does not include built-in support for either Memcached or Redis.

Why are Memcached and Redis so popular? Not only are they extremely effective, they’re also relatively simple. Getting started with either Memcached or Redis is considered easy work for a developer. It takes only a few minutes to set up and get them working with an application. Thus, a small investment of time and effort can have an immediate, dramatic impact on performance — usually by orders of magnitude. A simple solution with a huge benefit; that’s as close to magic as you can get.
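
To make that claim concrete, here is a minimal sketch of getting Redis answering requests on a Debian-based system (package and service names vary slightly between distributions):

sudo apt-get install redis-server   # install and start the Redis server
redis-cli ping                      # the server replies PONG when it is up
redis-cli set greeting "hello"      # store a value under the key "greeting"
redis-cli get greeting              # fetch it back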

When to use Memcached

Because Redis is newer and has more features than Memcached, Redis is almost always the better choice. However, Memcached could be preferable when caching relatively small and static data, such as HTML code fragments. Memcached’s internal memory management, while not as sophisticated as that of Redis, is more efficient in the simplest use cases because it consumes comparatively less memory for metadata. Strings (the only data type supported by Memcached) are ideal for storing data that’s only read, because strings require no further processing.

That said, Memcached’s memory management efficiency diminishes quickly when data size is dynamic, at which point Memcached’s memory can become fragmented. Also, large data sets often involve serialized data, which always requires more space to store. While Memcached is effectively limited to storing data in its serialized form, the data structures in Redis can store any aspect of the data natively, thus reducing serialization overhead.

The second scenario in which Memcached has an advantage over Redis is in scaling. Because Memcached is multithreaded, you can easily scale up by giving it more computational resources, but you will lose part or all of the cached data (depending on whether you use consistent hashing). Redis, which is mostly single-threaded, can scale horizontally via clustering without loss of data. Clustering is an effective scaling solution, but it is comparatively more complex to set up and operate.

When to use Redis

You’ll almost always want to use Redis because of its data structures. With Redis as a cache, you gain a lot of power (such as the ability to fine-tune cache contents and durability) and greater efficiency overall. Once you use the data structures, the efficiency boost becomes tremendous for specific application scenarios.

Redis’ superiority is evident in almost every aspect of cache management. Caches employ a mechanism called data eviction to make room for new data by deleting old data from memory. Memcached’s data eviction mechanism employs a Least Recently Used algorithm and somewhat arbitrarily evicts data that’s similar in size to the new data.

Redis, by contrast, allows for fine-grained control over eviction, letting you choose from six different eviction policies. Redis also employs more sophisticated approaches to memory management and eviction candidate selection. Redis supports both lazy and active eviction: data is evicted only when more space is needed, or proactively. Memcached, on the other hand, provides lazy eviction only.
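
For example, to cap Redis memory usage and evict the least recently used keys when the cap is reached, two settings suffice (shown here at runtime with redis-cli; the same values can go in redis.conf):

redis-cli config set maxmemory 100mb                # limit the dataset to 100MB
redis-cli config set maxmemory-policy allkeys-lru   # evict least recently used keys when full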

Redis gives you much greater flexibility regarding the objects you can cache. While Memcached limits key names to 250 bytes and works with plain strings only, Redis allows key names and values to be as large as 512MB each, and they are binary safe. Plus, Redis has five primary data structures to choose from, opening up a world of possibilities to the application developer through intelligent caching and manipulation of cached data.

Beyond caching

Using Redis data structures can simplify and optimize several tasks, not only while caching, but even when you want the data to be persistent and always available. For example, instead of storing objects as serialized strings, developers can use a Redis Hash to store an object’s fields and values and manage them using a single key. Redis Hashes save developers the need to fetch the entire string, deserialize it, update a value, reserialize the object, and replace the entire string in the cache for every trivial update, which means lower resource consumption and increased performance.
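
A quick sketch of the difference (the key and field names are illustrative):

redis-cli hset user:1000 name "Alice"   # store one field of the object
redis-cli hset user:1000 visits 1       # store another field
redis-cli hincrby user:1000 visits 1    # update a single field in place, with no fetch/deserialize cycle
redis-cli hgetall user:1000             # read the whole object back when needed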

Other data structures offered by Redis (such as lists, sets, sorted sets, hyperloglogs, bitmaps, and geospatial indexes) can be used to implement even more complex scenarios. Sorted sets for time-series data ingestion and analysis are another example of a Redis data structure that offers enormously reduced complexity and lower bandwidth consumption.
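
As a sketch of the time-series pattern, Unix timestamps can serve as sorted set scores, so a range query by time becomes a single command (the key and member names are made up for illustration):

redis-cli zadd readings 1700000000 "sensor1:21.5"        # score = timestamp, member = sample
redis-cli zadd readings 1700000060 "sensor1:21.7"
redis-cli zrangebyscore readings 1700000000 1700003600   # all samples in a one-hour window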

Another important advantage of Redis is that the data it stores isn’t opaque, so the server can manipulate it directly. A considerable share of the 180-plus commands available in Redis are devoted to data processing operations and embedding logic in the data store itself via server-side Lua scripting. These built-in commands and user scripts give you the flexibility of handling data processing tasks directly in Redis without having to ship data across the network to another system for processing.
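
A classic example of such server-side logic is an atomic compare-and-delete, which runs as a single Lua script inside Redis (the key and token here are hypothetical):

# delete lock:job42 only if it still holds token123, in one atomic step
redis-cli eval "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end" 1 lock:job42 token123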

Redis offers optional and tunable data persistence designed to bootstrap the cache after a planned shutdown or an unplanned failure. While we tend to regard the data in caches as volatile and transient, persisting data to disk can be quite valuable in caching scenarios. Having the cache’s data available for loading immediately after restart allows for much shorter cache warm-up and removes the load involved in repopulating and recalculating cache contents from the primary data store.
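
Both persistence mechanisms can be exercised at runtime. A minimal sketch:

redis-cli bgsave                      # write an RDB snapshot to disk in the background
redis-cli config set appendonly yes   # enable the append-only file for finer-grained durability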

Data replication too

Redis can also replicate the data that it manages. Replication can be used for implementing a highly available cache setup that can withstand failures and provide uninterrupted service to the application. A cache failure falls only slightly short of application failure in terms of the impact on user experience and application performance, so having a proven solution that guarantees the cache’s contents and service availability is a major advantage in most cases.

Last but not least, in terms of operational visibility, Redis provides a slew of metrics and a wealth of introspective commands with which to monitor and track usage and abnormal behavior. Real-time statistics about every aspect of the database, the display of all commands being executed, the listing and managing of client connections — Redis has all that and more.

When developers realize the effectiveness of Redis’ persistence and in-memory replication capabilities, they often use it as a first-responder database, usually to analyze and process high-velocity data and provide responses to the user while a secondary (often slower) database maintains a historical record of what happened. When used in this manner, Redis can also be ideal for analytics use cases.

Redis for analytics

Three analytics scenarios come immediately to mind. In the first scenario, when using something like Apache Spark to iteratively process large data sets, you can use Redis as a serving layer for data previously calculated by Spark. In the second scenario, using Redis as your shared, in-memory, distributed data store can accelerate Spark processing speeds by a factor of 45 to 100. Finally, an all too common scenario is one in which reports and analytics need to be customizable by the user, but retrieving data from inherently batch data stores (like Hadoop or an RDBMS) takes too long. In this case, an in-memory data structure store such as Redis is the only practical way of getting submillisecond paging and response times.

When using extremely large operational data sets or analytics workloads, running everything in-memory might not be cost effective. To achieve submillisecond performance at lower cost, Redis Labs created a version of Redis that runs on a combination of RAM and flash, with the option to configure RAM-to-flash ratios. While this opens up several new avenues to accelerate workload processing, it also gives developers the option to simply run their “cache on flash.”

Open source software continues to provide some of the best technologies available today. When it comes to boosting application performance through caching, Redis and Memcached are the most established and production-proven candidates. However, given its richer functionality, more advanced design, many potential uses, and greater cost efficiency at scale, Redis should be your first choice in nearly every case.

An Introduction to Securing your Linux VPS

Introduction

Taking control of your own Linux server is an opportunity to try new things and leverage the power and flexibility of a great platform. However, Linux server administrators must take the same caution that is appropriate with any network-connected machine to keep it secure and safe.

There are many different security topics that fall under the general category of “Linux security” and many opinions as to what an appropriate level of security looks like for a Linux server.

The main thing to take away from this is that you will have to decide for yourself what security protections will be necessary. Before you do this, you should be aware of the risks and trade-offs, and decide on the balance between usability and security that makes sense for you.

This article is meant to help orient you with some of the most common security measures to take in a Linux server environment. This is not an exhaustive list, and does not cover recommended configurations, but it will provide links to more thorough resources and discuss why each component is an important part of many systems.

Blocking Access with Firewalls

One of the easiest steps to recommend to all users is to enable and configure a firewall. Firewalls act as a barrier between the general traffic of the internet and your machine. They look at traffic headed in and out of your server and decide whether the information should be delivered.

They do this by checking the traffic in question against a set of rules that are configured by the user. Usually, a server will only be using a few specific networking ports for legitimate services. The rest of the ports are unused, and should be safely protected behind a firewall, which will deny all traffic destined for these locations.

This allows you to drop data that you are not expecting and even conditionalize the usage of your real services in some cases. Sane firewall rules provide a good foundation for network security.

There are quite a few firewall solutions available. We’ll briefly discuss some of the more popular options below.

UFW

UFW stands for Uncomplicated Firewall. Its goal is to provide good protection without the complicated syntax of other solutions.

UFW, as well as most Linux firewalls, is actually a front-end to control the netfilter firewall included with the Linux kernel. This is usually a simple firewall to use for people not already familiar with Linux firewall solutions and is generally a good choice.
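
A minimal UFW policy for a typical web server might look like the following sketch (be sure to allow SSH before enabling the firewall, or you can lock yourself out):

sudo ufw default deny incoming   # drop everything that is not explicitly allowed
sudo ufw allow ssh               # keep remote administration open
sudo ufw allow 80/tcp            # allow web traffic
sudo ufw enable                  # turn the firewall on
sudo ufw status                  # review the active rules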

You can learn how to enable and configure the UFW firewall and find out more by clicking this link.

IPTables

Perhaps the most well-known Linux firewall solution is iptables. IPTables is another component used to administer the netfilter firewall included in the Linux kernel. It has been around for a long time and has undergone intense security audits to ensure its safety. There is a version of iptables called ip6tables for creating IPv6 restrictions.

You will likely come across iptables configurations during your time administering Linux machines. The syntax can be complicated to grasp at first, but it is an incredibly powerful tool that can be configured with very flexible rule sets.
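
As a small taste of the syntax, the rules below sketch a basic restrictive policy (rule order matters, and these rules are not persistent across reboots without an extra step):

sudo iptables -A INPUT -i lo -j ACCEPT                                        # trust local loopback traffic
sudo iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT   # allow replies to outbound connections
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT                            # allow SSH
sudo iptables -P INPUT DROP                                                   # drop everything else by default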

You can learn more about how to implement some iptables firewall rules on Ubuntu or Debian systems here, or learn how to use iptables on CentOS/Fedora/RHEL-based distros here.

IP6Tables

As mentioned above, iptables is used to manipulate the tables that contain IPv4 rules. If you have IPv6 enabled on your server, you will also need to pay attention to the IPv6 equivalent: ip6tables.

The netfilter firewall that is included in the Linux kernel keeps IPv4 and IPv6 traffic completely separate. These are stored in different tables. The rules that dictate the ultimate fate of a packet are determined by the protocol version that is being used.

What this means for the server’s administrator is that a separate ruleset must be maintained when IPv6 is enabled. The ip6tables command shares the same syntax as the iptables command, so implementing the same set of restrictions in the IPv6 tables is usually straightforward. For this to work correctly, however, you must be sure to match traffic directed at your IPv6 addresses.
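
The mirrored rules are nearly identical; one IPv6-specific caution is that ICMPv6 must remain open for the protocol to function. A sketch:

sudo ip6tables -A INPUT -i lo -j ACCEPT
sudo ip6tables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
sudo ip6tables -A INPUT -p ipv6-icmp -j ACCEPT   # IPv6 relies on ICMPv6 for neighbor discovery
sudo ip6tables -A INPUT -p tcp --dport 22 -j ACCEPT
sudo ip6tables -P INPUT DROP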

NFTables

Although iptables has long been the standard for firewalls in a Linux environment, a new firewall called nftables has recently been added into the Linux kernel. This is a project by the same team that makes iptables, and is intended to eventually replace iptables.

The nftables firewall attempts to implement more readable syntax than that found in its iptables predecessor, and implements IPv4 and IPv6 support in the same tool. While most versions of Linux at this time do not ship with a kernel new enough to implement nftables, it will soon be very commonplace, and you should try to familiarize yourself with its usage.
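
For comparison, the equivalent basic policy in nftables handles IPv4 and IPv6 in a single inet table (a sketch; syntax can differ slightly between nft versions):

sudo nft add table inet filter
sudo nft add chain inet filter input '{ type filter hook input priority 0; policy drop; }'
sudo nft add rule inet filter input iif lo accept
sudo nft add rule inet filter input ct state established,related accept
sudo nft add rule inet filter input tcp dport 22 accept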

Using SSH to Securely Login Remotely

When administering a server where you do not have local access, you will need to log in remotely. The standard, secure way of accomplishing this on a Linux system is through a protocol called SSH, which stands for “secure shell”.

SSH provides end-to-end encryption, the ability to tunnel insecure traffic over a secure connection, X-forwarding (graphical user interface over a network connection), and much more. Basically, if you do not have access to a local connection or out-of-band management, SSH should be your primary way of interacting with your machine.

While the protocol itself is very secure and has undergone extensive research and code review, your configuration choices can either aid or hinder the security of the service. We will discuss some options below.

Password vs SSH-Key Logins

SSH has a flexible authentication model that allows you to sign in using a number of different methods. The two most popular choices are password and SSH-key authentication.

While password authentication is probably the most natural model for most users, it is also the less secure of these two choices. Password logins allow a potential intruder to continuously guess passwords until a successful combination is found. This is known as brute-forcing and can easily be automated by would-be attackers with modern tools.

SSH-keys, on the other hand, operate by generating a secure key pair. A public key is created as a type of test to identify a user. It can be shared publicly without issues, and cannot be used for anything other than identifying a user and allowing a login to the user with the matching private key. The private key should be kept secret and is used to pass the test of its associated public key.

Basically, you can add your public SSH key on a server, and it will allow you to log in by using the matching private key. These keys are so complex that brute-forcing is not practical. Furthermore, you can optionally add a long passphrase to your key, which adds even more security.
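
Generating and installing a key pair takes three commands on most systems (replace user@your_server_ip with your own login details):

ssh-keygen -t rsa -b 4096         # create the key pair; enter a passphrase when prompted for extra security
ssh-copy-id user@your_server_ip   # append the public key to the server's authorized_keys file
ssh user@your_server_ip           # subsequent logins now use the key instead of a password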

To learn more about how to use SSH click here, and check out this link to learn how to set up SSH keys on your server.

Implement fail2ban to Ban Malicious IP Addresses

One step that will help with the general security of your SSH configuration is to implement a solution like fail2ban. Fail2ban is a service that monitors log files in order to determine if a remote system is likely not a legitimate user, and then temporarily bans future traffic from the associated IP address.

Setting up a sane fail2ban policy allows you to flag computers that are continuously trying to log in unsuccessfully and add firewall rules to drop their traffic for a set period of time. This is an easy way of hindering commonly used brute-force methods, because attackers will have to take a break for quite a while when banned. This is usually enough to discourage further brute-force attempts.
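
A minimal sketch of getting fail2ban watching SSH on a Debian-based system (the jail name and defaults vary by version, so treat the values as examples):

sudo apt-get install fail2ban
sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local   # the local file overrides packaged defaults
# in jail.local, under the SSH jail, set for example: enabled = true, maxretry = 5, bantime = 600
sudo service fail2ban restart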

You can learn how to implement a fail2ban policy on Ubuntu here. There are similar guides for Debian and CentOS here.

Implement an Intrusion Detection System to Detect Unauthorized Entry

One important consideration to keep in mind is developing a strategy for detecting unauthorized usage. You may have preventative measures in place, but you also need to know if they’ve failed or not.

An intrusion detection system, also known as an IDS, catalogs configuration and file details when in a known-good state. It then runs comparisons against these recorded states to find out if files have been changed or settings have been modified.

There are quite a few intrusion detection systems. We’ll go over a few below.

Tripwire

One of the most well-known IDS implementations is tripwire. Tripwire compiles a database of system files and protects its configuration files and binaries with a set of keys. After configuration details are chosen and exceptions are defined, subsequent runs notify you of any alterations to the files that it monitors.

The policy model is very flexible, allowing you to shape its properties to your environment. You can then configure tripwire runs via a cron job and even implement email notifications in the event of unusual activity.
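
Once the policy and keys are in place, day-to-day use boils down to two commands (a sketch):

sudo tripwire --init    # build the baseline database from the current, known-good state
sudo tripwire --check   # compare the system against the baseline and report changes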

Learn more about how to implement tripwire here.

Aide

Another option for an IDS is Aide. Similar to tripwire, Aide operates by building a database and comparing the current system state to the known-good values it has stored. When a discrepancy arises, it can notify the administrator of the problem.
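
Aide follows the same baseline-then-compare pattern. A sketch (the database path varies by distribution):

sudo aide --init                                          # build the baseline database
sudo mv /var/lib/aide/aide.db.new /var/lib/aide/aide.db   # activate the baseline; adjust paths for your distro
sudo aide --check                                         # report any files that have changed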

Aide and tripwire both offer similar solutions to the same problem. Check out the documentation and try out both solutions to find out which you like better.

For a guide on how to use Aide as an IDS, check here.

Psad

The psad tool is concerned with a different portion of the system than the tools listed above. Instead of monitoring system files, psad keeps an eye on the firewall logs to try to detect malicious activity.

If a user is trying to probe for vulnerabilities with a port scan, for instance, psad can detect this activity and dynamically alter the firewall rules to lock out the offending user. This tool can register different threat levels and base its response on the severity of the problem. It can also optionally email the administrator.

To learn how to use psad as a network IDS, follow this link.

Bro

Another option for a network-based IDS is Bro. Bro is actually a network monitoring framework that can be used as a network IDS or for other purposes like collecting usage stats, investigating problems, or detecting patterns.

The Bro system is divided into two layers. The first layer monitors activity and generates what it considers events. The second layer runs the generated events through a policy framework that dictates what should be done, if anything, with the traffic. It can generate alerts, execute system commands, simply log the occurrence, or take other paths.

To find out how to use Bro as an IDS, click here.

RKHunter

While not technically an intrusion detection system, rkhunter operates on many of the same principles as host-based intrusion detection systems in order to detect rootkits and known malware.

While viruses are rare in the Linux world, malware and rootkits do exist that can compromise your box or allow continued access to a successful exploiter. RKHunter downloads a list of known exploits and then checks your system against the database. It also alerts you if it detects unsafe settings in some common applications.
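
Typical usage is a short cycle of updating the data files, recording a baseline, and scanning (a sketch):

sudo rkhunter --update    # refresh the known-rootkit data files
sudo rkhunter --propupd   # record current file properties as the baseline
sudo rkhunter --check     # scan the system and flag anything suspicious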

You can check out this article to learn how to use RKHunter on Ubuntu.

General Security Advice

While the above tools and configurations can help you secure portions of your system, good security does not come from just implementing a tool and forgetting about it. Good security manifests itself in a certain mindset and is achieved through diligence, scrutiny, and engaging in security as a process.

There are some general rules that can help set you in the right direction with regard to using your system securely.

Pay Attention to Updates and Update Regularly

Software vulnerabilities are found all of the time in just about every kind of software that you might have on your system. Distribution maintainers generally do a good job of keeping up with the latest security patches and pushing those updates into their repositories.

However, having security updates available in the repository does your server no good if you have not downloaded and installed the updates. Although many servers benefit from relying on stable, well-tested versions of system software, security patches should not be put off and should be considered critical updates.

Most distributions provide security mailing lists and separate security repositories so that you can download and install only security patches.
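
Staying current is usually a single command, depending on your distribution’s package manager:

sudo apt-get update && sudo apt-get upgrade   # Debian and Ubuntu
sudo yum update                               # CentOS, Fedora, and RHEL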

Take Care When Downloading Software Outside of Official Channels

Most users will stick with the software available from the official repositories for their distribution, and most distributions offer signed packages. Users generally can trust the distribution maintainers and focus their concern on the security of software acquired outside of official channels.

You may choose to trust packages from your distribution or software that is available from a project’s official website, but be aware that unless you are auditing each piece of software yourself, there is risk involved. Most users feel that this is an acceptable level of risk.

On the other hand, software acquired from random repositories and PPAs that are maintained by people or organizations that you don’t recognize can be a huge security risk. There are no set rules, and the majority of unofficial software sources will likely be completely safe, but be aware that you are taking a risk whenever you trust another party.

Make sure you can explain to yourself why you trust the source. If you cannot do this, consider weighing your security risk as more of a concern than the convenience you’ll gain.

Know your Services and Limit Them

Although the entire point of running a server is likely to provide services that you can access, limit the services running on your machine to those that you use and need. Consider every enabled service to be a possible threat vector and try to eliminate as many threat vectors as you can without affecting your core functionality.

This means that if you are running a headless (no monitor attached) server and don’t run any graphical (non-web) programs, you should disable and probably uninstall your X display server. Similar measures can be taken in other areas. No printer? Disable the “lp” service. No Windows network shares? Disable the “samba” service.
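
On modern systemd-based distributions, listing and disabling services looks like the sketch below (service names such as cups and smbd are examples and differ between distributions):

systemctl list-unit-files --type=service --state=enabled   # see which services start at boot
sudo systemctl disable --now cups                          # no printer: stop and disable the print service
sudo systemctl disable --now smbd                          # no Windows shares: stop and disable Samba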

You can discover which services you have running on your computer through a variety of means. This article covers how to detect enabled services under the “create a list of requirements” section.

Do Not Use FTP; Use SFTP Instead

This might be a hard one for many people to come to terms with, but FTP is a protocol that is inherently insecure. All authentication is sent in plain-text, meaning that anyone monitoring the connection between your server and your local machine can see your login details.

There are only a few instances where FTP is okay to implement. If you are running an anonymous, public, read-only download mirror, FTP is a decent choice. Another case where FTP is an okay choice is when you are simply transferring files between two computers that are behind a NAT-enabled firewall and you trust that your network is secure.

In almost all other cases, you should use a more secure alternative. The SSH suite comes complete with an alternative protocol called SFTP that operates on the surface in a similar way, but is built on the same security as the SSH protocol.

This allows you to transfer information to and from your server in the same way that you would traditionally use FTP, but without the risk. Most modern FTP clients can also communicate with SFTP servers.
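
An SFTP session feels much like classic FTP (replace user@your_server_ip with your own details):

sftp user@your_server_ip
# at the sftp> prompt:
put localfile.txt    # upload a file to the server
get remotefile.txt   # download a file from the server
exit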

To learn how to use SFTP to transfer files securely, check out this guide.

Implement Sensible User Security Policies

There are a number of steps that you can take to better secure your system when administering users.

One suggestion is to disable root logins. Since the root user is present on any POSIX-like system and is an all-powerful account, it is an attractive target for many attackers. Disabling root logins is often a good idea after you have configured sudo access, or if you are comfortable using the su command. Many people disagree with this suggestion, but examine whether it is right for you.

It is possible to disable remote root logins within the SSH daemon; to disable local root logins, you can make restrictions in the /etc/securetty file. You can also set the root user’s shell to a non-shell to disable root shell access, and set up PAM rules to restrict root logins as well. RedHat has a great article on how to disable root logins.
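
Disabling remote root logins over SSH is a one-line configuration change (confirm that sudo or su access works first so you do not lock yourself out):

# in /etc/ssh/sshd_config, set:
#     PermitRootLogin no
sudo service ssh restart   # on some distributions the service is named sshd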

Another good policy to implement with user accounts is creating unique accounts for each user and service, and giving them only the bare minimum permissions needed to get their job done. Lock down everything that they don’t need access to and take away all privileges short of crippling them.

This is an important policy because if one user or service gets compromised, it does not lead to a domino effect that allows the attacker to gain access to even more of the system. This system of compartmentalization helps you to isolate problems, much like a system of bulkheads and watertight doors can help prevent a ship from sinking when there is a hull breach.

In a similar vein to the services policies we discussed above, you should also take care to disable any user accounts that are no longer necessary. This may happen when you uninstall software, or if a user should no longer have access to the system.

Pay Attention to Permission Settings

File permissions are a huge source of frustration for many users. Finding a balance for permissions that allow you to do what you need to do while not exposing yourself to harm can be difficult and demands careful attention and thought in each scenario.

Setting up a sane umask policy (the property that defines default permissions for new files and directories) can go a long way in creating good defaults. You can learn about how permissions work and how to adjust your umask value here.
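
As a quick illustration of how the mask maps to defaults:

umask        # print the current mask; 0022 is a common default
umask 027    # new files default to 640 and new directories to 750 for this session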

In general, you should think twice before setting anything to be world-writable, especially if it is accessible in any way from the internet. This can have extreme consequences. Additionally, you should not set the SGID or SUID bits in permissions unless you absolutely know what you are doing. Also, check that your files have an owner and a group.

Your file permissions settings will vary greatly based on your specific usage, but you should always try to see if there is a way to get by with fewer permissions. This is one of the easiest things to get wrong and an area where there is a lot of bad advice floating around on the internet.

Regularly Check for Malware on your Servers

While Linux is generally less targeted by malware than Windows, it is by no means immune to malicious software. In conjunction with implementing an IDS to detect intrusion attempts, scanning for malware can help identify traces of activity that indicate that illegitimate software is installed on your machine.

There are a number of malware scanners available for Linux systems that can be used to regularly validate the integrity of your servers. Linux Malware Detect, also known as maldet or LMD, is one popular option that can be easily installed and configured to scan for known malware signatures. It can be run manually to perform one-off scans and can also be daemonized to run regularly scheduled scans. Reports from these scans can be emailed to the server administrators.
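
Day-to-day usage of maldet is brief. A sketch (the scan path is an example):

sudo maldet -u         # refresh the malware signature database
sudo maldet -a /home   # scan everything under /home; maldet prints a report ID when it finishes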

How To Secure the Specific Software you are Using

Although this guide is not large enough to go through the specifics of securing every kind of service or application, there are many tutorials and guidelines available online. You should read the security recommendations of every project that you intend to implement on your system.

Furthermore, popular server software like web servers or database management systems have entire websites and databases devoted to security. In general, you should read up on and secure every service before putting it online.

You can check our security section for more specific advice for the software you are using.

Conclusion

You should now have a decent understanding of general security practices you can implement on your Linux server. While we’ve tried hard to mention many areas of high importance, at the end of the day, you will have to make many decisions on your own. When you administer a server, you have to take responsibility for your server’s security.

This is not something that you can configure in one quick spree at the beginning; it is a process and an ongoing exercise in auditing your system, implementing solutions, evaluating logs and alerts, and reassessing your needs. You need to be vigilant in protecting your system and always be evaluating and monitoring the results of your solutions.

How To Set Up a Host Name with DigitalOcean

Setup

Before you get started, you need to have the following:

  • A Droplet (virtual private server) from DigitalOcean. If you don’t have one, you can register and set one up in under a minute
  • A Registered Domain Name. As of yet, you cannot register a domain through DigitalOcean.

Step One—Look Up Information with WHOIS

The first thing you need to do to set up your host name is to change your domain name server to point to the DigitalOcean name servers. You can do this through your domain registrar’s website. If you do not remember where you registered your name, you can look it up using “WHOIS”, a protocol that displays a site’s identifying information, such as the IP address and registration details.

Open up the command line and type:

whois example.com

WHOIS will display all of the details associated with the site, including the Technical Contact information, which identifies your domain registrar.

Step Two—Change Your Domain Server

Access the control panel of your domain registrar and find the fields called “Domain Name Server.” The forms for my domain registrar looked like this:

[screenshot: registrar Domain Name Server fields]

Point your name servers to DigitalOcean by filling in the three Domain Name Server fields. Once done, save your changes and exit.

The DigitalOcean name servers are:

  • ns1.digitalocean.com
  • ns2.digitalocean.com
  • ns3.digitalocean.com

You can verify that the new name servers are registered by running WHOIS again; the output should include the updated information:

Domain Name: EXAMPLE.COM
   Registrar: ENOM, INC.
   Whois Server: whois.enom.com
   Referral URL: http://www.enom.com
   Name Server: NS1.DIGITALOCEAN.COM
   Name Server: NS2.DIGITALOCEAN.COM
   Name Server: NS3.DIGITALOCEAN.COM
   Status: ok

Although the name servers are visible through WHOIS, it may take an hour or two for the changes to be reflected on your site.

Step Three—Configure your Domain

Now we need to move into the DigitalOcean control panel.

Within the Networking section, click on Add Domain, and fill in the domain name field and the IP address of the server you want to connect it to on the subsequent page. Note: The domain name does not have a www at the beginning.

[screenshot: Add Domain]

You will reach a page where you can enter all of your site details. To make a new hostname, you only need to fill in the A record. If you are using an IPv6 address, you should enter it into the AAAA record.

A Records: Use this space to enter the IP address of the droplet that you want to host your domain name on and the host name itself, a name prepended to your domain name. For example:

test.example.com

To accomplish this, create a new hostname with the word “test” in the hostname field. Your screen should look like this:

[screenshot: A record with the “test” hostname]

Save by clicking “Add new A record”.
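
After saving, you can check the new record from the command line once DNS has propagated (substitute your own hostname):

dig +short test.example.com   # prints the droplet's IP address once the record is live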

You can also connect your IP to a domain name with nothing before it (this should also occur by default when you add a domain):

http://example.com

To accomplish this, create a new hostname with the symbol “@” in the hostname field. Your screen should look like this:

[screenshot: A record with the “@” hostname]

You can save by pressing enter after making the required changes on the line.

AAAA Records: Use this space to enter the IPv6 address of the droplet that you want to host your domain name on and the host name itself, a name prepended to your domain name. As with A records, you can also connect your IPv6 address to the bare domain name by creating a new hostname with the symbol “@” in the hostname field. Your screen should look like this:

[screenshot: AAAA record fields]

Save by clicking “Create”

CNAME Records: The CNAME record works as an alias of the A record, pointing a subdomain to an A record; if the A record’s IP address changes, the CNAME will follow to the new address. To prepend www to your URL, click on “Add a new CNAME record” and fill out the two fields.

Your screen should look like this:

[screenshot: CNAME record fields]

You can also set up a catchall or wildcard CNAME record that will direct any subdomain to the designated A record (for example, if a visitor accidentally types in wwww instead of www). This can be accomplished with an asterisk in the CNAME name field.

Your screen should look like this:

[screenshot: wildcard CNAME record]

If you need to set up a mail server on your domain, you can do so in the MX Records.

MX Records: The MX Records fields should be filled in with the hostname and priority of your mail server; the priority is a value designating the order in which the mail servers should be tried. Records always end with a “.” A generic MX record looks something like this: mail1.example.com.

Below is an example of MX records set up for a domain that uses google mail servers (note the period at the end of each record):

[screenshot: Google MX records]

Finish Up

Once you have filled in all of the required fields, your information will take a while to propagate, and the Name Server information will be automatically filled in. Your domain name should be up and supported in a few hours.

You can confirm, after some time has passed, that the new host name has been registered by pinging it:

ping test.example.com

You should see something like:

# ping test.example.com
PING test.example.com (203.0.113.10) 56(84) bytes of data.
64 bytes from 203.0.113.10: icmp_req=1 ttl=63 time=1.47 ms
64 bytes from 203.0.113.10: icmp_req=2 ttl=63 time=0.674 ms

You should also be able to access the site in the browser.