3 great foldable drones, and what real users and critics say about them

In a nutshell: Foldable drones are a nice – and increasingly popular – variation of the genre, giving you a flying camera that’s easily portable.

Here are three great options you should consider – in the absence of the recalled GoPro Karma.

Zerotech Dobby

  • Small enough to use indoors
  • Top speed: 36 kph
  • Flight time: 9 minutes
3 great foldable drones: meta-review

Photo credit: Zerotech.

With its four arms tucked away, the Zerotech Dobby fits in the palm of your hand. Weighing just 199g, it’s more serious than a toy chopper but not as substantial – or pricey – as DJI’s drones.

Despite its size, it still packs a 4K camera, stabilized 1080p HD video, and a few other goodies like facial recognition, so it can follow you around and serve as a selfie drone.

What reviews say

Quadcopter Flyers:

As is evident by now, Dobby is a selfie drone: it does not come with any kind of remote controller/transmitter and is operated purely with the app. The control distance is 100 meters in an open area with no obstacles; when the drone goes beyond the operating range, it auto-returns by itself.


The front of the Dobby Drone has a single camera that can be set into one of four positions before flying. The compact design means a motorized camera [on a gimbal] isn’t viable, but the small size of the drone means you can get in tighter and closer than on a larger UAV.


Sure, Parrot’s family of MiniDrones start at around $150, but the cameras on those are almost unusable. Video quality on the Dobby isn’t exactly amazing either, but it’s far from terrible. I’d say it’s a step or two down from the quality you’d get on a flagship smartphone these days, but more than adequate for social media. However, I found images were occasionally a bit soft in terms of focus. Depending on the light that you’re shooting in, you might see exposure change mid-flight, which can be a bit jarring.

As fun as Dobby is to play with, it’s let down by a comparatively short battery life; each battery will only give you about nine minutes per charge. This is cut shorter due to the fact the drone automatically lands itself when you’ve got about 15 percent charge left, presumably to prevent it from suddenly falling out of the sky. In some cases, we only got about five minutes of flight time before we had to stop flying and swap battery.

What users say

With 15 reviews on Zerotech’s Amazon page, this drone now has an average rating of 4.5 out of 5.

D. Davis gives it a full score, saying:

Perfect, stable, and convenient. I’m really pleased with this drone. It feels like everything has been thought out. The GPS and other ways that the drone stays locked in place are very good. Best of all, it fits in my pocket easily and I can use my cellphone (always on me anyway) to control it.

An unnamed customer adds:

Done a lot of research and compared it with Yuneec, Hover, and other small camera drones, but I finally decided to buy Dobby. (Mavic is great but it is kind of for professional users.) Pocket-sized and GPS positioning for outdoor flying are the two things attract me most.

It’s really good for family aerial selfies. The auto pull away short video is kind of magic especially on sunny days.

Evan G, meanwhile, laments the sizable price tag:

$400 for a drone this small? Tsk tsk. I was hoping to see this maybe in the $100 to $150 range, but $400? That’s the price of a brand new PS4 on release day.

Yuneec Breeze

  • Indoor positioning system helps it fly inside where GPS might be limited
  • Comes with two batteries
  • Flight time: 12 minutes

Photo credit: Yuneec.

Yuneec is making a name for itself with a range of drones that perform well but cost less than half of DJI’s.

The Yuneec Breeze, much bigger than the Zerotech Dobby, weighs around 400g and isn’t as foldable – only the propellers tuck in. So you won’t be pocketing this thing. But it does boast 4K video.

Despite the size difference, its purpose is the same as the Dobby, and many of the software features are similar.

What reviews say


The mobile app, available for iOS and Android, is split into two sections: Tasks and Gallery. Tap on Tasks and you’re given five options to choose from: Pilot, Selfie, Orbit, Journey, and Follow Me. Pilot has the manual controls for flying around the way any other drone would with a regular controller.

Digital Trends:

After crashing it a half dozen times, we were pleased to discover that it’s actually pretty damn durable. We bashed it (inadvertently) into all manner of obstacles — bushes, branches, tree trunks, and even the side of a car — but in none of those instances did the drone become so damaged that it couldn’t start right up again.

As far as we can tell, the key to the Breeze’s resilience is its clever hinged props. It appears that Yuneec outfitted the drone with this feature so that the props could be folded inward for easy transport, but based on our observations, it seems that they also help the drone recover from run-ins with tree branches quite effectively. Instead of breaking on contact, they pivot backward ever so slightly, which seems to prevent them from snapping.


I did find a tendency for it to drift around disconcertingly on occasion, forcing me at one point to swiftly hit the land button before a nearby rosebush got an unscheduled trimming. And, as with most drones, GPS means you can simply tap a “return-to-home” button when you want to bring the Breeze automatically back to its take-off point.

What users say

With 59 reviews, this foldable drone scores 3.9 out of 5.

JimFeet says that gusty winds approaching 10mph [16 kph] can cause issues, but praises the Breeze.

I’ve found the Yuneec Breeze to be an excellent product in nearly all regards. While it may not be the most advanced, fastest or highest resolution camera, it flies well, is stable and easy to control with an iOS device. As a heavy Canon user with a 51 mpx DSLR, in my opinion the photos are of excellent quality. I have played with the video and found it to be excellent as well but video is not my forte so I may not be the best judge. The Breeze folds up and fits into its hard plastic case, making it easy to transport even in a hiking day pack.

Shannon S finds it to be a “good product with some noticeable benefits and drawbacks.”

The video camera is not great in low light. Without a gimbal there’s just too much blur. There is slight blur when panning on video even in daylight. This thing is surprisingly stable in the air, so one thing you obviously want to do is send it up high and take a panoramic video of your surroundings.

Creativety found the indoor flying an eye-opener:

I can easily use it indoors, which, initially scared me, but sure enough it was much easier than I imagined.

DJI Mavic Pro

  • Stealthy in black
  • Top speed: 64 kph
  • Flight time: 27 minutes

Photo credit: DJI.

As with the MacBook Pro, the “pro” in the name means this is serious business – and it entails an astonishing price tag.

Still, it’s essentially as easy to fly as the others on this list.

It’s no flyweight at 720g, but it’s nimbler than DJI’s other offerings – and remember it’s foldable. The range on this thing is amazing – up to 7km, if you dare go that far with the battery life.

What reviews say


Assuming you’re using the controller and not just the smartphone, you’ll do most of your flying with the control sticks, while you’ll manipulate the drone’s more advanced settings through your phone. The Mavic Pro flies smoothly and is pleasantly easy to maneuver. When you take your fingers off the sticks, it hovers steadily in place (we tested it on a day with almost no wind; it’s unclear how the Mavic might do on a blustery day). That’s especially helpful for lining up precise shots.

The Verge:

The camera and gimbal are very similar to what you find on the Phantom, only smaller. The camera uses the same sensor, shooting 4K video and 12 megapixel stills. The only difference is that the Mavic Pro doesn’t have as wide a field of view as the Phantom. The Mavic Pro does have the same forward-facing optical sensors as the Phantom 4, though, allowing it to detect obstacles and autonomously avoid crashes.


While it might not have the power to cut through really strong winds (DJI says it can handle winds up to 19-24 mph or 29-38 kph), it was able to keep the camera stable and fly steady in 10-15 mph winds and still get between 22-25 minutes of flight time before it landed itself. It does warn you when the winds are too strong for its motors, too.

What users say

The newest of this trio, it scores 4.1 out of 5 from 13 reviews.

Squatch LOVES Milo enthuses that it “does everything the Phantom 4 does but in a much smaller package.”

I pulled the Mavic out of the box and immediately realized how little this thing is compared to my Phantoms. Its size is definitely going to be the draw for most people (it’s small enough to fit in a shoulder camera bag, making it way more portable and comfortable to tote around than my Phantom 4). The four arms are all folded up into the body, making it about the size of a water bottle (slightly bigger) in its portable state.

Good Amazon Customer, an experienced RC hobbyist and drone pilot, is a convert.

I took a couple flights in high winds – 18 mph [29 kph] – and it worked perfectly. The camera is tiny, but of very high quality, probably equal to or better than most drones on the market (I think the Phantom 4 cam may be slightly better though).

That buyer adds some advice for newbies:

Word of warning though, these are not for beginners. No expensive camera drone is. Do your homework and spend time learning on smaller drones. Maybe buy a lower-priced camera drone and learn the ropes. Then get a Mavic Pro.

A Database of Words

If you write code that deals with natural language, then at some point, you will need to use data from a dictionary. You have to make a choice at this point.

  • You can either choose one of the big names (e.g. Oxford, Merriam-Webster, Macmillan) and use their API to get the data
  • Or you can choose WordNet.

I have tried both and find WordNet to be the best tool for the job.

For those who don’t know, WordNet is a machine-readable database of words that can be accessed from most popular programming languages (C, C#, Java, Ruby, Python etc.). I have several reasons for preferring WordNet over the other options.

  • Many of the big company APIs require payment. WordNet is free.
  • Many of the big company APIs are online only. WordNet can be downloaded and used offline.
  • WordNet is many times more powerful than any other dictionary or thesaurus out there.

The last point requires some explanation.

WordNet is not like your everyday dictionary. While a traditional dictionary features a list of words and their definitions, WordNet focuses on the relationship between words (in addition to definitions). The focus on relationships makes WordNet a network instead of a list. You might have guessed this already from the name WordNet.

WordNet is a network of words!

In the WordNet network, the words are connected by linguistic relations. These linguistic relations (hypernym, hyponym, meronym, pertainym and other fancy sounding stuff), are WordNet’s secret sauce. They give you powerful capabilities that are missing in ordinary dictionaries/thesauri.

We will not go deep into linguistics in this article because that is beside the point. But I do want to show you what you can achieve in your code using WordNet. So let’s look at the two most common use cases (which any dictionary or thesaurus should be able to handle) and some advanced use cases (which only WordNet can handle), with example code.

Common use cases

Word lookup

Let’s start with the simplest use case, i.e., word lookups. We can look up the meaning of any word in WordNet in three lines of code (examples are in Python).

### checking the definition of the word "hacker"
# import the NLTK wordnet interface
>>> from nltk.corpus import wordnet as wn
# lookup the word
>>> hacker = wn.synset("hacker.n.03")
>>> print(hacker.definition())
a programmer for whom computing is its own reward; 
may enjoy the challenge of breaking into other 
computers but does no harm

Synonym and Antonym lookup

WordNet can function as a thesaurus too, making it easy to find synonyms and antonyms. To get the synonyms of the word beloved, for instance, I can type the following line in Python…

>>> wn.synset("beloved.n.01").lemmas()
[Lemma('beloved.n.01.beloved'), Lemma('beloved.n.01.dear'),
Lemma('beloved.n.01.dearest'), Lemma('beloved.n.01.honey'),
Lemma('beloved.n.01.love')]

… and get the synonyms dear, dearest, honey and love, as expected. Antonyms can be obtained just as simply.

Advanced use cases

Cross Part of Speech lookup

WordNet can do things that dictionaries/thesauri can’t. For example, WordNet knows about cross Part of Speech relations. This kind of relation connects a noun (e.g. president) with its derived verb (preside), derived adjective (presidential) and derived adverb (presidentially). The following snippet displays this functionality of WordNet (using a WordNet based Python package called word_forms).

### Generate all possible forms of the word "president"
>>> from word_forms.word_forms import get_word_forms
>>> get_word_forms("president")
{'n': {'president', 'Presidents', 'President', 'presidentship',        # nouns
       'presidencies', 'presidency', 'presidentships', 'presidents'},
 'r': {'presidentially'},                                              # adverb
 'a': {'presidential'},                                                # adjective
 'v': {'presiding', 'presides', 'preside', 'presided'}}                # verbs

Being able to generate these relations is particularly useful for Natural Language Processing and for English learners.

Classification lookup

In addition to being a dictionary and thesaurus, WordNet is also a taxonomical classification system. For instance, WordNet classifies dog as a domestic animal, a domestic animal as an animal, and an animal as an organism. All words in WordNet have been similarly classified, in a way that reminds me of taxonomical classifications in biology.

The following snippet shows what happens if we follow this chain of relationships till the very end.

### follow hypernym relationship recursively till the end
# define a function that prints the next hypernym
# recursively till it reaches the end
>>> def get_parent_classes(synset):
…     while True:
…       try:
…         synset = synset.hypernyms()[-1]
…         print(synset)
…       except IndexError:
…         break 
# find the hypernyms of the word "dog"
>>> dog = wn.synset("dog.n.01")
>>> get_parent_classes(dog)
Synset('domestic_animal.n.01') # dog is a domestic animal
Synset('animal.n.01')          # a domestic animal is an animal
Synset('organism.n.01')        # an animal is an organism
Synset('living_thing.n.01')    # an organism is a living thing
Synset('whole.n.02')           # a living thing is a whole
Synset('object.n.01')          # a whole is an object
Synset('physical_entity.n.01') # an object is a physical entity
Synset('entity.n.01')          # a physical entity is an entity

To visualize the classification model, it is helpful to look at the following picture, which shows a small part of WordNet.

Image courtesy of the original WordNet paper.

Semantic word similarity

The classification model of WordNet has been used for many useful applications. One such application computes the similarity between two words based on the distance between them in the WordNet network. The smaller the distance, the more similar the words. In this way, it is possible to quantitatively figure out that a cat and a dog are similar, a phone and a computer are similar, but a cat and a phone are not similar!

### Checking similarity between the words "dog", "cat", "phone" and "computer"
>>> dog = wn.synset('dog.n.01')
>>> cat = wn.synset('cat.n.01')
>>> computer = wn.synset('computer.n.01')
>>> phone = wn.synset('phone.n.01')
>>> wn.path_similarity(dog, cat)          # a higher score indicates more similar
>>> wn.path_similarity(phone, computer)
>>> wn.path_similarity(phone, cat)        # a lower score indicates less similar

WordNet has comprehensive coverage of the English language: it currently contains 155,287 English words, while the complete Oxford English Dictionary has a comparable number of modern words (171,476). WordNet was last updated in 2011, so some contemporary English words like bromance or chillax seem to be missing, but this should not be a deal breaker for most of us.

If you want to know more about WordNet, the references in the original article are a good starting point.

This article is adapted from Dibya Chakravorty’s Broken Window Blog on Medium.

Two D Rescue: Save and Recover Data From Crashed Disks

If you are using Linux and need to recover data, whether after physical or logical damage, many tools are available for the job. To keep things simple, I will discuss only one of the data recovery tools available for Linux: GNU ddrescue.

GNU ddrescue is a program that copies data from one file or block device (hard disk, CD/DVD-ROM, etc.) to another. It is a data recovery tool: it helps you save data from a crashed partition. It tries to read each sector and, if a read fails, moves on to the next sector instead of aborting, which is where a tool like dd falls short. If the copying process is interrupted by the user, it can be resumed later at any position, and it can even copy backwards.

This program is useful for rescuing data in the presence of I/O errors, because it does not necessarily abort or truncate the output. This is why you should use this program and not the dd command. I have recovered a lot of data from many disks (CD/hard disk/software RAID) over the years using GNU ddrescue on Linux. I highly recommend this tool to Linux sysadmins.

Install ddrescue on a Debian/Ubuntu Linux

Type the following apt-get command to install ddrescue:
# apt-get install gddrescue
Sample outputs:

Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  gddrescue
0 upgraded, 1 newly installed, 0 to remove and 3 not upgraded.
Need to get 49.6 kB of archives.
After this operation, 152 kB of additional disk space will be used.
Get:1 http://mirrors.service.networklayer.com/ubuntu/ precise/universe gddrescue amd64 1.14-1 [49.6 kB]
Fetched 49.6 kB in 0s (1,952 kB/s)
Selecting previously unselected package gddrescue.
(Reading database ... 114969 files and directories currently installed.)
Unpacking gddrescue (from .../gddrescue_1.14-1_amd64.deb) ...
Processing triggers for install-info ...
Processing triggers for man-db ...
Setting up gddrescue (1.14-1) ...

Install ddrescue on a RHEL/Fedora/CentOS Linux

First turn on EPEL repo on a RHEL/CentOS/Fedora Linux. Type the following yum command:
# yum install ddrescue
Sample outputs:

Loaded plugins: product-id, rhnplugin, security, subscription-manager,
              : versionlock
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
This system is receiving updates from RHN Classic or RHN Satellite.
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package ddrescue.x86_64 0:1.16-1.el6 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
 Package            Arch             Version               Repository      Size
 ddrescue           x86_64           1.16-1.el6            epel            81 k
Transaction Summary
Install       1 Package(s)
Total download size: 81 k
Installed size: 189 k
Is this ok [y/N]: y
Downloading Packages:
ddrescue-1.16-1.el6.x86_64.rpm                           |  81 kB     00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : ddrescue-1.16-1.el6.x86_64                                   1/1
  Verifying  : ddrescue-1.16-1.el6.x86_64                                   1/1
Installed:
  ddrescue.x86_64 0:1.16-1.el6

Complete!

You can directly download ddrescue from the official GNU project web site and compile it on Linux or Unix-like systems.

A note about using ddrescue safely

  1. You need to use a logfile to resume a rescue.
  2. Never ever run ddrescue on a read/write mounted partition.
  3. Do not try to repair a file system on a drive with I/O errors.
  4. Be careful about destination partition/device, any data stored there will be overwritten.
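To double-check rule 2 before starting, you can ask the kernel whether the source device is currently mounted; the device name below is a hypothetical example:

```shell
# /dev/sda1 is a hypothetical example device - substitute your own
if grep -q '^/dev/sda1 ' /proc/mounts; then
    echo "/dev/sda1 is mounted - unmount it before rescuing"
else
    echo "/dev/sda1 is not mounted - safe to rescue"
fi
```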

How do I use ddrescue command?

In this example, we rescue /dev/sda to /dev/sdb:

     ## No need to partition /dev/sdb beforehand, but if the partition table on /dev/sda ##
     ## is damaged, you will need to recreate it somehow on /dev/sdb. ##
     ddrescue -f -n /dev/sda /dev/sdb logfile
     ddrescue -d -f -r3 /dev/sda /dev/sdb logfile
     ## get list of partitions on a /dev/sdb ##
     fdisk /dev/sdb
     ## check for errors ##
     fsck -v -f /dev/sdb1
     fsck -v -f /dev/sdb2

Understanding ddrescue command options

  • -f : Overwrite output device or partition.
  • -n : Do not try to split or retry failed blocks.
  • -d : Use direct disc access for input file.
  • -r3 : Exit after three (3) retries (use -1 for infinite retries).
  • -b2048 : Sector size of input device [default is set to 512].

Example: Rescue a partition /dev/sda2 to /dev/sdb2 in Linux

 ## You need to create the sdb2 partition with fdisk first. sdb2 should be of appropriate type and size ##
     ddrescue -f -n /dev/sda2 /dev/sdb2 logfile
     ddrescue -d -f -r3 /dev/sda2 /dev/sdb2 logfile
     e2fsck -v -f /dev/sdb2
     mount -o ro /dev/sdb2 /mnt
## Read rescued files from /mnt ##
     cd /mnt
     ls -l
## Copy files using rsync ## 
     rsync -avr . vivek@server1.cyberciti.biz:/data/rescued/wks01

Example: Rescue/recover a DVD-ROM in /dev/dvdrom on Linux

The syntax is:

     ddrescue -n -b2048 /dev/dvdrom dvd-image logfile
     ddrescue -d -b2048 /dev/dvdrom dvd-image logfile

Please note that if there are no errors (errsize is zero), dvd-image now contains a complete image of the DVD-ROM and you can write it to a blank DVD on a Linux based system:
# growisofs -Z /dev/dvdrom=/path/to/dvd-image

Example: Resume failed rescue

In this example, while rescuing the whole drive /dev/sda to /dev/sdb, /dev/sda freezes up at troubled sector # 7575757542:

 ## /dev/sda freezes here ##
 ddrescue -f /dev/sda /dev/sdb logfile
 ## So restart /dev/sda or reboot the server ##
 ## Restart copy at a safe distance from the troubled sector # 7575757542 ##
 ddrescue -f -i 7575757542 /dev/sda /dev/sdb logfile
 ## Copy backwards down to the troubled sector # 7575757542 ##
 ddrescue -f -R /dev/sda /dev/sdb logfile

A note about the dd_rescue command and syntax

On Debian/Ubuntu and a few other distros you may end up installing another utility called dd_rescue. dd_rescue is also a program that copies data from one file or block device to another; it is a tool to help you save data from a crashed partition.

Examples: dd_rescue

To make an exact copy of /dev/sda (damaged) to /dev/sdb (make sure sdb is empty), you need to type the following command:
# dd_rescue /dev/sda /dev/sdb
Naturally, the next step is to run fsck on the /dev/sdb partition to recover/save data. Remember: do not touch the originally damaged /dev/sda. If this procedure fails, you can send your disk to a professional data recovery service. For example, if /home (user data) is on /dev/sda2, you need to run the command on /dev/sdb2:
# fsck /dev/sdb2

Once fsck has run, mount /dev/sdb2 somewhere and see if you can access the data:
# mount /dev/sdb2 /mnt/data
Finally, take a backup using tar or any other command of your choice. The dd_rescue command supports tons of options; read the man page for more information:
# man dd_rescue
OR see gnu/ddrescue command man page:
# man ddrescue

GitLab CI: Deployment & Environments

This post is a success story of one imaginary news portal, and you’re the happy owner, the editor, and the only developer. Luckily, you already host your project code on GitLab.com and know that you can run tests with GitLab CI. Now you’re curious whether it can be used for deployment, and how far you can go with it.

To keep our story technology stack-agnostic, let’s assume that the app is just a set of HTML files. No server-side code, no fancy JS assets compilation.

The destination platform is also simplistic – we will use Amazon S3.

The goal of the article is not to give you a bunch of copypasteable snippets. The goal is to show principles and features of GitLab CI so that you could easily apply them to your technology stack.

Let’s start from the beginning: no CI in our story yet.

A Starting Point

Deployment: in your case, it means that a bunch of HTML files should appear on your S3 bucket (which is already configured for static website hosting).

There are a million ways to do it. We’ll use the awscli library, provided by Amazon.

The full command looks like this:

aws s3 cp ./ s3://yourbucket/ --recursive --exclude "*" --include "*.html"

Manual deployment: pushing code to the repository and deploying are separate processes.

Important detail: the command expects you to provide the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. Also, you might need to specify AWS_DEFAULT_REGION.
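Running the command locally, you would export those variables first. The values below are AWS’s published documentation example keys, not real credentials:

```shell
# AWS's documented example credentials - replace with your own from the AWS Console
export AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
export AWS_DEFAULT_REGION="us-east-1"
```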

Let’s try to automate it using GitLab CI.

First Automated Deployment

With GitLab, there’s no difference in what commands you run. You can set up GitLab CI according to your needs as if it were your local terminal. As long as you can execute commands there, you can tell CI to do the same for you in GitLab. Put your script in .gitlab-ci.yml and push your code – that’s it: CI triggers a job and your commands are executed.

Let’s add some context to our story: our website is small, there are 20-30 daily visitors, and the code repository has only one branch: master.

Let’s start by specifying a job with the command from above in .gitlab-ci.yml:

deploy:
  script: aws s3 cp ./ s3://yourbucket/ --recursive --exclude "*" --include "*.html"

No luck: Failed command

It is our job to ensure that there is an aws executable. To install awscli we need pip, a tool for installing Python packages. Let’s specify a Docker image with Python preinstalled, which should contain pip as well:

deploy:
  image: python:latest
  script:
  - pip install awscli
  - aws s3 cp ./ s3://yourbucket/ --recursive --exclude "*" --include "*.html"

Automated deployment: you push your code to GitLab, and it is automatically deployed by CI.

The installation of awscli extends the job execution time, but it is not a big deal for now. If you need to speed up the process, you can always look for a Docker image with preinstalled awscli, or create an image by yourself.

Also, let’s not forget about these environment variables, which you’ve just grabbed from AWS Console:

variables:
  AWS_ACCESS_KEY_ID: "<your access key id>"
  AWS_SECRET_ACCESS_KEY: "<your secret access key>"

deploy:
  image: python:latest
  script:
  - pip install awscli
  - aws s3 cp ./ s3://yourbucket/ --recursive --exclude "*" --include "*.html"

It should work; however, keeping secret keys in the open, even in a private repository, is not a good idea. Let’s see how to deal with it.

Keeping Secret Things Secret

GitLab has a special place for secret variables: Settings > Variables

Picture of Variables page

Whatever you put there will be turned into environment variables. Only an administrator of a project has access to this section.

We could now remove the variables section from our CI configuration. However, let’s use it for another purpose.

Specifying and Using Non-secret Variables

When your configuration gets bigger, it is convenient to keep some of the parameters as variables at the beginning of your configuration. Especially if you use them in multiple places. Although it is not the case in our situation yet, let’s set the S3 bucket name as a variable, for demonstration purposes:

variables:
  S3_BUCKET_NAME: "yourbucket"

deploy:
  image: python:latest
  script:
  - pip install awscli
  - aws s3 cp ./ s3://$S3_BUCKET_NAME/ --recursive --exclude "*" --include "*.html"

So far so good:

Successful build

Because the audience of your website grew, you’ve hired a developer to help you. Now you have a team. Let’s see how teamwork changes the workflow.

Dealing with Teamwork

Now there are two of you working in the same repository. It is no longer convenient to use the master branch for development. You decide to use separate branches for both new features and new articles and merge them into master when they are ready.

The problem is that your current CI config doesn’t care about branches at all. Whenever you push anything to GitLab, it will be deployed to S3.

Preventing it is straightforward: just add only: master to your deploy job.

Automated deployment of the master branch: you don’t want to deploy every branch to the production website.

But it would also be nice to preview your changes from feature-branches somehow.

Setting Up a Separate Place for Testing

Patrick (the guy you recently hired) reminds you that there is such a thing called GitLab Pages. It looks like a perfect candidate for a place to preview your work in progress.

To host websites on GitLab Pages your CI configuration should satisfy these simple rules:

  • The job should be named pages
  • There should be an artifacts section with folder public in it
  • Everything you want to host should be in this public folder

The contents of the public folder will be hosted at http://<username>.gitlab.io/<projectname>/

After applying the example config for plain-html websites, the full CI configuration looks like this:

variables:
  S3_BUCKET_NAME: "yourbucket"

deploy:
  image: python:latest
  script:
  - pip install awscli
  - aws s3 cp ./ s3://$S3_BUCKET_NAME/ --recursive --exclude "*" --include "*.html"
  only:
  - master

pages:
  image: alpine:latest
  script:
  - mkdir -p ./public
  - cp ./*.html ./public/
  artifacts:
    paths:
    - public
  except:
  - master

We specified two jobs. One job deploys the website for your customers to S3 (deploy). The other one (pages) deploys the website to GitLab Pages. We can name them “Production environment” and “Staging environment”, respectively.

Deployment to two places: all branches, except master, will be deployed to GitLab Pages.

Introducing Environments

GitLab offers support for environments, and all you need to do is specify the corresponding environment for each deployment job:

variables:
  S3_BUCKET_NAME: "yourbucket"

deploy to production:
  environment: production
  image: python:latest
  script:
  - pip install awscli
  - aws s3 cp ./ s3://$S3_BUCKET_NAME/ --recursive --exclude "*" --include "*.html"
  only:
  - master

pages:
  image: alpine:latest
  environment: staging
  script:
  - mkdir -p ./public
  - cp ./*.html ./public/
  artifacts:
    paths:
    - public
  except:
  - master

GitLab keeps track of your deployments, so you always know what is currently being deployed on your servers:

List of environments

GitLab provides a full history of your deployments for every environment:

List of deployments to staging environment


Now, with everything automated and set up, we’re ready for the new challenges that are just around the corner.

Dealing with Teamwork, Part 2

It has just happened again. You’ve pushed your feature branch to preview it on Staging; a minute later, Patrick pushed his branch, so Staging was overwritten with his work. Aargh!! It was the third time today!

Idea! Let’s use Slack to notify the team of deployments, so that people won’t push their stuff right after someone else has deployed!

Slack Notifications

Setting up Slack notifications is a straightforward process.

The whole idea is to take the Incoming WebHook URL from Slack… Grabbing Incoming WebHook URL in Slack

…and put it into Settings > Services > Slack together with your Slack username: Configuring Slack Service in GitLab

Since the only thing you want to be notified of is deployments, you can uncheck all the checkboxes except “Build” in the settings above. That’s it. Now you’re notified of every deployment:

Deployment notifications in Slack

Teamwork at Scale

As time passed, your website became really popular and your team grew from 2 to 8 people. People develop in parallel, so waiting for someone else to finish previewing their work on staging became pretty common. “Deploy every branch to staging” stopped working.

Queue of branches for review on Staging

It's time to modify the process one more time. You and your team agreed that if someone wants to preview changes on the staging server, they should first merge those changes into the “staging” branch.

The change of .gitlab-ci.yml is minimal:

except:
- master

is now changed to

only:
- staging

Staging branch: people have to merge their feature branches before previewing on staging

Of course, it requires additional time and effort for merging, but everybody agreed that it is better than waiting.
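Putting the change together with the Pages job from earlier, the resulting job might look like this (a sketch; it assumes the job otherwise stays as before):

```yaml
pages:
  image: alpine:latest
  environment: staging
  script:
  - mkdir -p ./public
  - cp ./*.html ./public/
  artifacts:
    paths:
    - public
  only:
  - staging         # only the shared "staging" branch is deployed for preview now
```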

Handling Emergencies

You can’t control everything, so sometimes things go wrong. Someone merged branches incorrectly and pushed the result straight to production exactly when your site was on top of HackerNews. Thousands of people saw your completely broken layout instead of your shiny main page.

Luckily, someone found the Rollback button, so the website was fixed a minute after the problem was discovered.

List of environments: the Rollback button relaunches the previous job with the previous commit

Anyway, you felt that you needed to react to the problem and decided to turn off auto deployment to production and switch to manual deployment. To do that, you needed to add when: manual to your job.
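Applied to the production deployment job from earlier, a sketch of the result; the job will now sit in a skipped state until someone triggers it:

```yaml
deploy to production:
  environment: production
  image: python:latest
  when: manual      # do not run automatically; wait for a button press
  script:
  - pip install awscli
  - aws s3 cp ./ s3://$S3_BUCKET_NAME/ --recursive --exclude "*" --include "*.html"
  only:
  - master
```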

As you expected, there will be no automatic deployment to Production after that. To deploy manually go to Pipelines > Builds, and click the button:

Skipped job is available for manual launch

Finally, your company has turned into a corporation. You have hundreds of people working on the website, so all the previous compromises are not working anymore.

Review Apps

The next logical step is to boot up a temporary instance of the application per feature-branch for review.

In our case, we set up another bucket on S3 for that. The only difference is that we copy the contents of our website to a “folder” named after the development branch, so that the URL looks like this:


Here’s the replacement for the pages job we used before:

review apps:
  variables:
    S3_BUCKET_NAME: "reviewbucket"
  image: python:latest
  environment: review
  script:
  - pip install awscli
  - mkdir -p ./$CI_BUILD_REF_NAME
  - cp ./*.html ./$CI_BUILD_REF_NAME/
  - aws s3 cp ./ s3://$S3_BUCKET_NAME/ --recursive --exclude "*" --include "*.html"

The interesting thing is where we got this $CI_BUILD_REF_NAME variable from. GitLab predefines many environment variables so that you can use them in your jobs.

Note that we defined the S3_BUCKET_NAME variable inside the job. Job-level definitions like this override top-level ones.

Visual representation of this configuration: Review apps

The details of a Review Apps implementation depend heavily on your technology stack and deployment process, which is beyond the scope of this blog post.

It will not be as straightforward as it is with our static HTML website. For example, you would have to make these instances temporary, and booting them up with all the required software and services automatically, on the fly, is not a trivial task. However, it is doable, especially if you use Docker, or at least Chef or Ansible.

We'll cover deployment with Docker in another blog post. To be fair, I feel a bit guilty for reducing the deployment process to simply copying HTML files, without covering any hardcore scenarios. If you need one right now, I recommend reading the article “Building an Elixir Release into a Docker image using GitLab CI”.

For now, let’s talk about one final thing.

Deploying to Different Platforms

In real life, we are not limited to S3 and GitLab Pages. We host, and therefore deploy, our apps and packages to various services.

Moreover, at some point you might decide to move to a new platform and have to rewrite all your deployment scripts. You can use a gem called dpl to minimize the damage.

In the examples above we used awscli as a tool to deliver code to an example service (Amazon S3). However, no matter what tool and what destination system you use, the principle is the same: you run a command with some parameters and somehow pass a secret key for authentication purposes.

The dpl deployment tool utilizes this principle and provides a unified interface for this list of providers.

Here’s how a production deployment job would look if we use dpl:

variables:
  S3_BUCKET_NAME: "yourbucket"

deploy to production:
  environment: production
  image: ruby:latest
  script:
  - gem install dpl
  - dpl --provider=s3 --bucket=$S3_BUCKET_NAME
  only:
  - master

If you deploy to different systems or change destination platform frequently, consider using dpl to make your deployment scripts look uniform.

Summary

  1. Deployment is just a command (or a set of commands) that is regularly executed, so it can run inside GitLab CI.
  2. Most of the time you'll need to provide some secret key(s) to the command you execute. Store these secret keys in Settings > Variables.
  3. With GitLab CI, you can flexibly specify which branches to deploy.
  4. If you deploy to multiple environments, GitLab will keep a history of deployments, which lets you roll back to any previous version.
  5. For critical parts of your infrastructure, you can enable manual deployment from the GitLab interface instead of automated deployment.

An Introduction to Cloud Hosting


Cloud hosting is a method of using online virtual servers that can be created, modified, and destroyed on demand. Cloud servers are allocated resources like CPU cores and memory by the physical servers they're hosted on and can be configured with a developer's choice of operating system and accompanying software. Cloud hosting can be used for hosting websites, sending and storing emails, and distributing web-based applications and other services.

In this guide, we will go over some of the basic concepts involved in cloud hosting, including how virtualization works, the components in a virtual environment, and comparisons with other common hosting methods.

What is “the Cloud”?

“The Cloud” is a common term that refers to servers connected to the Internet that are available for public use, either through paid leasing or as part of a software or platform service. A cloud-based service can take many forms, including web hosting, file hosting and sharing, and software distribution. “The Cloud” can also be used to refer to cloud computing, which is the practice of using several servers linked together to share the workload of a task. Instead of running a complex process on a single powerful machine, cloud computing distributes the task across many smaller computers.

Other Hosting Methods

Cloud hosting is just one of many different types of hosting available to customers and developers today, though there are some key differences between them. Traditionally, sites and apps with low budgets and low traffic would use shared hosting, while more demanding workloads would be hosted on dedicated servers.

Shared hosting is the most common and most affordable way to get a small and simple site up and running. In this scenario, hundreds or thousands of sites share a common pool of server resources, like memory and CPU. Shared hosting tends to offer the most basic and inflexible feature and pricing structures, as access to the site’s underlying software is very limited due to the shared nature of the server.

Dedicated hosting is when a physical server machine is sold or leased to a single client. This is more flexible than shared hosting, as a developer has full control over the server’s hardware, operating system, and software configuration. Dedicated servers are common among more demanding applications, such as enterprise software and commercial services like social media, online games, and development platforms.

How Virtualization Works

Cloud hosting environments are broken down into two main parts: the virtual servers that apps and websites can be hosted on and the physical hosts that manage the virtual servers. This virtualization is what is behind the features and advantages of cloud hosting: the relationship between host and virtual server provides flexibility and scaling that are not available through other hosting methods.

Virtual Servers

The most common form of cloud hosting today is the use of a virtual private server, or VPS. A VPS is a virtual server that acts like a real computer with its own operating system. While virtual servers share resources that are allocated to them by the host, their software is well isolated, so operations on one VPS won’t affect the others.

Virtual servers are deployed and managed by the hypervisor of a physical host. Each virtual server has an operating system installed by the hypervisor and available to the user to add software on top of. For many practical purposes, a virtual server is identical in use to a dedicated physical server, though performance may be lower in some cases due to the virtual server sharing physical hardware resources with other servers on the same host.

Hosts

Resources are allocated to a virtual server by the physical server that it is hosted on. This host uses a software layer called a hypervisor to deploy, manage, and grant resources to the virtual servers that are under its control. The term “hypervisor” is often used to refer to the physical hosts that hypervisors (and their virtual servers) are installed on.

The host is in charge of allocating memory, CPU cores, and a network connection to a virtual server when one is launched. An ongoing duty of the hypervisor is to schedule processes between the virtual CPU cores and the physical ones, since multiple virtual servers may be utilizing the same physical cores. The method of choice for process scheduling is one of the key differences between different hypervisors.

Hypervisors

There are a few common hypervisors available for cloud hosts today. These virtualization methods have some key differences, but they all provide the tools a host needs to deploy, maintain, move, and destroy virtual servers as needed.

KVM, short for “Kernel-based Virtual Machine”, is a virtualization infrastructure built into the Linux kernel. When activated, this kernel module turns the Linux machine into a hypervisor, allowing it to begin hosting virtual servers. This contrasts with how other hypervisors usually work, as KVM does not need to create or emulate kernel components that are used for virtual hosting.

Xen is one of the most common hypervisors in use today. Unlike KVM, Xen uses a microkernel, which provides the tools needed to support virtual servers without modifying the host’s kernel. Xen supports two distinct methods of virtualization: paravirtualization, which skips the need to emulate hardware but requires special modifications made to the virtual servers’ operating system, and hardware-assisted virtualization, which uses special hardware features to efficiently emulate a virtual server so that they can use unmodified operating systems.

ESXi is an enterprise-level hypervisor offered by VMware. ESXi is unique in that it doesn’t require the host to have an underlying operating system. This is referred to as a “type 1” hypervisor and is extremely efficient due to the lack of a “middleman” between the hardware and the virtual servers. With type 1 hypervisors like ESXi, no operating system needs to be loaded on the host because the hypervisor itself acts as the operating system.

Hyper-V is one of the most popular methods of virtualizing Windows servers and is available as a system service in Windows Server. This makes Hyper-V a common choice for developers working within a Windows software environment. Hyper-V is included in Windows Server 2008 and 2012 and is also available as a stand-alone server without an existing installation of Windows Server.

Why Cloud Hosting?

The features offered by virtualization lend themselves well to a cloud hosting environment. Virtual servers can be configured with a wide range of hardware resource allocations, and can often have resources added or removed as needs change over time. Some cloud hosts can move a virtual server from one hypervisor to another with little or no downtime or duplicate the server for redundancy in case of a node failure.

Flexibility

Developers often prefer to work in a VPS due to the control that they have over the virtual environment. Most virtual servers running Linux offer access to the root (administrator) account or sudo privileges by default, giving a developer the ability to install and modify whatever software they need.

This freedom of choice begins with the operating system. Most hypervisors are capable of hosting nearly any guest operating system, from open source software like Linux and BSD to proprietary systems like Windows. From there, developers can begin installing and configuring the building blocks needed for whatever they are working on. A cloud server’s configurations might involve a web server, database, email service, or an app that has been developed and is ready for distribution.

Scaling

Cloud servers are very flexible in their ability to scale. Scaling methods fall into two broad categories: horizontal scaling and vertical scaling. Most hosting methods can scale one way or the other, but cloud hosting is unique in its ability to scale both horizontally and vertically. This is due to the virtual environment that a cloud server is built on: since its resources are an allocated portion of a larger physical pool, it’s easy to adjust these resources or duplicate the virtual image to other hypervisors.

Horizontal scaling, often referred to as “scaling out”, is the process of adding more nodes to a clustered system. This might involve adding more web servers to better manage traffic, adding new servers to a region to reduce latency, or adding more database workers to increase data transfer speed. Many newer web utilities, like CoreOS, Docker, and Couchbase, are built around efficient horizontal scaling.

Vertical scaling, or “scaling up”, is when a single server is upgraded with additional resources. This might be an expansion of available memory, an allocation of more CPU cores, or some other upgrade that increases that server’s capacity. These upgrades usually pave the way for additional software instances, like database workers, to operate on that server. Before horizontal scaling became cost-effective, vertical scaling was the method of choice to respond to increasing demand.

With cloud hosting, developers can scale depending on their application’s needs — they can scale out by deploying additional VPS nodes, scale up by upgrading existing servers, or do both when server needs have dramatically increased.


By now, you should have a decent understanding of how cloud hosting works, including the relationship between hypervisors and the virtual servers that they are responsible for, as well as how cloud hosting compares to other common hosting methods. With this information in mind, you can choose the best hosting for your needs.

Managing Data in iOS Apps with SQLite

Almost all apps will need to store data of some form. Maybe you need to save user preferences, progress in a game, or offline data so your app can work without a network connection. Developers have a lot of options for managing data in iOS apps, from Core Data to cloud based storage, but one elegant and reliable local storage option is SQLite.

In this tutorial I will show you how to add SQLite support to your app. You can find the final source code on GitHub.

Getting Started

The SQLite library is written in C, and all queries happen as calls to C functions. This makes it challenging to use, as you have to manage pointers, data types, and so on. To help, you can use an Objective-C or Swift wrapper as an adapter layer.

A popular choice is FMDB, an Objective-C wrapper around SQLite. It's easy to use, but personally I prefer not to use hard-coded SQL (Structured Query Language) commands. For this tutorial, I will use SQLite.swift to create a basic contact list.

First, create a new single view project in Xcode (SQLite.swift requires Swift 2 and Xcode 7 or greater). I created a ViewController in Main.storyboard that looks like the below. Create your own similar layout, or download the storyboard files here.

App Preview

At the bottom is a TableView which will hold the contacts.

Installation

You can install SQLite.swift with Carthage, CocoaPods, or manually.

The Model

Create a new Swift file / class named Contact.swift. To keep things simple, it contains just the contact's properties and two initializers.

import Foundation

class Contact {
    let id: Int64?
    var name: String
    var phone: String
    var address: String

    init(id: Int64) {
        self.id = id
        name = ""
        phone = ""
        address = ""
    }

    init(id: Int64, name: String, phone: String, address: String) {
        self.id = id
        self.name = name
        self.phone = phone
        self.address = address
    }
}
The id is required as a parameter when creating an object, so you can reference it in the database later.

Connecting the User Interface

In ViewController.swift, make the class implement the UITableViewDataSource and UITableViewDelegate protocols.

class ViewController: UIViewController, UITableViewDataSource, UITableViewDelegate {

Connect the following IBOutlets with their corresponding views by dragging, or add them manually in code.

@IBOutlet weak var nameTextField: UITextField!
@IBOutlet weak var phoneTextField: UITextField!
@IBOutlet weak var addressTextField: UITextField!

@IBOutlet weak var contactsTableView: UITableView!

Add outlets

Now you will need a list of contacts, and an index for the contact selected from the list.

private var contacts = [Contact]()
private var selectedContact: Int?

Link the DataSource and Delegate of the UITableView with the UIViewController in the storyboard.

Add datasource

Or add the following lines to the viewDidLoad() method of ViewController.swift.

contactsTableView.dataSource = self
contactsTableView.delegate = self

To insert, update and remove elements from the UITableView you need to implement three basic methods from the protocols mentioned above.

The first will fill the UITextFields with the corresponding contact information from a selected contact. It will then save the row that represents this contact in the table.

func tableView(tableView: UITableView, didSelectRowAtIndexPath indexPath: NSIndexPath) {
    nameTextField.text = contacts[indexPath.row].name
    phoneTextField.text = contacts[indexPath.row].phone
    addressTextField.text = contacts[indexPath.row].address

    selectedContact = indexPath.row
}

The next function tells the UITableViewDataSource how many cells of data it should load. For now, it will be zero since the array is empty.

func tableView(tableView: UITableView, numberOfRowsInSection section: Int) -> Int {
    return contacts.count
}

The last function returns a specific UITableViewCell for each row. First get the cell using the identifier, then its child views using their tag. Make sure that the identifiers match your element names.

func tableView(tableView: UITableView, cellForRowAtIndexPath indexPath: NSIndexPath) -> UITableViewCell {

    let cell = tableView.dequeueReusableCellWithIdentifier("ContactCell")!
    var label: UILabel
    label = cell.viewWithTag(1) as! UILabel // Name label
    label.text = contacts[indexPath.row].name

    label = cell.viewWithTag(2) as! UILabel // Phone label
    label.text = contacts[indexPath.row].phone

    return cell
}

The app can now run, but there is no way to add or edit contacts yet. To do this, link the following IBActions with the corresponding buttons.

@IBAction func addButtonClicked() {
    let name = nameTextField.text ?? ""
    let phone = phoneTextField.text ?? ""
    let address = addressTextField.text ?? ""

    let contact = Contact(id: 0, name: name, phone: phone, address: address)
    contacts.append(contact)
    contactsTableView.insertRowsAtIndexPaths([NSIndexPath(forRow: contacts.count-1, inSection: 0)], withRowAnimation: .Fade)
}

Here you take the values of the UITextFields and create an object, which is appended to the contacts list. The id is set to 0, since you haven't implemented the database yet. The function insertRowsAtIndexPaths() takes as arguments an array of indexes of the rows that will be affected, and the animation to perform with the change.

@IBAction func updateButtonClicked() {
    if selectedContact != nil {
        let id = contacts[selectedContact!].id!
        let contact = Contact(
            id: id,
            name: nameTextField.text ?? "",
            phone: phoneTextField.text ?? "",
            address: addressTextField.text ?? "")

        contacts.removeAtIndex(selectedContact!)
        contacts.insert(contact, atIndex: selectedContact!)
        contactsTableView.reloadData()
    } else {
        print("No item selected")
    }
}
In this function you create a new Contact, then delete and re-insert at the same index of the list to make the replacement. The function doesn't currently check whether the data has changed.

@IBAction func deleteButtonClicked() {
    if selectedContact != nil {
        contacts.removeAtIndex(selectedContact!)
        contactsTableView.deleteRowsAtIndexPaths([NSIndexPath(forRow: selectedContact!, inSection: 0)], withRowAnimation: .Fade)
    } else {
        print("No item selected")
    }
}
The last function removes the contact selected and refreshes the table.

At this point the application works, but will lose all changes when relaunched.

Creating a Database

Now it's time to manage the database. Create a new Swift file / class named StephencelisDB.swift and import the SQLite library.

import SQLite

class StephencelisDB {

First, initialize an instance of the class using the ‘Singleton’ pattern. Then declare an object of type Connection, which is the actual database object you will call.

static let instance = StephencelisDB()
private let db: Connection?

The other declarations are the table of contacts and its columns, each with a specific type.

private let contacts = Table("contacts")
private let id = Expression<Int64>("id")
private let name = Expression<String?>("name")
private let phone = Expression<String>("phone")
private let address = Expression<String>("address")

The constructor tries to open a connection to the database, using a specified name and a path to the application data, and then creates the tables.

private init() {
    let path = NSSearchPathForDirectoriesInDomains(
        .DocumentDirectory, .UserDomainMask, true
    ).first!

    do {
        db = try Connection("\(path)/Stephencelis.sqlite3")
    } catch {
        db = nil
        print("Unable to open database")
    }

    createTable()
}

func createTable() {
    do {
        try db!.run(contacts.create(ifNotExists: true) { table in
            table.column(id, primaryKey: true)
            table.column(name)
            table.column(phone, unique: true)
            table.column(address)
        })
    } catch {
        print("Unable to create table")
    }
}

Notice there is no SQL code to create the table and columns. This is the power of the wrapper used. With a few lines of code you have the database ready.

CRUD Operations

For those unfamiliar with the term, ‘CRUD’ is an acronym for Create-Read-Update-Delete. Next, add the four methods to the database class that perform these operations.

func addContact(cname: String, cphone: String, caddress: String) -> Int64? {
    do {
        let insert = contacts.insert(name <- cname, phone <- cphone, address <- caddress)
        let id = try db!.run(insert)

        return id
    } catch {
        print("Insert failed")
        return -1
    }
}
The <- operator assigns values to the corresponding columns, as you would in a normal query. The run method executes these queries and statements. The id of the inserted row is returned from the method.

Add print(insert.asSQL()) to see the executed query itself:

INSERT INTO "contacts" ("name", "phone", "address") VALUES ('Deivi Taka', '+355 6X XXX XXXX', 'Tirana, Albania')

The next method reads the saved contacts. The prepare method returns a list of all the rows in the specified table. You loop through these rows and create an array of Contact objects with the column contents as parameters. If this operation fails, an empty list is returned.

func getContacts() -> [Contact] {
    var contacts = [Contact]()

    do {
        for contact in try db!.prepare(self.contacts) {
            contacts.append(Contact(
                id: contact[id],
                name: contact[name]!,
                phone: contact[phone],
                address: contact[address]))
        }
    } catch {
        print("Select failed")
    }

    return contacts
}

For deleting items, find the item with a given id, and remove it from the table.

func deleteContact(cid: Int64) -> Bool {
    do {
        let contact = contacts.filter(id == cid)
        try db!.run(contact.delete())
        return true
    } catch {
        print("Delete failed")
    }
    return false
}
You can delete more than one item at once by filtering results to a certain column value.

Updating has similar logic.

func updateContact(cid: Int64, newContact: Contact) -> Bool {
    let contact = contacts.filter(id == cid)
    do {
        let update = contact.update([
            name <- newContact.name,
            phone <- newContact.phone,
            address <- newContact.address
        ])
        if try db!.run(update) > 0 {
            return true
        }
    } catch {
        print("Update failed: \(error)")
    }
    return false
}

Final Changes

After setting up the database managing class, there are some remaining changes needed in ViewController.swift.

First, when the view is loaded get the previously saved contacts.

contacts = StephencelisDB.instance.getContacts()

The tableview methods you prepared earlier will display the saved contacts without adding anything else.

Inside addButtonClicked, call the method to add a contact to the database. Then update the tableview only if the method returned a valid id.

if let id = StephencelisDB.instance.addContact(name, cphone: phone, caddress: address) {
    // Add contact in the tableview
}

In a similar way, call these methods inside updateButtonClicked and deleteButtonClicked.

StephencelisDB.instance.updateContact(id, newContact: contact)

Run the app and try to perform some actions. Below are two screenshots of how it should look. To update or delete a contact it must first be selected.



Any Queries?

SQLite is a good choice for working with local data, and is used by many apps and games. Wrappers like SQLite.swift make the implementation easier by avoiding hardcoded SQL queries. If you need to store data in your app and don't want to handle more complex options, then SQLite is worth considering.

Redis or Memcached for caching

Memcached or Redis? It’s a question that nearly always arises in any discussion about squeezing more performance out of a modern, database-driven Web application. When performance needs to be improved, caching is often the first step taken, and Memcached or Redis are typically the first places to turn.

These renowned cache engines share a number of similarities, but they also have important differences. Redis, the newer and more versatile of the two, is almost always the superior choice.

The similarities

Let’s start with the similarities. Both Memcached and Redis serve as in-memory, key-value data stores, although Redis is more accurately described as a data structure store. Both Memcached and Redis belong to the NoSQL family of data management solutions, and both are based on a key-value data model. They both keep all data in RAM, which of course makes them supremely useful as a caching layer. In terms of performance, the two data stores are also remarkably similar, exhibiting almost identical characteristics (and metrics) with respect to throughput and latency.

Both Memcached and Redis are mature and hugely popular open source projects. Memcached was originally developed by Brad Fitzpatrick in 2003 for the LiveJournal website. Since then, Memcached has been rewritten in C (the original implementation was in Perl) and put in the public domain, where it has become a cornerstone of modern Web applications. Current development of Memcached is focused on stability and optimizations rather than adding new features.

Redis was created by Salvatore Sanfilippo in 2009, and Sanfilippo remains the lead developer of the project today. Redis is sometimes described as “Memcached on steroids,” which is hardly surprising considering that parts of Redis were built in response to lessons learned from using Memcached. Redis has more features than Memcached and is, thus, more powerful and flexible.

Used by many companies and in countless mission-critical production environments, both Memcached and Redis are supported by client libraries in every conceivable programming language and included in a multitude of packages for developers. In fact, it's a rare Web stack that does not include built-in support for one or the other.

Why are Memcached and Redis so popular? Not only are they extremely effective, they’re also relatively simple. Getting started with either Memcached or Redis is considered easy work for a developer. It takes only a few minutes to set up and get them working with an application. Thus, a small investment of time and effort can have an immediate, dramatic impact on performance — usually by orders of magnitude. A simple solution with a huge benefit; that’s as close to magic as you can get.

When to use Memcached

Because Redis is newer and has more features than Memcached, Redis is almost always the better choice. However, Memcached can be preferable when caching relatively small, static data, such as HTML code fragments. Memcached's internal memory management, while not as sophisticated as that of Redis, is more efficient in the simplest use cases because it consumes comparatively less memory for metadata. Strings (the only data type supported by Memcached) are ideal for data that's only read, because strings require no further processing.

That said, Memcached’s memory management efficiency diminishes quickly when data size is dynamic, at which point Memcached’s memory can become fragmented. Also, large data sets often involve serialized data, which always requires more space to store. While Memcached is effectively limited to storing data in its serialized form, the data structures in Redis can store any aspect of the data natively, thus reducing serialization overhead.

The second scenario in which Memcached has an advantage over Redis is in scaling. Because Memcached is multithreaded, you can easily scale up by giving it more computational resources, but you will lose part or all of the cached data (depending on whether you use consistent hashing). Redis, which is mostly single-threaded, can scale horizontally via clustering without loss of data. Clustering is an effective scaling solution, but it is comparatively more complex to set up and operate.

When to use Redis

You’ll almost always want to use Redis because of its data structures. With Redis as a cache, you gain a lot of power (such as the ability to fine-tune cache contents and durability) and greater efficiency overall. Once you use the data structures, the efficiency boost becomes tremendous for specific application scenarios.

Redis’ superiority is evident in almost every aspect of cache management. Caches employ a mechanism called data eviction to make room for new data by deleting old data from memory. Memcached’s data eviction mechanism employs a Least Recently Used algorithm and somewhat arbitrarily evicts data that’s similar in size to the new data.

Redis, by contrast, allows for fine-grained control over eviction, letting you choose from six different eviction policies. Redis also employs more sophisticated approaches to memory management and eviction candidate selection. Redis supports both lazy and active eviction, where data is evicted only when more space is needed or proactively. Memcached, on the other hand, provides lazy eviction only.
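For reference, eviction in Redis is controlled by two directives in redis.conf (or at runtime via CONFIG SET). A minimal sketch; the policy names are Redis's own, while the memory cap value is just an example:

```
# Cap the memory Redis may use for data
maxmemory 100mb
# When the cap is hit, evict the least recently used key from the whole keyspace.
# Other policies: volatile-lru, allkeys-random, volatile-random, volatile-ttl, noeviction
maxmemory-policy allkeys-lru
```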

Redis gives you much greater flexibility regarding the objects you can cache. While Memcached limits key names to 250 bytes and works with plain strings only, Redis allows key names and values to be as large as 512MB each, and they are binary safe. Plus, Redis has five primary data structures to choose from, opening up a world of possibilities to the application developer through intelligent caching and manipulation of cached data.

Beyond caching

Using Redis data structures can simplify and optimize several tasks, not only while caching but even when you want the data to be persistent and always available. For example, instead of storing objects as serialized strings, developers can use a Redis Hash to store an object's fields and values and manage them under a single key. Redis Hashes save developers the need to fetch the entire string, deserialize it, update a value, reserialize the object, and replace the entire string in the cache for every trivial update, which means lower resource consumption and increased performance.

Other data structures offered by Redis (such as lists, sets, sorted sets, hyperloglogs, bitmaps, and geospatial indexes) can be used to implement even more complex scenarios. Using sorted sets for time-series data ingestion and analysis is another example where a Redis data structure offers enormously reduced complexity and lower bandwidth consumption.
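The time-series pattern can be sketched with a self-contained stand-in for a sorted set, where `add()` plays the role of ZADD and `range_by_score()` of ZRANGEBYSCORE. This is a toy model, not a Redis client:

```python
import bisect

class SortedSetTS:
    """Toy stand-in for a Redis sorted set used as a time series:
    members are kept ordered by score (here, a timestamp), which
    makes timestamp-range queries cheap."""
    def __init__(self):
        self.entries = []  # list of (score, member), kept sorted

    def add(self, score, member):
        bisect.insort(self.entries, (score, member))   # ~ ZADD

    def range_by_score(self, lo, hi):
        # ~ ZRANGEBYSCORE lo hi: binary-search both bounds.
        i = bisect.bisect_left(self.entries, (lo,))
        j = bisect.bisect_right(self.entries, (hi, chr(0x10FFFF)))
        return [m for _, m in self.entries[i:j]]

ts = SortedSetTS()
for t, reading in [(100, "3.1"), (105, "3.4"), (112, "2.9"), (130, "3.7")]:
    ts.add(t, reading)
print(ts.range_by_score(100, 112))  # ['3.1', '3.4', '2.9']
```

In Redis the same query runs server-side against the sorted set, so only the matching window of readings crosses the network instead of the whole series.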

Another important advantage of Redis is that the data it stores isn’t opaque, so the server can manipulate it directly. A considerable share of the 180-plus commands available in Redis are devoted to data processing operations and embedding logic in the data store itself via server-side Lua scripting. These built-in commands and user scripts give you the flexibility of handling data processing tasks directly in Redis without having to ship data across the network to another system for processing.

Redis offers optional and tunable data persistence designed to bootstrap the cache after a planned shutdown or an unplanned failure. While we tend to regard the data in caches as volatile and transient, persisting data to disk can be quite valuable in caching scenarios. Having the cache’s data available for loading immediately after restart allows for much shorter cache warm-up and removes the load involved in repopulating and recalculating cache contents from the primary data store.

Data replication too

Redis can also replicate the data that it manages. Replication can be used for implementing a highly available cache setup that can withstand failures and provide uninterrupted service to the application. A cache failure falls only slightly short of application failure in terms of the impact on user experience and application performance, so having a proven solution that guarantees the cache’s contents and service availability is a major advantage in most cases.

Last but not least, in terms of operational visibility, Redis provides a slew of metrics and a wealth of introspective commands with which to monitor and track usage and abnormal behavior. Real-time statistics about every aspect of the database, the display of all commands being executed, the listing and managing of client connections — Redis has all that and more.

When developers realize the effectiveness of Redis’ persistence and in-memory replication capabilities, they often use it as a first-responder database, usually to analyze and process high-velocity data and provide responses to the user while a secondary (often slower) database maintains a historical record of what happened. When used in this manner, Redis can also be ideal for analytics use cases.

Redis for analytics

Three analytics scenarios come immediately to mind. In the first, when using something like Apache Spark to iteratively process large data sets, you can use Redis as a serving layer for data previously calculated by Spark. In the second, using Redis as your shared, in-memory, distributed data store can accelerate Spark processing speeds by a factor of 45 to 100. Finally, an all too common scenario is one in which reports and analytics need to be customizable by the user, but retrieving data from inherently batch data stores (like Hadoop or an RDBMS) takes too long. In this case, an in-memory data structure store such as Redis is the only practical way of getting submillisecond paging and response times.

When using extremely large operational data sets or analytics workloads, running everything in-memory might not be cost effective. To achieve submillisecond performance at lower cost, Redis Labs created a version of Redis that runs on a combination of RAM and flash, with the option to configure RAM-to-flash ratios. While this opens up several new avenues to accelerate workload processing, it also gives developers the option to simply run their “cache on flash.”

Open source software continues to provide some of the best technologies available today. When it comes to boosting application performance through caching, Redis and Memcached are the most established and production-proven candidates. However, given its richer functionality, more advanced design, many potential uses, and greater cost efficiency at scale, Redis should be your first choice in nearly every case.

Top Linux Server Distributions

You know that Linux is a hot data center server. You know it can save you money in licensing and maintenance costs. But that still leaves the question of what your best options are for Linux as a server operating system.

We have listed the top Linux Server distributions based on the following characteristics:

  1. Ease of installation and use
  2. Cost
  3. Available commercial support
  4. Data center reliability
Ubuntu LTS


At the top of almost every Linux-related list, the Debian-based Ubuntu is in a class by itself. Canonical’s Ubuntu surpasses all other Linux server distributions — from its simple installation to its excellent hardware discovery to its world-class commercial support, Ubuntu sets a strong standard that is hard to match.


The latest release of Ubuntu, Ubuntu 16.04 LTS “Xenial Xerus,” debuted in April 2016 and ups the ante with OpenStack Mitaka support, the LXD pure-container hypervisor, and Snappy, an optimized packaging system developed specifically for working with newer trends and technologies such as containers, mobile and the Internet of Things (IoT).

The LTS in Ubuntu 16.04 LTS stands for Long Term Support. The LTS versions are released every two years and include five years of commercial support for the Ubuntu Server edition.

Red Hat Enterprise Linux


While Red Hat started out as the “little Linux company that could,” its Red Hat Enterprise Linux (RHEL) server operating system is now a major force in the quest for data center rackspace. The Linux darling of large companies throughout the world, Red Hat’s innovations and non-stop support, including ten years of support for major releases, will keep you coming back for more.
RHEL is based on the community-driven Fedora, which Red Hat sponsors. Fedora is updated more frequently than RHEL and serves as more of a bleeding-edge Linux distro in terms of features and technology, but it doesn't offer the stability or the length and quality of commercial support that RHEL is renowned for.

In development since 2010, Red Hat Enterprise Linux 7 (RHEL 7) made its official debut in June 2014. The major update offers scalability improvements for enterprises, including a new file system that can scale to 500 terabytes, as well as support for Docker container virtualization technology. The most recent release of RHEL, version 7.2, arrived in November 2015.
SUSE Linux Enterprise Server


The Micro Focus-owned (but independently operated) SUSE Linux Enterprise Server (SLES) is stable, easy to maintain and offers 24×7 rapid-response support for those who don’t have the time or patience for lengthy troubleshooting calls. And the SUSE consulting teams will have you meeting your SLAs and making your accountants happy to boot.
Similar to how Red Hat's RHEL is based on the open-source Fedora distribution, SLES is based on the open-source openSUSE Linux distro, with SLES focusing on stability and support over leading-edge features and technologies.

The most recent major release, SUSE Linux Enterprise Server 12 (SLES 12), debuted in late October 2014 and introduced new features like a framework for Docker, full system rollback, live kernel patching enablement and software modules for "increasing data center uptime, improving operational efficiency and accelerating the adoption of open source innovation," according to SUSE. SLES 12 SP1 (Service Pack 1) followed the initial SLES 12 release in December 2015, adding support for Docker, Network Teaming, Shibboleth and JeOS images.


CentOS

If you operate a website through a web hosting company, there's a very good chance your web server is powered by CentOS Linux. This low-cost clone of Red Hat Enterprise Linux isn't strictly commercial, but since it's based on RHEL, you can leverage commercial support for it.

Short for Community Enterprise Operating System, CentOS has largely operated as a community-driven project that used the RHEL code, removed all of Red Hat's trademarks, and made the Linux server OS available for free use and distribution.

In 2014 the focus shifted after Red Hat and CentOS announced they would collaborate going forward, with CentOS serving to address the gap between the community-innovation-focused Fedora platform and the enterprise-grade, commercially deployed Red Hat Enterprise Linux platform.

CentOS will continue to deliver a community-oriented operating system, with a mission of helping users develop and adopt open source technologies on a Linux server distribution that is more consistent and conservative than the more innovative Fedora. At the same time, CentOS will remain free, with support provided by the community-led CentOS project rather than through Red Hat. CentOS 7.2, derived from Red Hat Enterprise Linux 7.2, was released in December 2015.


Debian

If you're confused by Debian's inclusion here, don't be. Debian doesn't have formal commercial support, but you can connect with Debian-savvy consultants around the world via the project's Consultants page. Debian originated in 1993 and has spawned more child distributions than any other parent Linux distribution, including Ubuntu, Linux Mint and Vyatta.

Debian remains a popular option for those who value stability over the latest features. The latest major stable version, Debian 8 "jessie," was released in April 2015 and will be supported for five years. Debian 8 marks the switch from the old SysVinit init system to systemd, and includes the latest releases of the Linux kernel, Apache, LibreOffice, Perl, Python, the Xen hypervisor, the GNU Compiler Collection, and the GNOME and Xfce desktop environments. The latest update for Debian 8, version 8.4, debuted on April 2, 2016.
Oracle Linux


If you didn't know that Oracle produces its own Linux distribution, you're not alone. Oracle Linux (formerly Oracle Enterprise Linux) is Red Hat Enterprise Linux fortified with Oracle's own special Kool-Aid, plus various Oracle logos and artwork. Oracle's Linux competes directly with Red Hat's server distributions, and does so quite effectively, since purchased support through Oracle is half the price of Red Hat's equivalent model.
Optimized for Oracle's database services, Oracle Linux is a heavy contender in the enterprise Linux market. If you run Oracle databases and want to run them on Linux, you know the drill: call Oracle. The latest release of Oracle Linux, version 7.2, arrived in November 2015 and is based on RHEL 7.2.
Mageia / Mandriva


Mageia is an open-source-based fork of Mandriva Linux that made its debut in 2011. The most recent release, Mageia 5, became available in June 2015, and Mageia 6 is expected to debut in late June 2016.
For U.S.-based executive or technical folks, Mageia and its predecessor Mandriva might be a bit foreign. The incredibly well-constructed Mandriva Linux distribution hails from France and enjoys wide acceptance in Europe and South America. The Mandriva name and its construction derive from the Mandrake Linux and Conectiva Linux distributions.

Mageia maintains the strengths of Mandriva while continuing its development with new features and capabilities, as well as support from the community organization Mageia.Org. Mageia updates are typically released on a nine-month cycle, with each release supported for two cycles (18 months).

As for Mandriva Linux, the Mandriva SA company continues its business Linux server projects, which are now based on Mageia code.

Testing a PHP application in GitLab CI

This guide covers basic building instructions for PHP projects.

Two cases are covered: testing with the Docker executor and testing with the Shell executor.

Test PHP projects using the Docker executor

While it is possible to test PHP apps on any system, this would require manual configuration from the developer. To overcome this, we will use the official PHP Docker image found on Docker Hub.

This will allow us to test PHP projects against different versions of PHP. However, not everything is plug 'n' play; you still need to configure some things manually.

As with every build, you need to create a valid .gitlab-ci.yml describing the build environment.

Let's first specify the PHP image that will be used for the build process (you can read more about what an image means in the Runner's lingo by reading about Using Docker images).

Start by adding the image to your .gitlab-ci.yml:

image: php:5.6

The official images are great, but they lack a few useful tools for testing, so we first need to prepare the build environment. One way to do this is to create a script that installs all prerequisites before the actual testing begins.

Let’s create a ci/docker_install.sh file in the root directory of our repository with the following content:


#!/bin/bash

# We need to install dependencies only for Docker
[[ ! -e /.dockerenv ]] && [[ ! -e /.dockerinit ]] && exit 0

set -xe

# Install git (the php image doesn't have it) which is required by composer
apt-get update -yqq
apt-get install git -yqq

# Install phpunit, the tool that we will use for testing
curl --location --output /usr/local/bin/phpunit https://phar.phpunit.de/phpunit.phar
chmod +x /usr/local/bin/phpunit

# Install mysql driver
# Here you can install any other extension that you need
docker-php-ext-install pdo_mysql

You might wonder what docker-php-ext-install is. In short, it is a script provided by the official PHP Docker image that you can use to easily install extensions. For more information, read the documentation at https://hub.docker.com/r/_/php/.

Now that we created the script that contains all prerequisites for our build environment, let’s add it in .gitlab-ci.yml:


before_script:
- bash ci/docker_install.sh > /dev/null


Last step, run the actual tests using phpunit:


test:app:
  script:
  - phpunit --configuration phpunit_myapp.xml


Finally, commit your files and push them to GitLab to see your build succeeding (or failing).

The final .gitlab-ci.yml should look similar to this:

# Select image from https://hub.docker.com/r/_/php/
image: php:5.6

before_script:
# Install dependencies
- bash ci/docker_install.sh > /dev/null

test:app:
  script:
  - phpunit --configuration phpunit_myapp.xml

Test against different PHP versions in Docker builds

Testing against multiple versions of PHP is super easy. Just add another job with a different docker image version and the runner will do the rest:

before_script:
# Install dependencies
- bash ci/docker_install.sh > /dev/null

# We test PHP5.6
test:5.6:
  image: php:5.6
  script:
  - phpunit --configuration phpunit_myapp.xml

# We test PHP7.0 (good luck with that)
test:7.0:
  image: php:7.0
  script:
  - phpunit --configuration phpunit_myapp.xml

Custom PHP configuration in Docker builds

There are times when you will need to customize your PHP environment by putting your .ini file into /usr/local/etc/php/conf.d/. For that purpose, add a before_script action:

before_script:
- cp my_php.ini /usr/local/etc/php/conf.d/test.ini

Of course, my_php.ini must be present in the root directory of your repository.

Test PHP projects using the Shell executor

The shell executor runs your builds in a terminal session on your server. Thus, in order to test your projects you first need to make sure that all dependencies are installed.

For example, in a VM running Debian 8 we first update the cache, then we install phpunit and php5-mysql:

sudo apt-get update -y
sudo apt-get install -y phpunit php5-mysql

Next, add the following snippet to your .gitlab-ci.yml:

test:app:
  script:
  - phpunit --configuration phpunit_myapp.xml

Finally, push to GitLab and let the tests begin!

Test against different PHP versions in Shell builds

The phpenv project allows you to easily manage different versions of PHP, each with its own configuration. This is especially useful when testing PHP projects with the Shell executor.

You will have to install it on your build machine under the gitlab-runner user following the upstream installation guide.

Using phpenv also allows you to easily configure the PHP environment with:

phpenv config-add my_config.ini

Important note: It seems phpenv/phpenv is abandoned. There is a fork at madumlao/phpenv that tries to bring the project back to life, and CHH/phpenv also looks like a good alternative. Any of these tools will work with the basic phpenv commands; guiding you to the right phpenv is out of the scope of this tutorial.

Install custom extensions

Since this is a pretty bare installation of the PHP environment, you may need some extensions that are not currently present on the build machine.

To install additional extensions simply execute:

pecl install <extension>

It's not advisable to add this to .gitlab-ci.yml. You should execute this command once, only to set up the build environment.

Extend your tests

Using atoum

Instead of PHPUnit, you can use any other tool to run unit tests. For example you can use atoum:

before_script:
- wget http://downloads.atoum.org/nightly/mageekguy.atoum.phar

test:atoum:
  script:
  - php mageekguy.atoum.phar

Using Composer

The majority of PHP projects use Composer to manage their packages. To run Composer before your tests, simply add the following to your .gitlab-ci.yml:


# Composer stores all downloaded packages in the vendor/ directory.
# Do not use the following if the vendor/ directory is committed to
# your git repository.
cache:
  paths:
  - vendor/

before_script:
# Install composer dependencies
- curl --silent --show-error https://getcomposer.org/installer | php
- php composer.phar install


Access private packages / dependencies

If your test suite needs to access a private repository, you need to configure the SSH keys in order to be able to clone it.

Use databases or other services

Most of the time you will need a running database in order for your tests to run. If you are using the Docker executor you can leverage Docker’s ability to link to other containers. In GitLab Runner lingo, this can be achieved by defining a service.

This functionality is covered in the CI services documentation.
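As a hedged sketch of what such a service definition can look like (the database name and password variables below are illustrative, not taken from the guide), a job that needs MySQL could declare it in .gitlab-ci.yml like this:

```yaml
services:
- mysql:5.7

variables:
  # Variables understood by the official mysql image
  MYSQL_DATABASE: myapp_test
  MYSQL_ROOT_PASSWORD: secret

test:app:
  script:
  # The service container is reachable from the build
  # container under the hostname "mysql"
  - phpunit --configuration phpunit_myapp.xml
```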

Testing things locally

With GitLab Runner 1.0 you can also test any changes locally. From your terminal execute:

# Check using docker executor
gitlab-ci-multi-runner exec docker test:app

# Check using shell executor
gitlab-ci-multi-runner exec shell test:app

Example project

We have set up an Example PHP Project for your convenience that runs on GitLab.com using our publicly available shared runners.

Want to hack on it? Simply fork it, commit, and push your changes. Within a few moments the changes will be picked up by a public runner and the build will begin.

Decreasing build time from 8 minutes 33 seconds to just 10 seconds

I set up GitLab to host several projects at work and have been quite pleased with it. I read that setting up GitLab CI for testing and deployment was easy, so I decided to try it to automatically run the test suite and build the Sphinx documentation.

I found the official documentation quite good for setting up a runner, so I won't go into details here. I chose the Docker executor.

Here is my first .gitlab-ci.yml test:

image: python:3.4

before_script:
  - pip install -r requirements.txt

test:
  stage: test
  script:
    - python -m unittest discover -v

Success, it works! Nice. But… 8 minutes 33 seconds build time for a test suite that runs in less than 1 second… that’s a bit long.

Let’s try using some caching to avoid having to download all the pip requirements every time. After googling, I found this post explaining that the cache path must be inside the build directory:

image: python:3.4

before_script:
  - export PIP_CACHE_DIR="pip-cache"
  - pip install -r requirements.txt

cache:
  paths:
    - pip-cache

test:
  stage: test
  script:
    - python -m unittest discover -v

With the pip cache, the build time went down to about 6 minutes. A bit better, but far from acceptable.

Of course I knew the problem was not the download but the installation of the pip requirements. I use pandas, which explains why it takes a while: pandas must be compiled.

So how do you install pandas easily? With conda of course! There are even some nice docker images created by Continuum Analytics ready to be used.

So let’s try again:

image: continuumio/miniconda3:latest

before_script:
  - conda env create -f environment.yml
  - source activate koopa

test:
  stage: test
  script:
    - python -m unittest discover -v

Build time: 2 minutes 55 seconds. Nice, but we need some cache to avoid downloading all the packages every time. The first problem is that the cache path has to be inside the build directory, while conda packages are saved in /opt/conda/pkgs by default. A solution is to replace that directory with a link to a local directory. It works, but GitLab makes a compressed archive to save and restore the cache, which takes quite some time in this case.

How to get a fast cache? Let’s use a docker volume! I modified my /etc/gitlab-runner/config.toml to add two volumes:

[runners.docker]
  tls_verify = false
  image = "continuumio/miniconda3:latest"
  privileged = false
  disable_cache = false
  volumes = ["/cache", "/opt/cache/conda/pkgs:/opt/conda/pkgs:rw", "/opt/cache/pip:/opt/cache/pip:rw"]

One volume for conda packages and one for pip. My new .gitlab-ci.yml:

image: continuumio/miniconda3:latest

before_script:
  - export PIP_CACHE_DIR="/opt/cache/pip"
  - conda env create -f environment.yml
  - source activate koopa

test:
  stage: test
  script:
    - python -m unittest discover -v

The build time is about 10 seconds!

Just a few days after my tests, GitLab announced the GitLab Container Registry. I had already thought about building my own Docker image, and this new feature would make that even easier than before. But I would have to remember to update the image whenever my requirements change, which I don't have to think about with the current solution.