LXD and Ansible for staging and development

Today I'll discuss a technique I use at my current gig to simulate our production environment using LXD.

LXD is a container hypervisor by Canonical. It's a bit experimental, but it feels a lot more attractive than Docker. It's built on top of LXC, which allows users to run unprivileged containers. Unlike Docker, the LXC philosophy is that an entire OS should be able to run in a container, including its init system. This is similar to the systemd-nspawn philosophy, which we also evaluated, but because we are on Ubuntu LTS we still use Upstart, so that was not an option.

LXC is a bit raw: you simply run root filesystems as containers, and that's about all it gives you. It does, however, create a bridged network and lets you assign IPs and domain names to containers with dnsmasq, either via DHCP or with fixed IPs.

We wanted a private network for our containers that simulates our production environment, and we didn't opt for Docker because we couldn't find a way to configure its network as easily as with LXC, where it comes down to writing a few dnsmasq config files that most sysadmins are already familiar with.

Because LXD containers are just plain Ubuntu cloud instances (with Upstart and everything), we can easily provision them with Ansible, which we already use for our production environment. It's simply a matter of creating a new inventory file and we're all set.

Importing an image

Before we start, make sure your user is in the lxd group:

$ newgrp lxd
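
If your user is not a member of that group yet, add it first; the usermod call below is one standard way to do that (newgrp, or logging out and back in, then makes the new membership effective in your shell):

$ sudo usermod -aG lxd "$USER"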

Let's start with some basics: how do we create a container? We can download images from the image repository using the lxd-images command, or import an existing base image into LXD using lxc image import.

At work we use a predefined base image, which is simply a tarball containing a rootfs and some cloud-config template files. The templates are used, for example, to set the hostname of the container.

templates/  
├── cloud-init-meta.tpl
├── cloud-init-user.tpl
├── cloud-init-vendor.tpl
└── upstart-override.tpl
rootfs/  
├── bin
├── boot
├── dev
├── etc
├── home
├── lib
├── lib64
├── lost+found
├── media
├── mnt
├── opt
├── proc
├── root
├── run
├── sbin
├── srv
├── sys
├── tmp
├── usr
└── var
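
Packing such an image is then just a matter of tarring everything up. As a rough sketch, assuming the tarball also contains the metadata.yaml that lxc image import expects (it isn't shown in the listing above):

$ sudo tar czf base.tar.gz metadata.yaml templates/ rootfs/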

To import a base image we simply do:

$ lxc image import base.tar.gz --alias=base

Or if you don't have a base image at hand, you can download one:

$ lxd-images import ubuntu --alias=base
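
Either way, lxc image list should now show the image under its alias:

$ lxc image list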

Creating a container

Well that's super easy!

$ lxc launch base my-container

And we're in!

$ lxc exec my-container bash
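
From the host, lxc list gives a quick overview of the container, including its state and the IP address it has been assigned:

$ lxc list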

You should have network connectivity now and be able to install packages with apt-get. You can set up users, add SSH keys, and so on. But of course we want to automate all of this, and that is where Ansible comes into play. Before we get to that, though, we need to do some network configuration.

Networking

Make sure that both the dnsmasq and lxc-net services are running:

# service dnsmasq restart
# service lxc-net restart

Now edit /etc/default/lxc-net.

Make sure the following line is uncommented, so that the lxc-net daemon automatically creates a bridged network for your containers:

USE_LXC_BRIDGE="true"

Next in the file comes the configuration of the private network for your containers. You can leave the defaults as they are or change the network. We decided to use the 192.168.2.0/24 subnet for our containers, which gives the following config:

LXC_BRIDGE="lxcbr0"  
LXC_ADDR="192.168.2.1"  
LXC_NETMASK="255.255.255.0"  
LXC_NETWORK="192.168.2.0/24"  
LXC_DHCP_RANGE="192.168.2.2,192.168.2.254"
LXC_DHCP_MAX="253"

Furthermore, make sure that the LXC_DOMAIN setting is uncommented. It tells dnsmasq to assign containers <containername>.lxc domain names:
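
LXC_DOMAIN="lxc"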

To make sure dnsmasq actually does this, we also have to edit the dnsmasq config in /etc/dnsmasq.d/lxc. Set the server to whatever you set as LXC_ADDR, in our case 192.168.2.1:

bind-interfaces  
except-interface=lxcbr0  
server=/lxc/192.168.2.1  

Whenever you edit the dnsmasq or lxc-net configs, make sure you restart the services so the changes take effect:

# service lxc-net restart
# service dnsmasq restart

Now if we restart our container, we should be able to connect to it!

$ lxc restart my-container
$ ping my-container.lxc
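
If the name doesn't resolve, you can query the lxc-net dnsmasq instance directly to narrow the problem down (192.168.2.1 being the LXC_ADDR we configured above); dig is just one convenient tool for this:

$ dig @192.168.2.1 my-container.lxc +short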

If you installed an SSH server on the container and added your SSH key to a user, you should be able to SSH into it as well. (An SSH server is pre-installed on the Ubuntu base image, and the SSH keys of all our developers are already in our base image.)

$ ssh dev@my-container.lxc

Nitpick with launching new containers

Currently there is a little bug in LXD that causes containers not to register with dnsmasq on first launch. So if you launch a new container, make sure to restart it immediately so that it registers a DNS name.

$ lxc launch base new-container && lxc restart new-container

Provisioning with Ansible

Ansible provisioning is really easy now. Create a container for each server you want to run in your development environment:

$ lxc launch base frontend && lxc restart frontend
$ lxc launch base postgres && lxc restart postgres
$ lxc launch base workers  && lxc restart workers

And make a new inventory file, for example named dev:

[frontend]
frontend.lxc  
[postgres]
postgres.lxc  
[workers]
workers.lxc  
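
Before running the full playbook, a quick ad-hoc ping is a cheap way to confirm that Ansible can reach all three containers (assuming the dev user from the base image):

$ ansible all -i ./inventory/dev -m ping -u dev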

Now simply run your Ansible playbook:

$ ansible-playbook --ask-sudo-pass ./provision.yml -i ./inventory/dev -e development=true

Your containers should be provisioned now!

Staging server

At work we also use this technique for our staging environment. We have a staging server running at staging.internal with Ansible and LXC installed, and we log into it with SSH agent forwarding. The base image contains the public keys of our development machines, so with agent forwarding we can provision the containers from the staging server.

$ ssh -A dev@staging.internal

Once we're in, we can simply start new containers and provision them with Ansible as shown above.
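
For example, bringing up and provisioning a fresh frontend container from the staging server looks exactly like it does locally, reusing the names from the sections above:

$ lxc launch base frontend && lxc restart frontend
$ ansible-playbook --ask-sudo-pass ./provision.yml -i ./inventory/dev -e development=true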

Troubleshooting

Sometimes LXD can be a bit grumpy (it's not fully stable yet) and might not always succeed in claiming a domain name. In that case I usually first try to restart the container with lxc restart containername, and if that doesn't work I restart both dnsmasq and lxc-net just to be sure.
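
In shell form, that recovery sequence is simply (containername being whichever container lost its name):

$ lxc restart containername
# service lxc-net restart
# service dnsmasq restart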