Docker, Puppet and shared volumes

During one of the open space sessions at Devopsdays we talked about Docker and configuration management, and one of the things we touched on was using Docker's shared volumes support. This is easier to explain with an example.

First, let's create a Docker image to run Puppet. I'm also installing r10k for managing third-party modules.

Docker

FROM ubuntu:trusty

RUN apt-get update -q
RUN apt-get install -qy wget
RUN wget http://apt.puppetlabs.com/puppetlabs-release-trusty.deb
RUN dpkg -i puppetlabs-release-trusty.deb
RUN apt-get update

RUN apt-get install -y puppet ruby1.9.3 build-essential git-core
RUN echo "gem: --no-ri --no-rdoc" > ~/.gemrc
RUN gem install r10k

Let's build that and tag it locally. Feel free to use whatever name you like here.

docker build -t garethr/puppet .

Let's now use that image as a base for another image.

FROM garethr/puppet

RUN mkdir /etc/shared
ADD Puppetfile /
RUN r10k puppetfile check
RUN r10k puppetfile install
ADD init.pp /
CMD ["puppet", "apply", "--modulepath=/modules", "/init.pp","--verbose", "--show_diff"]

This image will be used to create containers that we intend to run. Here we're including a Puppetfile (a list of module dependencies) and then running r10k to download those dependencies. Finally we add a simple Puppet manifest, init.pp (this would likely be an entire manifests directory in most cases). The final line means that when we run a container based on this image it will run Puppet and then exit.

Again, let's build the image and tag it.

docker build -t garethr/puppetshared .

Just as a demo, here's a sample Puppetfile which includes the puppetlabs stdlib module.

Puppet

mod 'puppetlabs/stdlib'

And again as an example, here's a simple Puppet init.pp file. All we're doing is creating a directory and a file at a specific location.

file { '/etc/shared/client':
  ensure => directory,
}

file { '/etc/shared/client/apache.conf':
  ensure  => present,
  content => "not a real config file",
}

Fig

Fig is a tool to declare container types in a text file, and then run and manage them from a simple CLI. We could do all this with straight docker calls too.

master:
  image: garethr/puppetshared
  volumes:
    - /etc/shared:/etc/shared:rw

client:
  image: ubuntu:trusty
  volumes:
    - /etc/shared/client:/etc/:ro
  command: '/bin/sh -c "while true; do echo hello world; sleep 1; done"'

The important part of the above is the volumes lines. What we're doing here is:

  • Sharing the /etc/shared directory on the host with the container called master. The container will be able to write to the host filesystem.
  • Sharing a subdirectory of /etc/shared with the client container. The client can only read this information.

Note the client container here isn't running Puppet. Here it's just running sleep in a loop to simulate a long running process like your custom application.
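
For comparison, here's roughly what those two Fig services look like as straight docker calls. This is a sketch rather than a description of what Fig does under the hood: the DOCKER variable is something I've made up so that the script only prints the commands by default, and the container name client is arbitrary.

```shell
#!/bin/sh
# DOCKER is a dry-run switch invented for this sketch: by default the
# commands are only printed. Set DOCKER=docker to run them for real.
DOCKER=${DOCKER:-echo docker}

run_master() {
  # Read-write bind mount: Puppet inside the container writes to the
  # host's /etc/shared, then the container exits and is removed.
  $DOCKER run --rm -v /etc/shared:/etc/shared:rw garethr/puppetshared
}

run_client() {
  # Read-only bind mount of the client subdirectory over the container's /etc.
  $DOCKER run --name client -v /etc/shared/client:/etc/:ro ubuntu:trusty \
    /bin/sh -c "while true; do echo hello world; sleep 1; done"
}

run_master
run_client
```

The volume flags are the same host:container:mode triples used in the Fig file, so the two approaches should behave the same way as far as the shared directories are concerned.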

Let's run the master. Note that this will run puppet and then exit. But with the above manifest it will create a config file on the host.

fig run master

Then run the client. This won't exit and should just print hello world to stdout.

fig run client

Docker 1.3 adds the handy exec command, which allows for one-off commands to be executed within a running container. Let's use that to see our new config file.

docker exec puppetshared_client_run_1 cat /etc/apache.conf

This should output the contents of the file we created by running the master container.

Why?

This is obviously a very simple example but I think it's interesting for a few reasons.

  • We have completely separated our code (in the container) from the configuration
  • We get to use familiar tools for managing the configuration in a familiar way

It also raises a few problems:

  • The host needs to know what types of container are going to run on it, in order to have the correct configuration. If you're using Puppet modules then this is simple enough to solve.
  • The host ends up with all of the configuration for all the containers in one place. You could also encrypt the data and keep the relevant keys in one image and not others. Given that if you're on the host you own the containers anyway, this isn't as odd as it sounds.
  • We're just demonstrating files here, but if we change our manifest and rerun the Puppet container then we change the config files. Depending on the application, though, it won't pick that up unless we restart it or create a new container.
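
One way to handle that last point, assuming the application only reads its config at start-up, is to rerun the Puppet container and then restart the long-running one. A minimal sketch: the DOCKER variable is a dry-run switch invented for this example, refresh_config is a hypothetical helper, and the container name is the one used with docker exec earlier.

```shell
#!/bin/sh
# DOCKER is a dry-run switch invented for this sketch: by default the
# commands are only printed. Set DOCKER=docker to run them for real.
DOCKER=${DOCKER:-echo docker}

refresh_config() {
  container=$1
  # Re-apply the manifest; this rewrites the files under /etc/shared on the host.
  $DOCKER run --rm -v /etc/shared:/etc/shared:rw garethr/puppetshared || return 1
  # Restart the long-running container so the application re-reads its config.
  $DOCKER restart "$container"
}

refresh_config puppetshared_client_run_1
```

An application that can reload its config on a signal wouldn't need the restart, but that depends entirely on the application.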

Given enough time I may try to build a reference implementation using this approach. Anyone with ideas about that, let me know.

This post was inspired by a conversation with Kelsey and John, thanks guys.

Using Puppet with key/value config stores

I like the central idea behind storing configuration in something like Etcd rather than lots of files on lots of disks, but a few challenges still remain. Things that spring to mind are:

  • Are all your passwords now available to all of your nodes?
  • How do I know when configuration changed and who changed it?

I'll leave the first of those for today (although have a look at Conjur as one approach to this). For the second, I'm quite fond of plain text, pull requests and a well tested deployment pipeline. Before Etcd (or Consul or similar) you would probably have values in Hiera or Data Bags or similar and inject them into files on hosts using your configuration management tool of choice. So let's just do the same with our new-fangled distributed configuration store.

key_value_config { '/foo':
  ensure   => present,
  provider => etcd,
  value    => 'bar',
}

Say you wanted to switch over to using Consul instead? Just switch the provider.

key_value_config { '/foo':
  ensure   => present,
  provider => consul,
  value    => 'bar',
}

You'd probably move all of that out into something like hiera, and then generate the above resources, but you get the idea.

etcd_values:
  foo: bar

The above is implemented in a very simple proof of concept Puppet module. Anyone with any feedback please do let me know.
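
To check what a resource like the ones above ended up writing, you can read the key back over HTTP. The sketch below assumes etcd's v2 keys API on port 2379 (older releases listened on 4001) and Consul's KV API on port 8500, both on localhost. The CURL variable is a dry-run switch I've invented so the script prints the commands unless you set CURL=curl, and read_key is a hypothetical helper.

```shell
#!/bin/sh
# CURL is a dry-run switch invented for this sketch: by default the
# commands are only printed. Set CURL=curl to query the stores for real.
CURL=${CURL:-echo curl}

read_key() {
  provider=$1
  key=$2
  case "$provider" in
    # etcd v2 keys API; -L follows a redirect to the cluster leader.
    etcd)   $CURL -sL "http://127.0.0.1:2379/v2/keys/${key}" ;;
    # Consul KV API; ?raw returns the bare value instead of JSON.
    consul) $CURL -s "http://127.0.0.1:8500/v1/kv/${key}?raw" ;;
    *)      echo "unknown provider: $provider" >&2; return 1 ;;
  esac
}

read_key etcd foo
read_key consul foo
```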

Leaving GDS never easy

The following is the email I sent to lots of my colleagues at the Government Digital Service last week.

So, after 3 rather exciting years I've decided to leave GDS.

That's surprisingly difficult to write if I'm honest.

I was part of the team that built and shipped the beta of GOV.UK. Since then I've worked across half of what has become GDS, equally helping and frustrating (then hopefully helping) lots of you. I've done a great deal and learnt even more, as well as collected arcane knowledge about government infosec and the dark arts of procurement along the way. I've done, and helped others do, work I wouldn't even have thought possible (or maybe likely is the right word?) when I started.

So why leave? For all the other things I do I'm basically a tool builder. So I'm going off to work for Puppet Labs to build infrastructure tools that don't suck. That sort of pretty specific work is something that should be done outside government in my opinion. I'm a pretty firm believer in "government should only do what only government can do" (design principle number 2 for those that haven't memorised them yet). And if I'm honest, focusing on something smaller than fixing a country's civic infrastructure is appealing for now.

I'll let you in on a secret; I didn't join what became GDS because of the GOV.UK project. I joined to work with friends I'd not yet had the chance to work with and to see the inside of a growing organisation from the start. I remember Tom Loosemore promising me we'd be 200 people in 3 years! As far as anyone knows we're 650+ people. That's about a person a day for 2 years. I'm absolutely not saying that came without a cost, but for me being part of that was part of the point - so I can be a little accepting with hindsight.

For me, apart from all the short term things (side-note: this job now has me thinking £10million is a small amount of money and 2 years is a small amount of time) there is one big mission:

Make government a sustainable part of the UK tech community

That means in 10 years' time experienced tech people from across the country, as well as people straight from university, choosing to work for government. Not just for some abstract and personal reason (though that's fine too), but because it's a genuinely interesting place to work. That one's on us.

Using OWASP ZAP from the command line

I'm a big fan of OWASP ZAP, or the Zed Attack Proxy. It's surprisingly user friendly and nicely pulls off its aim of being useful to developers as well as more hardcore penetration testers.

One of the features I'm particularly fond of is the aforementioned proxy. Basically it can act as a transparent HTTP proxy, recording the traffic, and then analyse that to conduct various active security tests; looking for XSS issues or directory traversal vulnerabilities for instance. The simplest way of seeding the ZAP with something to analyse is using the simple inbuilt spider.

So far, so good. Unfortunately ZAP isn't designed to be used from the command line. It's either a thick client, or it's a proxy with a simple API. Enter Zapr.

Zapr is a pretty simple wrapper around the ZAP API (using the owasp_zap library under the hood). All it does is:

  • Launch the proxy in headless mode
  • Trigger the spider
  • Launch various attacks against the collected URLs
  • Print out the results

This is fairly limited, in that a spider isn't going to work particularly well for a more interactive application, but it's a fairly good starting point. I may add different seed methods in the future (or would happily accept pull requests). Usage wise it's as simple as:

zapr --summary http://localhost:3000/

That will print you out something like the following, assuming it finds an issue.

+-----------------------------------+----------+----------------------------------------+
| Alert                             | Risk     | URL                                    |
+-----------------------------------+----------+----------------------------------------+
| Cross Site Scripting (Reflected)  | High     | http://localhost:3000/forgot_password  |
+-----------------------------------+----------+----------------------------------------+

The above alert is taken from a simple example, using the RailsGoat vulnerable web application as a scapegoat. You can see the resulting output from Travis running the tests.

Zapr is a bit of a proof of concept so it's not particularly robust or well tested. Depending on usage and interest I may tidy it up and extend it, or I may leave it as a useful experiment and try and finally get ZAP support into Gauntlt. Only time will tell.

Consul, DNS and Dnsmasq

While at Craft I decided to have a quick look at Consul, a new service discovery framework with a few interesting features. One of the main selling points is a DNS interface with a nice API. The Introduction shows how to use this via the dig command line tool, but how do you use a custom internal DNS server without modifying all your applications? One answer to this question is Dnsmasq.

I'm not explaining Consul here; the above-mentioned introduction does a good job of stepping through the setup. The following assumes you have installed and started Consul.

Installation and configuration

I'm running these examples on an Ubuntu 14.04 machine, but dnsmasq should be available and packaged for lots of different operating systems.

apt-get install dnsmasq

Once installed we can create a very simple configuration.

echo "server=/consul/127.0.0.1#8600" > /etc/dnsmasq.d/10-consul

All we're doing here is specifying that DNS requests for consul services are to be dealt with by the DNS server at 127.0.0.1 on port 8600. Unless you've changed the consul defaults this should work.
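
The rule has the shape server=/<domain>/<address>#<port>, so if you have changed the Consul defaults it's easy to generate the right line. A small sketch; consul_forward_rule is a hypothetical helper of mine, not part of dnsmasq or Consul.

```shell
#!/bin/sh
# Build a dnsmasq forwarding rule: queries for names under <domain>
# are sent to the DNS server at <address> on <port>.
consul_forward_rule() {
  domain=$1
  address=$2
  port=$3
  printf 'server=/%s/%s#%s\n' "$domain" "$address" "$port"
}

consul_forward_rule consul 127.0.0.1 8600
# prints: server=/consul/127.0.0.1#8600

# To apply it (as root, on the Ubuntu 14.04 machine used here):
#   consul_forward_rule consul 127.0.0.1 8600 > /etc/dnsmasq.d/10-consul
#   service dnsmasq restart
```

dnsmasq reads its configuration files at start-up, which is why the restart is there after writing the file.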

Just in case you prefer Puppet, there is already a handy dnsmasq module. The resulting Puppet code then looks like this.

include dnsmasq
dnsmasq::conf { 'consul':
  ensure  => present,
  content => 'server=/consul/127.0.0.1#8600',
}

Usage

The examples from the main documentation specify a custom DNS server for dig like so:

dig @127.0.0.1 -p 8600 web.service.consul

With Dnsmasq installed and configured as above you should just be able to do the following:

dig web.service.consul

And now any of your existing applications will be able to use your consul instance for service discovery via DNS.