Operations is more than just Systems Administration

I think one of the patterns of the last few years has been the democratization of systems administration, especially for web applications. Whether that’s Heroku or Docker, or Chef or Puppet, more and more traditional developers are doing work that would have been somebody else’s problem only a few years ago. But running in parallel to that thread is another less positive trend, that of conflating operations with just systems administration. The story seems to go that now we know Ansible (or some other tool) we just need developers to run the show.

In this post I’m going to try and introduce some of the other operational disciplines, especially for developers who maybe have come to operations via the above resurgence in infrastructure tooling over the past few years.

Note that this post has a slight bias towards more normal organisations. That is to say if you’re in a 5 person software startup you probably don’t have operational problems to worry too much about yet. I’m also not playing down the practice of systems administration; most experienced sysadmins I know are also quite rounded operations pros.

Service Management

If you’ve worked in operations, or in many large organisations, you’ll have come across the term Service Management. This tends to be linked to various service management frameworks, like ITIL or MOF (Microsoft Operations Framework). The framework will describe, often in great detail, activities and processes for things like incident response, configuration management, change management, capacity planning and more.

While I was at The Government I wrote what I think is a reasonable introduction to Service Management, albeit from a specific point of view. This was based on my experience of trying, and likely sometimes failing, to encourage teams to think about how the products they were working on would be run. Each of the topics touched on in the overview is worthy of its own stack of books, but I will repeat the ITIL service list here as (whatever you might think of the framework or a specific implementation) I’ve found it a useful starting point for conversations, in particular for stressing the breadth of topics under service management.

Service Strategy

  • IT service management
  • Service portfolio management
  • Financial management for IT services
  • Demand management
  • Business relationship management

Service Design

  • Design coordination
  • Service Catalogue management
  • Service level management
  • Availability management
  • Capacity Management
  • IT service continuity management
  • Information security management system
  • Supplier management

Service Transition

  • Transition planning and support
  • Change management
  • Service asset and configuration management
  • Release and deployment management
  • Service validation and testing
  • Change evaluation
  • Knowledge management

Service Operation

  • Event management
  • Incident management
  • Request fulfillment
  • Problem management
  • Identity management
  • Continual Service Improvement

For each of the above points, whether you are using ITIL or not, it’s useful to have a conversation. Some of these areas do provide ample opportunity for automation and for using tooling to minimise the effort required. But much of this is about designing how you are going to operate a service throughout its lifetime.

Operations user stories

One of the other things I published while at The Government was a set of user stories for a web operations team. These grew out of work on launching GOV.UK and have had input from various past colleagues. In hindsight I’d probably do some things here differently; the stories assume a certain context which isn’t explicitly spelled out, for instance. But they have a couple of things going for them: they demonstrate how traditional operations activities can be planned out as part of a more developer-friendly planning approach, and they are public and have been tested by more than a single team.

Not everything is a programming problem

The main point I think is that not everything can be turned into a programming problem to solve. Automation has its place, and many manual processes and practices can benefit from automation. But the wide range of activities involved in running a non-trivial and often non-ideal system in production tends to mean making trade-offs and prioritization decisions frequently. This is where softer skills like arguing for funding or additional head count, or building a business case for further work, come into play. Operations management is much more than systems administration.

Further reading

This is little more than a plea for people to think more about operations, separate to the more technical aspects of systems administration. If you’re interested in learning more however I would recommend some good reading material:

  • Visible Ops Handbook - still an excellent and pragmatic introduction to many of the topics noted above.
  • Designing Delivery - a bang up-to-date tome covering a range of service design topics.
  • Basic Service Management - a 50 page starter book covering the fundamentals of service management as generally discussed in more detail elsewhere. A great starting point.

Provisioning droplets with Puppet

I love DigitalOcean for quickly spinning up machines. I also like managing my infrastructure using Puppet. Enter the garethr-digitalocean module. This currently provides a single Puppet type: droplet.

Let’s show a quick example of that, by launching two droplets, called test-digitalocean and test-digitalocean-1.

droplet { ['test-digitalocean', 'test-digitalocean-1']:
  ensure => present,
  region => 'lon1',
  size   => '512mb',
  image  => 14169855,
}

With the above manifest saved as droplets.pp we can run it with:

$ puppet apply --test droplets.pp

This will ensure those two droplets exist in that region, and have that size. If they don’t exist it will launch droplets using the specified image. This means we can run the same command again, and rather than create more instances it will simply report that we already have those droplets.
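The same convergence should work in the other direction. As a minimal sketch, assuming the droplet type honours the conventional ensure => absent value (this post only shows present, so treat the removal behaviour as an assumption to check against the module’s README), the two test droplets could be cleaned up with:

```puppet
# Assumption: the droplet type supports ensure => absent, as most
# ensurable Puppet types do. Only ensure => present is shown in this post.
droplet { ['test-digitalocean', 'test-digitalocean-1']:
  ensure => absent,
}
```

Saved as a separate manifest (the file name is up to you), this would be applied with the same puppet apply command as before.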

Querying resources

Puppet also comes with puppet resource, a handy way of querying the state of a given resource or type. Running the following will list all of your droplets, whether you created them using Puppet or not.

$ puppet resource droplet
droplet { 'test-digitalocean':
  ensure              => 'present',
  backups             => 'false',
  image               => '14169855',
  image_slug          => 'ubuntu-15-10-x64',
  ipv6                => 'true',
  price_monthly       => '10.0',
  private_address     => '10.131.98.186',
  private_networking  => 'true',
  public_address      => '178.62.25.100',
  public_address_ipv6 => '2A03:B0C0:0001:00D0:0000:0000:0090:B001',
  region              => 'lon1',
  size                => '1gb',
}

Mutating resources

The type also supports mutating droplets, for instance changing the size of a droplet if you change the model in Puppet. The API client doesn’t support all possible changes, but you can disable backups, enable IPv6 and switch on private networking as needed. Here’s a quick sample of the output showing this in action.

Info: Loading facts
Notice: Compiled catalog for gareths-macbook.local in environment production in 0.43 seconds
Info: Applying configuration version '1449225401'
Info: Checking if droplet test-digitalocean exists
Info: Powering off droplet test-digitalocean
Info: Resizing droplet test-digitalocean
Info: Powering up droplet test-digitalocean
Notice: /Stage[main]/Main/Droplet[test-digitalocean]/size: size changed '1gb' to '512mb'
Error: Disabling IPv6 for test-digitalocean is not supported
Error: /Stage[main]/Main/Droplet[test-digitalocean]/ipv6: change from true to false failed: Disabling IPv6 for test-digitalocean is not supported
Error: Disabling private networking for test-digitalocean is not supported
Error: /Stage[main]/Main/Droplet[test-digitalocean]/private_networking: change from true to false failed: Disabling private networking for test-digitalocean is not supported
Info: Checking if droplet test-digitalocean-1 exists
Info: Created new droplet called test-digitalocean-1
Notice: /Stage[main]/Main/Droplet[test-digitalocean-1]/ensure: created
Info: Class[Main]: Unscheduling all events on Class[Main]
Notice: Applied catalog in 60.61 seconds
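For contrast, here is a sketch of a manifest that sticks to the changes described as supported: resizing, disabling backups, enabling IPv6 and switching on private networking. The property names are taken from the puppet resource output earlier; passing them as bare booleans, and the '1gb' size being accepted for a resize, are assumptions rather than something confirmed here.

```puppet
# Sketch only: property names come from the `puppet resource droplet`
# output; using bare booleans is an assumption about the type.
droplet { 'test-digitalocean':
  ensure             => present,
  region             => 'lon1',
  size               => '1gb',     # a resize; the log above shows a power cycle
  image              => 14169855,
  backups            => false,     # disabling backups is described as supported
  ipv6               => true,      # enabling; disabling raised an error above
  private_networking => true,      # likewise, only switching on is supported
}
```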

But why?

Describing your infrastructure at this level in code has several advantages:

  • Having a shared model of your infrastructure in code allows for a discussion around that model
  • You can be confident in the model because of the idempotent nature of running the code
  • The use of code for this model allows for activities like code review, change control based on pull requests, unit testing, user-created abstractions and more
  • The use of Puppet means you can use it as above as a command line interface, or run it periodically to enforce and report on the state of your infrastructure
  • Puppet ecosystem tools like PuppetDB, Puppet Board or Puppet Enterprise mean you can store data over time for later analysis
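The periodic enforcement mentioned in the list above can be sketched with Puppet’s built-in cron resource. The manifest path, the 30 minute interval and the location of the puppet binary are all assumptions for illustration, and any API credentials the module needs are not shown.

```puppet
# Hypothetical schedule: re-apply the droplet manifest every 30 minutes
# so drift is corrected and logged. Paths are illustrative assumptions.
cron { 'enforce-droplets':
  ensure  => present,
  user    => 'root',
  minute  => '*/30',
  command => '/opt/puppetlabs/bin/puppet apply --logdest syslog /etc/puppetlabs/code/droplets.pp',
}
```

In a larger setup the same manifest would more likely be served by a Puppet master on the agent’s normal run interval; the cron approach is just the smallest thing that demonstrates the idea.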

The module also acts as a reasonable example of a simple Puppet type and provider. If you’re interested in extending Puppet for your own services this is hopefully a good place to start understanding the API.

Some Security Implications of Unikernels

I was attending the first GOTO London conference last week, in particular the Rugged Track. One of the topics of conversation that came up was unikernels, and their potential for improving the state of software security. Unikernels are pretty new outside research groups; I’m just lucky enough to live and work in Cambridge where some of that research is happening. The security advantages of unikernels are one of the things that attracted me in the first place. I thought it might be interesting to jot a few of those down for other people interested in security and the future of infrastructure.

As with my last post, it’s worth having a basic understanding of unikernels. I’d recommend reading Unikernels - the rise of the virtual library operating system.

Hypervisor

Every unikernel is provided the isolation guarantees from a hypervisor. Not only are these guarantees reasonably well understood, they tend to make use of hardware features too. It’s interesting to note that recent container runtime work is heading in this direction too, with projects like Clear Containers from Intel, Bonneville from VMware and the new stage1 in rkt.

No User Space

With a typical server OS we have kernel space and user space. Part of the idea here is to ensure the underlying machine doesn’t crash, whatever horrible things people do in user space. But this means you can do horrible things. The unikernel model is similar to the Erlang philosophy of let it crash. You only have kernel space, and your entire application resides in it. Most things out of the ordinary are going to crash the kernel. This makes the sort of exploratory testing useful in exploit development harder.

Really Immutable Infrastructure

People often talk about immutable infrastructure. I’d wager there is more talk than reality however. When you push, people are often not using read-only file systems and retain the capability to log in to machines to make ad-hoc changes. What they mean by immutable is that they only change machines at deploy time. This ignores both the fact they have the technical capability to change them anytime, and that an attacker could change them outside that deployment cycle. With unikernel systems there is often just the compiled kernel; you can’t just change files on disk. The defaults force an immutable way of working.

Clean Slate TLS

As a typical developer or operator you’ve probably learned more than you wanted to know about the OpenSSL source code. It’s not well understood and not likely to be so anytime soon and has some pretty spectacular bugs like Heartbleed. The Core Infrastructure Initiative is laudable and will improve things but it’s still a problematic codebase. Functional programming is often regarded as an easier way of writing understandable code. Types are a good thing, especially when it comes to security systems. So a pure OCaml TLS implementation as used by MirageOS makes sense on lots of levels. Yes this is quite an undertaking, but the bitcoin pinata tests show promise.

Formal Proofs

Knowing whether an application really does exactly what you want it to do (and no more) is a hard problem to solve. Unit tests and other forms of automated testing help, but are still reliant on people to both write and design the tests. A formal proof system can provide much stronger guarantees of correctness; it’s an approach used in some cases for mission-critical components of Amazon’s AWS. MirageOS is implemented in OCaml. One of the most popular OCaml programmes is Coq, which just so happens to be a formal proof management system. I’ve not seen many examples yet of this approach, probably due to the effort involved, but the capability is there for building formally specified unikernels. I’d wager a similar thing is possible with Haskell and HalVM. Making that easier to do for typical developers could open up much more secure development practices for certain use cases.

A Discussion of The Operational Challenges With Unikernels

What are Unikernels

Most of this post assumes a basic understanding of what unikernels are so I’d recommend reading Unikernels – the rise of the virtual library operating system before moving on.

Why are Unikernels interesting

As a starting point: complexity. Managing infrastructure, and the software that runs on it, is too complicated. You can impose organisational rules to control this complexity (we only deploy on Debian, we only run JVM applications, the only allowed database is MySQL) but that limits you in other ways too, and in reality is nearly always broken somewhere in any non-trivial environment (this appliance uses Ubuntu, this software is only certified on Windows, PostgreSQL doesn’t run on the JVM). So you turn to software to manage that complexity; Puppet or Chef do a great job of allowing configuration complexity to be managed in code (where you can test it) and Docker allows for bundles of complexity to be isolated from other bundles of complexity. But there are still an awful lot of moving parts.

Another reason is the growing realisation that security is important. Securing systems on the internet is hard. Even though the basics are broadly understood they are often not implemented, and the people attempting to compromise systems are smart, well paid and highly incentivised (basically like you). It’s generally easier to break something than to build it. Part of this is a numbers game – to run a reasonably sized system you might need to run 50 different services, and install 200 packages on every host. An attacker has to compromise just one of those to win.

A further reason, if one were needed, is the proliferation of many small internet connected devices, aka the Internet of Things. Part of this relates to the above points about security concerns, but some of it is simply a matter of managing that many single purpose, low power devices. The overhead of a typical general purpose operating system and application runtime just doesn’t fit this model.

Enter unikernels. Unikernels actually remove unneeded complexity. You’re running a hypervisor and the unikernel and that’s it. The unikernel contains only those libraries that you have specifically required. That drastically reduces the surface area for attack as well as meaning you’re running less software, hopefully enough less that your power needs are reduced too. By specifically requiring individual libraries you’re also making complexity visible. Rather than using a general purpose operating system with its 100s of packages and millions of lines of code you are at least choosing what to include.

Operational challenges

While I think some part of the future looks like unikernels, there are some large operational challenges to overcome before they break out of very specific niches or research projects. Note that there are architectural and software development challenges as well; I just happen to think they’re easier to deal with.

Development environment

There are a few properties of a development environment that I think are essential to modern development; development/production parity being one of the most important. Tools like Vagrant, a move towards infrastructure as code, and more recently Docker have made great strides here in the past several years. The different unikernel implementations are generally based on lesser known software stacks (Haskell, OCaml, Erlang, etc) so some of this is familiarity. But what does development/production parity mean for a unikernel based system? We’re not just talking about the individual unikernel here either – how do I deploy unikernels? How do I compose several unikernels together to build an application? What does a continuous integration or deployment pipeline look like? In my view the unikernel movement should focus some efforts here. Not only will this make it easier for people to get started, but having strong opinions early will allow the nascent community to solve the problem together, rather than everyone solving it just-in-time for themselves.

Managing the hypervisor

I’d argue today most developers don’t spend much time directly working with hypervisors. Either you’re running on an in-house VMware, KVM or Xen install with some (hopefully self-service, automated) provisioning mechanism in place or you’re using a public cloud like AWS, Azure, etc. The current generation of unikernel systems mainly target Xen. I think in the short term at least this means getting to know the hypervisor. Xen is solid software, but I don’t see a great deal of automation around it – say well maintained Puppet modules, API clients or a Terraform provider. In the long term we’ll hopefully have higher level interfaces, but in the short term efforts here would lower the barrier to entry considerably.

Double down on AWS

Given the above, and given the ubiquity of EC2 (which is based on Xen), it might be wise to build up first-class tools around using EC2 as a target environment for unikernel deployments. EC2 supports custom kernels, but these require a number of convoluted steps that could be automated away (note that I’m talking about more than just a shell script here). Also what are the best practices around autoscaling groups and unikernels? Or VPC networks and unikernels?

The network

With the explosion in containers and microservices it’s becoming clearer (if it wasn’t already) how important the network is. By removing the operating system we remove things like host firewalls and the new breed of overlay networks. At the same time if we are to tap the dynamic potential of unikernels we’ll need a similarly dynamic and automatable network. Maybe this becomes more of an application concern, with services communicating via other services which act as firewalls and intelligent proxies, but that still leaves the underlying network to be managed.

Debugging

However much testing you do beforehand you’ll still likely end up with problems in production, and as you scale up you’ll hit issues that you simply can’t recreate outside the live environment. This is where good debugging capabilities come in. While general purpose operating systems might be complex they are well known, and tools like ps, top, free, ping, telnet, netcat, dtrace, etc. are commonly used by anyone debugging systems. Note that in many cases you’re debugging a combination of systems; is the performance issue an application problem, a network problem, a storage problem or some interesting combination of several factors?

By removing the general purpose operating system, unikernel based environments remove most of the current debugging tools at the same time. Part of this is good application development hygiene (logs, metrics and status endpoints for instance), but what about the more interactive debugging practices? What does debugging a system based on unikernels look like?

Orchestration

The word may be overloaded but the need to arrange and manage a number of components that make up a larger system is a real need. This might be something like Docker’s Compose file or Brooklyn’s Blueprints, or it could be something more akin to the APIs from Cloud Foundry, Kubernetes or Mesos. Testing some of these models with unikernel based systems will be an interesting test of how coupled to containers the existing models are. The lack of legacy again opens up the potential to come up with a truly modern alternative here too.

Conclusion

Unless you’re in an environment where security is your number 1 concern then the current state of unikernels probably means choosing to adopt them now is a little bleeding edge. But I think that will change over time as the various projects mature and address some of the issues described above. In the meantime I’d love to see more discussion of some of the operational challenges. I think talking about the needs of operators at this early stage should make the resulting ecosystems more robust when it comes to future production deployments.

Update to Puppet Module Skeleton

Being on holiday last week meant I had a little time for some gardening of open source projects and I decided to update puppet-module-skeleton with some new opinions.

The skeleton is a replacement for the default module skeleton that ships with Puppet and is used by puppet module generate. Unlike the default skeleton this one is super-opinionated. It comes bundled with lots of testing tools, suggestions for documentation, integration with Travis CI, module coverage reports and more.

Updates in the latest version include:

  • Support for Puppet 4 paths
  • The addition of Rubocop, which enforces parts of the Ruby style guide
  • Adding a number of Puppet Lint plugins
  • Allow installing various Puppet versions during integration tests

I also fixed a few reported bugs and extended the test matrix to test across a range of Puppet and Ruby combinations.

The skeleton is intended to help people with a basic understanding of Puppet write better modules, without having to set up everything themselves. You don’t have to agree with all the options to make use of the skeleton as it’s simple enough to delete a few files once you generate your new module. But a working out-of-the-box beaker install, and the ability to automatically run unit tests when files change, are patterns worth adopting for most module developers I think.

If anyone has any suggestions for extra tools, or changes to the skeleton itself, let me know.

Information Security Reading List

I read quite a bit (probably a book a week or so) and one of the topics I’ve been reading on for a while is information security. In a recent conversation someone asked for some book suggestions, so I thought I’d write that up in a blog post rather than an email.

Most of this list isn’t particularly technical. It’s not a developer’s list of software engineering tomes. If you’re a developer or operator then I’d recommend reading some of the more policy or journalistic pieces as well for context. And if you’re just interested in the topic but not particularly technical I’d skip the security engineering suggestions.

Note that I make no claims about this being a particularly balanced list, it’s biased towards what I find interesting to read. Hopefully you’ll find it interesting too.

Journalism

Understanding why Information Security is important tends to require some context. The following books provide that, with detailed real-world stories of criminal and government activities.

  • The Dark Net - Jamie Bartlett - an excellent personal tale of investigating the hidden side of the internet.
  • Spam Nation - Brian Krebs - everything you wanted to know about how and why Spam works.
  • Countdown to Zero Day - Kim Zetter - a detailed and fast paced description of the Stuxnet attack, and its implications.
  • Future Crimes - Marc Goodman - a focus on the criminal possibilities of the modern internet and the internet of things.
  • Worm - Mark Bowden - similar to the excellent tale of Stuxnet above, this is the story of Conficker and how it was discovered

Policy and context

These books are focused more on government policy and nation state threats, and the debate about the rules of war and the internet.

  • Cyber War - Richard Clarke - probably the best description of what cyber war is and isn’t, and some of the geopolitical problems emerging.
  • Cyber War Will Not Take Place - Thomas Rid - a good counter to the above book, with lots more detailed discussion of policy and definition.
  • Inside Cyber Warfare - Jeffrey Carr - really just a run through of current threats, especially organised crime.

Security engineering

  • Security Engineering - Ross Anderson - highly technical and quite epic, but definitely the best security engineering book around.
  • Threat Modelling - Adam Shostack - detailed descriptions of how and why to conduct threat modelling, with lots of examples.
  • Data Driven Security - Jay Jacobs and Bob Rudis - nice examples, including code samples, of applying data and statistics tools and practices to security problems.
  • Cloud Security and Privacy - Tim Mather, Subra Kumaraswamy, Shahed Latif - a good book to read for anyone working in AWS, Azure or similar. Good discussion of concerns and compliance approaches in third party environments.
  • The Tangled Web - Michal Zalewski - everything you ever wanted to know about the browser security model
  • Silence on the Wire - Michal Zalewski - described as a field guide to passive reconnaissance and indirect attacks. Good for starting to think about non-obvious security threats

On my reading list

I’ve not read these books yet so can’t recommend them as such, but they both look good additions to the list above.

  • Data and Goliath - Bruce Schneier - a look at the large scale data collection programmes of governments and their implications for everyone.
  • Black Code - Ronald J. Deibert - the story of the Citizen Lab and its front line cyber researchers

Acceptance testing MirageOS installs

I’m pretty interested in MirageOS at the moment. Partly because I find the idea behind unikernels interesting and partly because I keep bumping into the nice folks from OCaml Labs in Cambridge.

In order to write and build your MirageOS unikernel application you need an OCaml development environment. Although this is documented I wanted something a little more repeatable. I also found and reported a few bugs in the documentation which got me thinking about acceptance testing. I’m not (yet) an OCaml programmer, but infrastructure automation and testing I can do.

Into Puppet

I started out writing a Puppet module to install and manage everything, which is now available on GitHub and on the Forge.

This lets you do something like the following, and have a fully working MirageOS setup on Ubuntu 12.04 or 14.04.

class { 'mirageos':
  user      => 'vagrant',
  opam_root => '/home/vagrant/.opam',
}

Given time, inclination or pull requests I’ll add support for other operating systems in the future.
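For a machine that isn’t a Vagrant box, the same two parameters can point at a dedicated account. This is only a sketch: the mirage user name is made up for illustration, and the module is not assumed to create the user itself, hence the explicit user resource.

```puppet
# Hypothetical account for building unikernels; the name is illustrative.
user { 'mirage':
  ensure     => present,
  managehome => true,
  shell      => '/bin/bash',
}

# Same parameters as the example above, ordered after the user exists.
class { 'mirageos':
  user      => 'mirage',
  opam_root => '/home/mirage/.opam',
  require   => User['mirage'],
}
```

If the module turns out to manage the user already, the explicit user resource would be a duplicate declaration and should be dropped.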

But how do you know it works?

The module has a small unit test suite, but it’s nice to also test the actual running of Puppet and installation of the software. For this I’ve used Test Kitchen and ServerSpec. This allows for spinning up 2 virtual machines (one for each supported operating system), applying the Puppet manifest and then making some assertions.

The assertions simply check whether certain packages are installed, whether the PPA is set up correctly and whether mirage and opam can be executed cleanly.

Can it produce a working unikernel?

The above tells us whether the installation worked, but not whether the resulting software allows us to build MirageOS unikernels. For this I used Bats running in the same Test Kitchen setup.

The Bats test configures and builds a simple HTTP server unikernel, and then checks that when run it returns the expected response on the correct port.

Conclusion

I like the separation of concerns above. I can use the Puppet code without the test code, or even swap the Puppet code out for a shell script if I wanted. I could also run the serverspec tests anywhere I want to check state, which is the reason for separating those tests from the ones building and running a unikernel. Overall the tool chain for ad-hoc infrastructure testing (quick mention of Infrataster too) is really quite powerful and approachable. I’d love to see more software ship with a user-facing test suite for people to verify their installation works.

Automating Windows development environments

My job at Puppet Labs has given me an excuse to take a closer look at the advancements in Windows automation, in particular Chocolatey and BoxStarter. The following is very much a work in progress but it’s hopefully useful for a few things:

  • If like me you’ve mainly been doing non-Windows development for a while it’s interesting to see what is possible
  • If you’re starting out with infrastructure development on Windows the following could be a good starting place
  • If you’re an experienced Windows pro then you can let me know of any improvements

All that’s needed is to run the following from a CMD or PowerShell prompt on a new Windows machine (you can also visit the URL in Internet Explorer if you prefer).

START http://boxstarter.org/package/nr/url?https://gist.githubusercontent.com/garethr/a1838aa68355a0766de4/raw/d92b41ee9dcad68c079d24c64bac7d1d27cf37c7/garethr.ps1

This launches BoxStarter, which executes the script from the gist referenced in that URL.

This takes a while as it runs Windows Update and repeatedly reboots the machine. But once completed you’ll have the software listed in the script installed and configured on a newly up-to-date Windows machine.

Docker, Puppet and shared volumes

During one of the open space sessions at Devopsdays we talked about Docker and configuration management, and one of the things we touched on was using Docker’s shared volumes support. This is easier to explain with an example.

First, let’s create a Docker image to run Puppet. I’m also installing r10k for managing third party modules.

Docker

FROM ubuntu:trusty

RUN apt-get update -q
RUN apt-get install -qy wget
RUN wget http://apt.puppetlabs.com/puppetlabs-release-trusty.deb
RUN dpkg -i puppetlabs-release-trusty.deb
RUN apt-get update

RUN apt-get install -y puppet ruby1.9.3 build-essential git-core
RUN echo "gem: --no-ri --no-rdoc" > ~/.gemrc
RUN gem install r10k

Let’s build that and tag it locally. Feel free to use whatever name you like here.

docker build -t garethr/puppet .

Let’s now use that image as a base for another image.

FROM garethr/puppet

RUN mkdir /etc/shared
ADD Puppetfile /
RUN r10k puppetfile check
RUN r10k puppetfile install
ADD init.pp /
CMD ["puppet", "apply", "--modulepath=/modules", "/init.pp","--verbose", "--show_diff"]

This image will be used to create containers that we intend to run. Here we’re including a Puppetfile (a list of module dependencies) and then running r10k to download those dependencies. Finally we add a simple Puppet manifest, init.pp (this would likely be an entire manifests directory in most cases). The final line means that when we run a container based on this image it will run puppet and then exit.

Again let’s build the image and tag it.

docker build -t garethr/puppetshared .

Just as a demo, here’s a sample Puppetfile which includes the puppetlabs stdlib module.

Puppet

mod 'puppetlabs/stdlib'

And again as an example here’s a simple puppet init.pp file. All we’re doing is creating a file at a specific location.

file { '/etc/shared/client':
  ensure => directory,
}

file { '/etc/shared/client/apache.conf':
  ensure  => present,
  content => "not a real config file",
}

Fig

Fig is a tool to declare container types in a text file, and then run and manage them from a simple CLI. We could do all this with straight docker calls too.

master:
  image: garethr/puppetshared
  volumes:
    - /etc/shared:/etc/shared:rw

client:
  image: ubuntu:trusty
  volumes:
    - /etc/shared/client:/etc/:ro
  command: '/bin/sh -c "while true; do echo hello world; sleep 1; done"'

The important part of the above is the volumes lines. What we’re doing here is:

  • Sharing the /etc/shared directory on the host with the container called master. The container will be able to write to the host filesystem.
  • Sharing a subdirectory of /etc/shared with the client container. The client can only read this information.
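For comparison, here’s a rough sketch of the same two volume mappings as straight docker calls, using the -v flag of docker run. The DOCKER variable and the two function names are my own additions for illustration. Setting DOCKER=echo prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: the Fig definitions rewritten as plain docker run calls.
# DOCKER, run_master and run_client are illustrative names, not part of Fig or Docker.
DOCKER="${DOCKER:-docker}"

# Puppet container: read-write bind mount, so the manifest can write to the host.
run_master() {
  "$DOCKER" run --rm -v /etc/shared:/etc/shared:rw garethr/puppetshared
}

# Application container: read-only view of its own subdirectory.
run_client() {
  "$DOCKER" run -v /etc/shared/client:/etc/:ro ubuntu:trusty \
    /bin/sh -c 'while true; do echo hello world; sleep 1; done'
}
```

Calling run_master followed by run_client should behave much like the two fig run commands used in this post, although the container names will differ from the ones Fig generates.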

Note the client container here isn’t running Puppet. Here it’s just running sleep in a loop to simulate a long running process like your custom application.

Let’s run the master. Note that this will run puppet and then exit. But with the above manifest it will create a config file on the host.

fig run master

Then run the client. This won’t exit and should just print hello world to stdout.

fig run client

Docker 1.3 adds the handy exec command, which allows for one-off commands to be executed within a running container. Let’s use that to see our new config file.

docker exec puppetshared_client_run_1 cat /etc/apache.conf

This should output the contents of the file we created by running the master container.

Why?

This is obviously a very simple example but I think it’s interesting for a few reasons.

  • We have completely separated our code (in the container) from the configuration
  • We get to use familiar tools for managing the configuration in a familiar way

It also raises a few problems:

  • The host needs to know what types of container are going to run on it, in order to have the correct configuration. If you’re using a Puppet module then this is simple enough to solve.
  • The host ends up with all of the configuration for all the containers in one place. You could also encrypt the data and put the relevant keys in one image and not others. Given that if you’re on the host you own the containers anyway, this isn’t as odd as it sounds.
  • We’re just demonstrating files here. If we change our manifest and rerun the Puppet container then the config files change, but depending on the application it won’t pick that up unless we restart it or create a new container.
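One way to approach that last problem is to have something on the host notice when a rendered file changes and restart the affected container. This is a minimal sketch, assuming a simple checksum comparison is good enough. The function names, the state file and the container name in the usage comment are all hypothetical.

```shell
#!/bin/sh
# Sketch only: restart a container when its config file changes.
# config_changed, restart_if_changed and the state file are illustrative names.
DOCKER="${DOCKER:-docker}"

# Returns 0 (true) if the file's checksum differs from the one recorded in
# the state file, then records the new checksum. The first run, when no
# state file exists yet, counts as a change.
config_changed() {
  file=$1
  state=$2
  new=$(cksum < "$file")
  old=$(cat "$state" 2>/dev/null || true)
  printf '%s\n' "$new" > "$state"
  [ "$new" != "$old" ]
}

# Restart the named container only if its config file changed.
restart_if_changed() {
  if config_changed "$1" "$2"; then
    "$DOCKER" restart "$3"
  fi
}

# Usage, after rerunning the Puppet container:
#   restart_if_changed /etc/shared/client/apache.conf /var/tmp/apache.cksum puppetshared_client_run_1
```

Many applications can reload their configuration without a full restart, so the docker restart call here is only a stand-in for whatever reload mechanism suits the application.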

Given enough time I may try to build a reference implementation using this approach. If you have ideas about that, let me know.

This post was inspired by a conversation with Kelsey and John, thanks guys.

Using Puppet with key/value config stores

I like the central idea behind storing configuration in something like Etcd rather than lots of files on lots of disks, but a few challenges still remain. Things that spring to mind are:

  • Are all your passwords now available to all of your nodes?
  • How do I know when configuration changed and who changed it?

I’ll leave the first of those for today (although have a look at Conjur as one approach to this). For the second, I’m quite fond of plain text, pull requests and a well tested deployment pipeline. Before Etcd (or Consul or similar) you would probably have values in Hiera or Data Bags or similar and inject them into files on hosts using your configuration management tool of choice. So let’s just do the same with our new-fangled distributed configuration store.

key_value_config { '/foo':
  ensure   => present,
  provider => etcd,
  value    => 'bar',
}

Say you wanted to switch over to using Consul instead? Just switch the provider.

key_value_config { '/foo':
  ensure   => present,
  provider => consul,
  value    => 'bar',
}
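To check that either provider actually wrote the value, you could read the key back over HTTP. This is a sketch under assumptions: it uses etcd’s v2 keys API and Consul’s KV API, guesses at local addresses and ports (adjust for your setup), and assumes the Consul provider stores '/foo' under the key foo. None of that is confirmed by the module itself, and the helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: build read-back URLs for a key in etcd (v2 API) or Consul.
# Addresses, ports and helper names are assumptions for illustration.
ETCD_ADDR="${ETCD_ADDR:-http://127.0.0.1:2379}"
CONSUL_ADDR="${CONSUL_ADDR:-http://127.0.0.1:8500}"

# etcd v2 exposes keys under /v2/keys/<key>; the response is JSON.
etcd_url() {
  printf '%s/v2/keys/%s\n' "$ETCD_ADDR" "${1#/}"
}

# Consul exposes keys under /v1/kv/<key>; ?raw returns the bare value.
consul_url() {
  printf '%s/v1/kv/%s?raw\n' "$CONSUL_ADDR" "${1#/}"
}

# Usage (needs a running store):
#   curl -s "$(etcd_url /foo)"
#   curl -s "$(consul_url /foo)"
```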

You’d probably move all of that out into something like Hiera, and then generate the above resources, but you get the idea.

etcd_values:
  foo: bar

The above is implemented in a very simple proof of concept Puppet module. If you have any feedback, please do let me know.