A Continuous Deployment Example Setup

One of the reasons I finally got around to building Vagrantbox.es recently was that I was giving a talk to a group of startups on The Difference Engine programme and wanted an example project to demonstrate various things: sensible version control habits, configuration management, basic orchestration and, most importantly, a solid deployment process. I’ve decided to write up what I’m doing for deployment because I think it’s pretty nice, and for all the talk about Continuous Deployment I haven’t seen many examples of code and configuration to make it happen.

Most of what I’ll cover is pretty easy to map to whatever technologies you’re using. For this project I’ve gone for Git, Django, Gunicorn, Nginx, Fabric, MySQL and Jenkins, and I’m deploying to Ubuntu running on Brightbox Cloud. Apart from the Jenkins instance in the middle you could follow the instructions and swap things out easily.


First up let’s install Jenkins. I set up a separate cloud instance just to run the Continuous Integration server. I find this approach easier to manage but you could always run this locally if you prefer. The Jenkins folk provide very up to date packages for Debian so I chose to use those.


Jenkins provides a huge number of optional plugins which enable various additional features. Plugins are installed via the web interface at /pluginManager. I’ve installed:

Only the Git plugin is really required for what I’m doing with deployment. Cobertura and Violations are code quality metrics tools that I use to record output from pylint and code coverage for my test suite.

The Source

My finished project was already on GitHub in a private repository. I’m using a requirements.txt file to record Python dependencies so I can use pip to install them automatically and I’m using Virtualenv to sandbox this installation. I’m also using South to manage my database schema changes. I won’t go into that here as it’s pretty Python specific. Rails, for instance, has Active Record migrations, RVM and Bundler, which do pretty much the same job. PHP has PEAR and some of the frameworks offer a migration tool.

I then created two projects in Jenkins:

Jenkins dashboard

Project 1: Vagrantboxes

This is the main build of my master branch in Git. As well as setting up the Git repo as shown below I’ve set a polling schedule to */5 * * * * (that’s every 5 minutes) and also set Trigger builds remotely so I can have a task in my fabfile which triggers a build immediately.

Git config for Jenkins
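
The remote trigger is just an HTTP request, so a fabfile task for it can be very small. What follows is a rough sketch rather than the task from the actual fabfile: it assumes the usual Jenkins convention of a /job/<name>/build URL with a token parameter, and the host name and token are placeholders. A secured Jenkins instance may need credentials on top of this.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_trigger_url(jenkins_url, job_name, token):
    """Return the URL that asks Jenkins to queue a build of job_name."""
    base = jenkins_url.rstrip("/")
    # Job names can contain spaces, so escape the path segment.
    return "%s/job/%s/build?%s" % (base, quote(job_name), urlencode({"token": token}))


def trigger_build(jenkins_url, job_name, token):
    """Send the request and return the HTTP status Jenkins answers with."""
    with urlopen(build_trigger_url(jenkins_url, job_name, token)) as response:
        return response.status


# Hypothetical host and token, for illustration only.
print(build_trigger_url("http://ci.example.com", "Vagrantboxes", "secret"))
```

Keeping the URL construction separate from the request makes it easy to check the address by eye before pointing it at a real server.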

I then have two build steps, both of which execute shell commands. The first installs any new requirements via pip:

bash -l -c "source bin/activate; pip install -r requirements.txt"

The second runs my test suite and generates the XML output required to show the test results in Jenkins:

bash -l -c "source bin/activate; cd vagrantboxes/configs/common; python manage.py jenkins boxes"

I’m using the rather handy Django Jenkins application for this.

So far so good. This gives us a project that, when we push some changes to GitHub, will pull those changes down to the CI server and run our test suite, giving us feedback as to whether the tests pass or fail.

Now for the trick: in Post-build Actions, tick Build other projects and specify the name of another project that we’ll set up next. Mine is called Vagrantboxes-deploy.

Post build action in Jenkins

Project 2: Vagrantboxes-deploy

This project is triggered only when the previous project runs successfully, and all it’s going to do is run the deployment script on the project we just built. The setup for this project is very simple: it has one build step which just executes the following:

bash -l -c "cd /var/lib/jenkins/jobs/Vagrantboxes/workspace; source bin/activate; fab appserver deploy"

The specifics of the Fabric script here aren’t that important but I’m doing something not too dissimilar to what I described here.

The reason I’ve set up a separate project for these is so I can, if I choose, trigger a deployment separately to the full build, and also so I can very easily disable deployments even if the main build is still running.


With this setup whenever I push code to master it triggers a build. If the test suite passes it runs the deployment script and pushes out the code to the live web servers. This suits me and this project but you might find it easier to start by pushing all successful builds out to a staging environment, and maybe then moving on to having a new project which is only triggered manually for deploying to production.
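
The important property is the gate: a later step only runs if everything before it succeeded. Jenkins handles that through the chained projects, but the idea can be sketched in a few lines of Python. This is an illustration of the logic only, with stand-in commands where the real test run and fab appserver deploy would go; the function name is made up.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run each command in order, stopping at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, command in steps:
        result = subprocess.run(command)
        if result.returncode != 0:
            # A failed step blocks everything after it, including deploys.
            break
        completed.append(name)
    return completed


# Stand-in commands that simply exit with a success or failure code.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(run_pipeline([("test", passing), ("deploy", passing)]))
print(run_pipeline([("test", failing), ("deploy", passing)]))
```

Adding a staging environment is then just another step in the list, placed before the production deploy.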

project view in Jenkins

This setup has other advantages too. The Jenkins dashboard becomes a handy tool for recording deployment events. You can easily set up emails or IM messages or Campfire posts to alert other team members whenever a deployment happens. And it really, really makes sure your deployment scripts work without hand holding.

This is a simple project that I’m working on on my own, but in a team environment you’d likely have a more complex branching strategy and more Jenkins projects. You might also introduce some gateways for manual testing but the starting point is the same. This setup has a few race condition possibilities that you can fix by building artifacts from successful builds, and Jenkins makes archiving successful build artifacts relatively easy. Jenkins also supports building from different branches and having different branches trigger different projects, all handy if you want to grow this kind of setup.

Site For Vagrant Base Boxes

A brief conversation with Matt Keating on Twitter finally pushed me over the edge and I’ve built a site I’d been meaning to do for a while.

I’m a huge Vagrant fan, but one thing that often comes up is where to find base boxes. My newly launched site Vagrantbox.es provides just that. At the moment that just means user submitted boxes being checked and then posted. I’ll likely add comments and ratings and the like if things become popular but that’s for later.

vagrantbox.es homepage

So, if you know of or host a useful box please let me know. I’ll try to keep up with any submissions.

Devops - More Than Marketing - Talk By James Turnbull

I’ve just found my notes from James Turnbull’s talk at FOSDEM. I found the talk excellent, and I’m already part of the choir. But much of the audience I’d guess have only come across the devops term in passing, or worse had it pushed at them as part of marketing materials. Hopefully I captured the main points:

So what is devops all about?

  • Cooperation (between development and operations teams)
  • Buzzword bingo?
  • Pop culture movement?
  • Discussion
  • It’s early days
  • No one has all the answers
  • Nothing is fixed in stone
  • It’s all about outreach

It’s about

  • Simplicity - Repeatable, Reusable, Easy to communicate
  • Relationships - Engage early, engage often, “Toss it over the fence”, Talk to people
  • Process - Test everything, Automate everything, Redundancy and expectation of failure, Transparent and open to everyone


  • Not just ops tools - Config management, Deployment and orchestration, Monitoring, Security, Testing
  • Use for entire lifecycle dev -> test -> ops
  • Not just dev tools - Version control, Agile, Application architecture
  • Testing methodology - Low level vs functional
  • Documentation - “The only time the network diagram is up to date is after the post mortem”

Continuous improvement

  • Nothing stands still - Customers, Products, Technology, Your team
  • Strike often, strike hard, be aggressive

It’s a culture change

  • This is Hard
  • People hate change
  • People hate people who introduce change
  • Fear of change is irrational - Listen, Concrete examples
  • Make developers responsible for uptime - Pagers


  • “We’ve always done this”
  • “That can’t work here”
  • “This is all about one group or another”
  • “You’re an elitist bunch of Europeans”


  • Marketing speak
  • Lip service
  • Disenchantment
  • Disenfranchisement


  • “Not about a person, or a team. About changing how your operations team works”
  • Automate away small boring repetitive tasks to make time for interesting activities
  • Embed ops people into dev teams
  • Drag devs to ops standups
  • Build shared appreciation
  • Metrics conversations are really powerful

Configuration Management For Development Environments

I had the pleasure of speaking at FOSDEM last weekend to a packed Configuration and systems management devroom.

My presentation covered some of the same ground as recent blog posts, namely why you should be using virtualisation and config management tools to manage your local development environment.

People even said nice things about it:

@garethr basically has this subject completely covered. He’s even advocating the correct editor. excellent #fosdem talk

All in all another good event. I have notes about some of the other talks I went along to that I’ll try to write up soon.

Using Checkinstall With Virtualenv For Python Deployments

Michael Brunton-Spall wrote last week about some frustrations with packaging and deploying Python web applications. Although his experience was with Python, the problems he describes are the same for Ruby and PHP and a whole host of languages. The following example uses Python, but works equally well for anything else.

Michael has three simple rules for his servers:

  1. they cannot access the internet
  2. they cannot access internal services that are for development
  3. they cannot have compilers / utilities on them

I won’t go into all the reasons for doing this (you can read the blog post linked to above) but these are pretty sensible security precautions.

My approach to this problem would be to use your friendly system packages and a handy tool called Checkinstall to create a deb or rpm. I’m going to use as an example the Eventlet library. This is available on PyPI and one of its dependencies (Greenlets) provides a C extension. The same approach would work for an entire Python web application too. I’m as ever using the apt package management tool but this should work with yum as well.

The first step is to build the package on a build machine. This should be a machine or virtual machine running the same operating system as your production web servers. You might build these packages manually or as part of a continuous integration system. On this machine we’ll need the compilers and development tools:

sudo apt-get install build-essential python-dev python-setuptools checkinstall
sudo easy_install virtualenv

We’ll also create a virtualenv into which we’ll be installing our packages:

sudo virtualenv --no-site-packages /usr/local/environment
source /usr/local/environment/bin/activate

Now, instead of just calling easy_install to install the package, we prefix it with checkinstall.

sudo checkinstall /usr/local/environment/bin/easy_install eventlet

This will prompt for various metadata about the package you want to create, including the name and version of the package. If you’re using this method in the real world you’ll want to decide on a versioning and naming scheme for your packages to avoid clashes with system provided packages. You can also set many of these options from the command line rather than having to manually fill them in each time.
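
One possible scheme is to suffix the upstream name with something that marks the package as yours and to use the build date as the version. The package file name that appears later in this post, eventlet-gareth_20110129-1_i386.deb, looks like it follows that pattern. A small sketch of the convention, where the function and its arguments are made up for illustration and are not part of checkinstall:

```python
from datetime import date


def package_filename(name, suffix, built_on, release, arch):
    """Build a Debian-style file name: <package>_<version>-<release>_<arch>.deb."""
    # The suffix keeps the package from clashing with a distro package
    # that shares the upstream name.
    package = "%s-%s" % (name, suffix)
    version = built_on.strftime("%Y%m%d")
    return "%s_%s-%d_%s.deb" % (package, version, release, arch)


print(package_filename("eventlet", "gareth", date(2011, 1, 29), 1, "i386"))
```

A date based version sorts correctly over time, which is handy if packages are rebuilt by a continuous integration job.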

Once everything has been filled in successfully this should run through, installing eventlet and greenlets and eventually creating a deb or rpm package depending on what platform you’re running on. You should see something like:

Done. The new package has been installed and saved to


 You can remove it from your system anytime using: 

      dpkg -r eventlet-gareth

Now let’s grab that package and take it to one of our front end web servers via a controlled deployment process. That front end web server needs the virtualenv creating but nothing else. So:

sudo apt-get install python-virtualenv
sudo virtualenv --no-site-packages /usr/local/environment

(Now you might be thinking that installing the python-virtualenv package in this way breaks rule 1 above. And you’d be right in most cases, but I’m guessing Michael’s systems team have a local package repo for authorised packages, or alternatively you could download the package to the build machine and push it to the production environment.)

Now install the package we created earlier.

sudo dpkg -i eventlet-gareth_20110129-1_i386.deb

That should throw all the required files into the virtualenv environment we created. No compilers. No calls to internal or external systems. Just move some precompiled binaries and text files to predefined places on disk.

I used a PyPI package as an example. Checkinstall could have been pointed at a custom build file written especially for your own application, one that moves files and folders to where they are needed. Say something that looks like this:

cp -r /home/stage/myapplication /var/www/apps/

Then by running checkinstall against that (or a more complex build file using capistrano or ant or fabric) you can create a package containing your application code and install it into the specified place.

Why Developers Should Care About System Packages

First a bit of background. I’m a software developer (lately in Ruby and a tiny bit of Java, previously in Python, C# and PHP; yes I got around a bit), but have spent enough time looking after production hardware (mainly debian, solaris and recently a bit of RHEL) to have a feel for sysadmin work. I even have friends who are systems administrators. I mainly use a shiny apple laptop for my development work, but I actually execute all the code on Linux virtual machines. The aim of this post is to bridge a divide, not start a flame war about specific tools.

I’m writing this partly to address a tweet I made that in hindsight needed more than 140 characters. Actually a number of my recent tweets have been on the same theme so I should be more helpful. What I’m seeing recently is an increase in the ways I’m being asked to install software and for me at least that’s annoying.

  1. Several projects will ask you to do something like curl http://bit.ly/installsh | sh which downloads a shell script and executes it.
  2. Some will insist I have git installed
  3. A new framework might come with its own package manager

I’m a polyglot programmer (so I shouldn’t care about #3) that uses git for everything (scratch #2) and who writes little bash scripts to make my life easier (exactly like #1). So I understand exactly how and why these solutions appear fine. And for certain circumstances they are, in particular for local development on a machine owned and maintained by one person. But on a production machine and even on my clean and tidy virtual machines none of these cut it for me in most cases.

Most developers I know have only a passing awareness of packaging so I’m going to have an aside to introduce some cool tricks. I think this is one place where sysadmins go wrong, they assume developers understand their job and that they know the various tools intimately.

System Package Tips

I’m going to show examples using the debian tools so these apply to debian and ubuntu distros. RPM and the Yum tool have similar commands too, I just happen to know debs better.

List all installed packages

This one is a bit obvious; it’s probably going to be available in anyone’s home grown package management system. But if you’re installing software by hand using git or a shell script then you can’t even ask the machine what is installed.

dpkg -l
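
Because the output is plain text it is also easy to script against. Here is a small sketch that pulls package names and versions out of dpkg -l style output. The sample lines are made up rather than captured from a real machine, and the parsing assumes the usual layout of a status column followed by name and version.

```python
def installed_packages(dpkg_output):
    """Map package name to version for rows marked 'ii' (installed)."""
    packages = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Header rows and removed packages do not start with the 'ii' status.
        if len(fields) >= 3 and fields[0] == "ii":
            packages[fields[1]] = fields[2]
    return packages


# Illustrative sample, not captured from a real machine.
SAMPLE = """\
||/ Name           Version        Description
+++-==============-==============-============================
ii  lynx           2.8.8dev.5-1   Text-mode WWW Browser
rc  oldpackage     1.0-1          Removed, config files remain
ii  wget           1.12-2.1       retrieves files from the web
"""

print(installed_packages(SAMPLE))
```

On a real host the same function could be fed the text returned by running dpkg -l through subprocess.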

List files from package

I love this one. Have you ever installed a package and wondered where the config files are? You can sort of guess based on your understanding of the OS file system layout but this command is handy.

dpkg -L lynx

Where did that file come from?

Have a file on disk that you’re not sure where it came from? Ask the system package manager. The more everything is installed from packages the more useful this becomes.

dpkg -S /bin/netstat

Unmet dependencies

At the heart of a good package system is the ability to map dependencies and to have unmet dependencies installed as needed. Having tools to query that tree is useful in various places.

apt-cache unmet

Will give you output a little like the following:

Package libdataobjects-sqlite3-ruby1.9.1 version has an unmet dep:
 Depends: libdataobjects-ruby1.9
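
Output in this shape is straightforward to post-process if you want to feed it into a report. A minimal sketch, assuming every entry looks like the two lines above (a Package line followed by indented Depends lines); the function name is invented for the example.

```python
import re
from collections import defaultdict

PACKAGE_LINE = re.compile(r"^Package (\S+) version")
DEPENDS_LINE = re.compile(r"^\s+Depends: (\S+)")


def unmet_dependencies(report):
    """Group missing dependencies by the package that needs them."""
    unmet = defaultdict(list)
    current = None
    for line in report.splitlines():
        package = PACKAGE_LINE.match(line)
        if package:
            current = package.group(1)
            continue
        depends = DEPENDS_LINE.match(line)
        if depends and current:
            unmet[current].append(depends.group(1))
    return dict(unmet)


# The sample is the output shown above.
SAMPLE = """\
Package libdataobjects-sqlite3-ruby1.9.1 version has an unmet dep:
 Depends: libdataobjects-ruby1.9
"""

print(unmet_dependencies(SAMPLE))
```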

What needs upgrading?

The apticron tool can alert you to packages that are now out of date. It’s easy to set it up to email you each day for each host and tell you about packages that need upgrading. Remember that the reason one of these might have an update could be a documented security bug and it becomes even more important to know about it quickly.

apticron report [Fri, 19 Jan 2007 18:42:01 -0800]

apticron has detected that some packages need upgrading on: 

    [ ]

The following packages are currently pending an upgrade:

    xfree86-common 4.3.0.dfsg.1-14sarge3
    libice6 4.3.0.dfsg.1-14sarge3
    libsm6 4.3.0.dfsg.1-14sarge3
    xlibs-data 4.3.0.dfsg.1-14sarge3
    libx11-6 4.3.0.dfsg.1-14sarge3
    libxext6 4.3.0.dfsg.1-14sarge3
    libxpm4 4.3.0.dfsg.1-14sarge3

I’m really not an expert on using debs but even I find these tools useful, and you don’t get the same capabilities when you use anything else.

Good and bad examples

Still here? Good. I’m going to pick on a few pieces of software to give examples of what I mean. All of this software I actively use and think is brilliant earth shattering stuff, I’m not dissing the software so if any fanboys reading can kindly not attack me please, I’m one of you.

RabbitMQ (Erlang)

The nice folk building the RabbitMQ message queue provide downloads of the source code as well as various system packages. Knowing that some people will want to use the latest and greatest version of the application they also host the latest Debian packages in their own package repo with details on their site.

Chef (Ruby)

The Chef configuration management system also provides multiple methods to install their software. For people already using, happy and familiar with it they provide everything as a ruby gem. If you prefer system packages they have those too. They also provide their own deb repo for people to grab the latest software.

Cloudera Hadoop (Java)

Before I found the Cloudera Hadoop packages I remember having great fun manually applying patches to get everything working. Cloudera do exactly the same as the above two developers, namely host their own debs.


RVM is a fantastic way of managing multiple ruby versions and multiple isolated sets of gems. But it’s also probably the first place I saw the install from remote shell script approach.

bash < <( curl http://rvm.beginrescueend.com/releases/rvm-install-head )

I like to do the same things on my development machine as I do in production, and the main problem I have with RVM is that it’s so useful I want it everywhere. I’d prefer if the system wide install had some sort of option to install the rubies from packages rather than compile everything on the machine (meaning you need a full set of compile tools installed everywhere), or that we can automate the creation of the packages using rvm.


You’ll probably find packages for the Solr search server in recent distros. It’s hugely popular predominantly because it’s a fantastic piece of software. But every time I have a look at the system packages I can’t quite get them to work, or they are out of date. I now know my way around Solr setup relatively well and just end up creating my own packages and I’ve spoken to other folk who have done the same. The Solr documentation recommends downloading a zip file to get started and I can’t see any mention of the packages. My guess is the packages aren’t maintained as part of the core development which is a quick way to get them out of sync with current progress.

Enough beating up on my fellow developers

System packages aren’t blameless, I think the culture often seen in debian of splitting the developer from the package maintainer is part of the problem. This manifests in various ways, all negative:

  • Out of date packages. The biggest complaint from developers about system packages is nearly always that they are out of date. Maintainers should more readily release packaging scripts (ideally back to the project) so people can easily roll their own.
  • The documentation around packaging is either fantastic or terrible, depending on what you want to do and who you are. It turns out making your own packages (using something like checkinstall) is actually quite easy.
  • The official debian docs I think focus on the role of package maintainer, rather than trying to push that downstream to the developers. That doesn’t make them bad, it just means we need documentation aimed at a developer just getting started with packaging their software.
  • Developers hosting their own package repository and asking people to point at that is also quite easy. The projects I praised above all do it nicely. But simple attractive documentation is hard to come by.

What to do

First up let’s talk more about the distribution and installation of software. And let’s do that in the spirit of making things better for everyone involved. The ongoing spat between Ruby and Debian people is just counterproductive. This would be a good article if it didn’t lead with:

This system (apt-get) is out-dated and leads to major headaches. Avoid it for Ruby-related packages. We do Ruby, we know what’s best. Trust us.

We need better documentation aimed at developers. I’m going to try and write some brief tutorials soon (otherwise I’d feel like this rant was just me complaining) but I’m not an expert. I’ll happily help promote or collate good material as well. Maybe it already exists and I just can’t find it?

I’m a git user and a big GitHub fan, but one of the features of Launchpad I really like is the Personal Package Archive. This lets you upload source code and have it automatically built into a package. This is specific to Ubuntu but that’s understandable given Launchpad is also operated by Canonical. What I’d like is the same feature in GitHub but that allowed building debs and RPMs for different architectures. Alternatively a webhook based third party that could do the same would be awesome (anyone fancy building one? I might pitch in). The only real advantage of it being GitHub would be it would make packages immediately cool, which hopefully you all now realise that they are.

My Default Recipes For Vagrant Virtual Machines

I’ve written about Vagrant previously and the more I use it the more it impresses me and the more it changes how I work. For those that haven’t yet used vagrant the brief summary is, it’s a way of managing, creating and destroying headless virtualbox virtual machines. So when I’m sat at my computer and I want a new 32 bit virtual machine based on Maverick I just type.

vagrant init maverick32
vagrant up

It has some other magic tricks as well, like automatically setting up NFS shares between the host and guest and allowing you to specify ports to forward in the configuration file. You access the machine via ssh, either using the handy vagrant ssh command or by using vagrant ssh-config to dump the relevant configuration to place in ~/.ssh/config.

I’ve been using virtualisation for a few years, initially purely for testing and experimentation, and then eventually for all my development. I’d have a few VMware images, I’d use snapshots and occasionally rollback, but I very rarely created new virtual machines. It was quite a manual process. With vagrant that’s changing. Every time I start investigating a new tool or new technology or work on a pet project I create a new virtual machine. That way I know exactly what I’m dealing with, and with vagrant the cost of doing that is the 30s waiting for the new machine to boot.

Or rather it would be if I didn’t then have to install and configure the same few things on every machine. Pretty much whatever I might be doing I found myself installing the same things, namely zsh, vim, git and utils like ack, wget, curl and lynx. This is exactly what the provisioning support in vagrant is for, so I set out to use chef to do this for me.

I decided to use a remote tar file for the recipes. I’m not really bothered about managing a chef server just for my personal virtual machines, but I did want to have a canonical source of the cookbooks that wasn’t local to just one of my machines. Plus this means anyone else who shares my opinions about what you want on a new virtual machine can use them too.

My Vagrantfile now looks like this:

Vagrant::Config.run do |config|
  config.vm.box = "maverick32"
  config.vm.provisioner = :chef_solo
  config.chef.recipe_url = "http://cloud.github.com/downloads/garethr/chef-repo/cookbooks.tar.gz"
  config.chef.add_recipe "garethr"
  config.chef.cookbooks_path = [:vm, "cookbooks"]
  config.chef.json.merge!({ :garethr => {
      :ohmyzsh => "https://github.com/garethr/oh-my-zsh.git",
      :dotvim => "https://github.com/garethr/dotvim.git"
  }})
end

You can see the cookbook on GitHub at github.com/garethr/chef-repo. By default it uses the official oh-my-zsh repo and the vim configuration from jtimberman. My own versions are very minor personal preference modifications of those. The Vagrantfile example above shows how you can override the defaults and use your own configs instead if you choose.

One question I was asked about this approach was why I didn’t just create a basebox with all these things installed by default. This would reduce the time taken on first boot as software wouldn’t have to be installed each time. However it would also mean maintaining the baseboxes myself, and as I use different Linux distributions or versions this would be a headache. While doing this and working with vagrant I’ve been thinking about the ecosystem around the tool and I’m planning on writing my thoughts on that subject over the next week or so.

Solr Libraries and Good API Design

I’m a huge Solr fan. Once you understand what it does (it’s a search engine, which means more than you think) and how it works you spot lots of thorny problems that map to its features really well. In my experience it’s also very fast and very stable once installed and set up. Oh, and the community support is great as well.

When I talk to some folks about Solr all they can think about is full text search. The main reason for this I think is a number of poor libraries. I’ve come across lots of Python or Ruby libraries that simply say you don’t have to know anything about Solr, just install this code and you get full text search! This works in the same way as using the default MySQL or Apache configs works: nowhere near as well as if you get your hands dirty even a little. Some of the Ruby gems even ship the Solr jar file in the gem. Now you don’t even need to know Solr exists. You take a generic configuration and run it using a rake task behind which is some unknown Java application server. Good luck debugging that when it goes wrong, that’s one hell of a leaky abstraction.

In better news I’ve now found two excellent Solr libraries, ones that start with the assumption that you know what you’re doing or are happy to learn about the tools you’re using. All you really want from a library is a good API that maps to how you write in that language.

Delsolr (Ruby)

The delsolr API is beautiful. It seamlessly merges the worlds of Ruby and Solr in a way that’s easy to write and easy to guess. It’s also clever: the design accepts that new features might be added to Solr before the library is updated or that the library might not support every use case or option. In these cases you can still pass information through to Solr directly.

Solr’s interface is based around URLs, so any library is really just giving you an interface to creating those URLs. Writing the following in Ruby:

rsp = solr.query('standard',
               :query => '*:*',
               :filters => {:status => 'Active'},
               :facets => [{:field => 'project'}])

Results in the following URL:


If you already know Solr and how to construct URLs for searches by hand you’ll immediately get the Ruby code. You can probably even guess how to pass other params like sort or order.

Another nice touch is that you can use either hashes or Lucene search syntax for each attribute. So:

:filters => {:status => 'Active'}

Is the same as:

:filters => 'status:Active'
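
To make the mapping from library call to URL concrete, here is a rough Python sketch of the same idea. It is not delsolr’s implementation and the URL it prints is not necessarily the one delsolr generates. It only shows how a hash or a Lucene string can end up as the same Solr fq parameter, next to the standard q, facet and facet.field parameters. The Solr address is a hypothetical local one.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def solr_query_url(base_url, query, filters=None, facet_fields=()):
    """Build a Solr select URL from a query, filters and facet fields."""
    params = [("q", query)]
    if isinstance(filters, dict):
        # Hash form: each key/value pair becomes a field:value filter query.
        params += [("fq", "%s:%s" % item) for item in sorted(filters.items())]
    elif filters:
        # String form: already in Lucene syntax, pass it through untouched.
        params.append(("fq", filters))
    if facet_fields:
        params.append(("facet", "true"))
        params += [("facet.field", field) for field in facet_fields]
    return "%s?%s" % (base_url, urlencode(params))


# Hypothetical local Solr address.
BASE = "http://localhost:8983/solr/select"
from_hash = solr_query_url(BASE, "*:*", {"status": "Active"}, ["project"])
from_string = solr_query_url(BASE, "*:*", "status:Active", ["project"])

print(from_hash == from_string)
print(parse_qs(urlparse(from_hash).query))
```

Passing the string form through untouched is what lets a caller use Solr syntax the library has never heard of.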

Sunburnt (Python)

Sunburnt is a Python Solr interface from the nice folks at Timetric. I’ve not had chance to use this library in anger as it was released after I’d done quite a bit of python-solr work in an old job but I’d definitely use it now. The API looks like:

rsp = solr.query('*:*').filter(status='Active').facet_by('project').execute()

It’s based around chaining so again you can probably guess how to make further queries from even this simple example.

Both Sunburnt and Delsolr also support adding documents to the index.


Once you understand facets and the usefulness of filter queries you see lots of places where Solr is useful apart from text search. Lots of ecommerce operations use faceted search interfaces; I’m sure everyone has spent time clicking through nested hierarchies and watching the numbers (showing the number of products) next to the links decrease. You can build these interfaces using SQL but it’s incredibly expensive and gets out of hand quickly. Caching only helps a bit due to the number of permutations in all but the smallest stores or simplest products. It’s a similar problem with tagging: it’s pretty easy to kill your database.
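
If facets are new to you, the idea is simply a count of documents per value of a field, recalculated as filters are applied. The toy Python below shows the concept with made-up product data. Solr does this inside the index, far more efficiently than looping over records, so treat this as an explanation of the numbers next to the links and not as how Solr works internally.

```python
from collections import Counter

# Made-up product data standing in for documents in an index.
PRODUCTS = [
    {"name": "Road bike", "category": "bikes", "colour": "red"},
    {"name": "Mountain bike", "category": "bikes", "colour": "blue"},
    {"name": "Helmet", "category": "accessories", "colour": "red"},
    {"name": "Gloves", "category": "accessories", "colour": "black"},
]


def facet_counts(documents, field):
    """Count how many documents carry each value of field."""
    return Counter(doc[field] for doc in documents)


def apply_filter(documents, field, value):
    """Keep only documents whose field matches value, like a filter query."""
    return [doc for doc in documents if doc[field] == value]


# Counts for the whole catalogue, then again after filtering by colour.
print(facet_counts(PRODUCTS, "category"))
red_only = apply_filter(PRODUCTS, "colour", "red")
print(facet_counts(red_only, "category"))
```

Every combination of filters produces a different set of counts, which is why caching the SQL equivalent only gets you so far.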

But it’s not just things that have the word search in that you can map Solr to. Two good examples are Timetric (from whom the Sunburnt library comes) and the Guardian Content API. Both of these present lots of read data straight from Solr with great success and fewer database killing performance issues. Solr can really be seen as a simple place to denormalise your data, one advantage being that it keeps your database schema clean.

Learning More

Solr could do with better documentation for beginners. The wiki is an excellent reference once you know how to write schema and configuration files but I think the getting started section sacrifices introducing configuration in favour of getting people searching quicker. The example schema and solrconfig files that ship with Solr are also amazingly useful references (officially the best commented XML I’ve ever seen) but also intimidating to beginners. The Drupal community appear to be writing some good docs that fill this gap though, here’s a few links that I’d recommend:

Heroku For...

With the success of Heroku, both in terms of the recent sale and the fact it’s awesome, it was always just a matter of time before other languages and frameworks got into the platform as a service game. Here’s all the ones I know about so far, many of them in or entering beta testing at the moment. Any others I’m missing?

Update: Thanks for all the comments on here and on Hacker News, I’ve updated this list with all the suggestions.





Java (JVM)



Multi Platform

A Vagrant Ecosystem

As mentioned loudly and repeatedly on here and on Twitter I love vagrant. While writing a chef cookbook to bootstrap my virtual machines I started thinking about how things around vagrant could help it be more useful. These might be things I’m going to do, or ideally get involved with others to do. If anyone has any other ideas or suggestions please leave comments, I definitely think this is the time for discussion.


I don’t really want to have to maintain baseboxes but I want access to lots of them. I’m sure some people will want a Ruby on Rails in a box but all I really care about is having access to recent 32 and 64 bit vanilla linux distributions. I want a good source for trusted baseboxes. At the moment the approach is to look on the wiki, then look on the mailing list and then search the web, then create your own (even using VeeWee it’s still a little fiddly). I’ve managed to find good lucid, maverick and debian boxes, but have had problems with centos and a few others. Part of this is the rate of change recently of both vagrant and now VirtualBox (both good things), part of it is the lack of reviews and shared experiences around baseboxes.

What I’d love to see is a single place where anyone can post a link to a basebox and vagrant users can come along and assign metadata about whether it worked and on what hardware, vagrant version, virtual box version, etc. It could even act as a tracker, counting downloads of boxes to gauge popularity.

Templated Vagrantfiles

As mentioned previously I have a chef cookbook I use to bootstrap all my new virtual machines. My process is therefore: vagrant init, make some manual changes to the Vagrantfile (or copy it from elsewhere), vagrant up. I’m lazy and want a nicer way to reuse Vagrantfiles or to script their creation.

I started out thinking that the ability to point the init command at a template and to provide context on the command line might be a good idea. Now I’m wondering whether we just need a command line application which allows for writing or modifying the Vagrantfile? Something like:

vagrant config vm.provisioner=:chef_solo
vagrant config chef.recipe_url=http://cloud.github.com/downloads/garethr/chef-repo/cookbooks.tar.gz
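
The vagrant config command above is only an idea and does not exist. In the meantime, scripting the creation of a Vagrantfile can be as simple as filling in a template. A minimal sketch in Python, using the same settings as the Vagrantfile from the default recipes post; the template text and function name are invented for the example.

```python
from string import Template

# Mirrors the settings from the Vagrantfile in the default recipes post.
VAGRANTFILE = Template("""\
Vagrant::Config.run do |config|
  config.vm.box = "$box"
  config.vm.provisioner = :chef_solo
  config.chef.recipe_url = "$recipe_url"
  config.chef.add_recipe "$recipe"
end
""")


def render_vagrantfile(box, recipe_url, recipe):
    """Fill the template; substitute() raises KeyError if a value is missing."""
    return VAGRANTFILE.substitute(box=box, recipe_url=recipe_url, recipe=recipe)


print(render_vagrantfile(
    "maverick32",
    "http://cloud.github.com/downloads/garethr/chef-repo/cookbooks.tar.gz",
    "garethr",
))
```

Writing the result to a file called Vagrantfile in a new directory would stand in for the vagrant init step and the manual edits that follow it.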

Hosted cookbooks

I dissed the idea of a Ruby on Rails in a box basebox above but I still want to be able to let people more easily share custom configuration for specialist applications. But what I’d prefer would be people sharing packaged cookbooks, a bit like I’ve done for my default virtual machine setup. Again the beauty of this is it’s pretty much just sharing a URL to a tar.gz file. This makes more sense to me at least than random people connecting to my chef server (I shouldn’t know about their machines) and lowers the barrier to entry for those not interested in hosting their own chef server or using the opscode platform for local virtual machines.

I’m also not talking here about just sharing individual cookbooks like cookbooks.opscode.com, but rather a packaged collection of individual recipes designed for a specific purpose. A fully working solr instance, a django application server using apache/mod_wsgi, etc.

Many of the points about baseboxes above would work here too I think. Having a good community resource which points to lots of cookbook tar files. Allowing people to feed back about what works for them. I’ve mainly talked about Chef here as that’s what vagrant initially shipped with, but with the puppet provisioner now ready to go the same would stand for puppet manifests too.