Web application security tools

I’ve become increasingly interested in web application security issues over the last year or so. Working in Government will do that to you. And I’ve come to the conclusion that a) there are lots of good open source security tools, b) many of them are terribly packaged and c) most developers don’t use any of them.

I’ve been having related conversations at recent events I’ve made it along to, including Devopsdays London which featured some good open spaces discussions on the subject. Security is one of those areas that, for many organisations, is basically outsourced to third party penetration testing firms or consultants. Specialists definitely have a role to play, but with a move towards increasingly rapid releases I think in-house security testing and monitoring is going to get more and more important.

A collection of security tools

I’ve started to build a collection of tools on GitHub, along with a vagrant setup to test them out. Full instructions are available on that repository but the short version is you can run one command and have one virtual machine filled with security testing tools and, if useful, another machine running a vulnerable web application with which to test. The current list of tools runs to:

But I’ll add more tools as I discover them or as people file issues or pull requests.

What about Backtrack?

When I started investigating tools for security and penetration testing most roads led to Backtrack. This is a complete Linux distribution packed with a huge number of security tools, including many if not all of the above. Why then did I write Puppet code rather than create a Vagrant box from Backtrack? Firstly, Backtrack is probably great if you’re a professional penetration tester, but the barrier to entry of installing a new distribution is too high for most developers in my view. And with a view to using some of these tools as part of monitoring systems I don’t always want a separate virtual machine. I want to be able to install the tools wherever I want. A good configuration management tool gives you that portability, and Vagrant gives you all the benefits of a local virtual machine.

Future plans

As mentioned I’d like to expand how some of these tools are used to include automated monitoring of applications, maybe look at ways of extracting data for metrics or possibly writing a Sensu plugin or two. The first step to that is probably breaking down the monolithic Puppet manifest into separate modules for each tool. Along the way I can add support for more operating systems as required. I’ve already done that for the wackopicko module which is up on the Forge.

I’m also soliciting any and all feedback, especially from developers who don’t do any security related testing but feel like they should.

Government Service Design Manual

I’ve not been writing many blog posts lately, but I have been doing quite a bit of writing elsewhere. One of the things I’ve had a hand in at work is the new Government Service Design Manual. This is the work of many people I work with as well as further afield. It’s intended to be a good starting place to find information about building high quality digital services.

The manual is in beta and we’re looking for as much feedback as possible on the whole thing. It’s already proving useful and a good way of framing the scope of discussions, but it has lots of room for improvement.

If you’re reading this post I’m going to wager your interest lies in or around devops flavoured content. The following are guides I’ve written in this area that I’d love any and all feedback on.

If you’re interested in the background to this endeavour then a couple of blog posts from some of my colleagues might be of interest too. First Richard Pope talks about how the manual came about and here’s a post from Andrew Greenway about this beta testing of the service standard.

The source for all this is on GitHub so if you prefer you can just send a pull request. Or I’m happy to get emails or comments on this post. In particular, if people have good references or next steps for these guides then let me know, as several of them are lacking in that area.

Perils of portability

I had fun speaking at QCon in London earlier this month with a talk on the Cloud track entitled the Perils of Portability.

This had some Governmenty stuff in but was mainly part rant, part hope for the future of cloud infrastructure. I had some great conversations with people afterwards who felt some of the same pain, which was nice to know. I also somehow managed to get 120 slides into a 40 minute presentation which I think is a personal record.

The videos will be available at some point in the not too distant future too.

Going fast in government

About a month ago I had the good fortune of speaking at the London Web Performance meetup. This was one of the first talks I’ve done about our work at The Government Digital Service since the launch of GOV.UK back in October. The topic was all about moving quickly in a large organisation (The UK Civil Service is about 450,000 people so I think it counts) and featured just a handful of technical and organisational tricks we used.

March madness

With only a week or so to go before the end of February, it’s looking like March might be a little busy.

  • I’m speaking at QCon, in London on Wednesday 6th on Clouds in Government - Perils of Portability (which in hindsight is probably the silliest title for a talk I’ve ever used)
  • On the 15th and 16th of March I’ll be at Devopsdays, again in London. I’ve been helping out with organising the event and I’m very much looking forward to going along after seeing all the work being put in.
  • And last but not least I’m heading to Boston for the rather exciting Monitorama from the 26th until the 30th. Looking forward to meeting up in person with quite a few folks I’ve spoken to over the last year or two.

If you’re going to be at any of these events (QCon and Devopsdays still have tickets available I think) then let me know.

Django and Rails presentation from QCon

I had great fun back in November at the QCon conference in San Francisco. As well as curating one of the tracks and catching up with people in the area I managed to give the following talk.

In hindsight it might have been a bit odd to try and cover both Rails and Django examples in the one presentation but it was quite good fun putting together code examples using both of them at the same time. As well as a large set of tips, tricks and tools I settled on a few things that I think any web (or other) framework should support out of the box.

  • A debug toolbar
  • Transparent caching support
  • Hooks for instrumentation
  • Configurable logging

My personal package repository

I’m a big fan of system packages for lots of reasons and have often ended up rolling my own debian package repository at work, or working with others that have done so. Recently I finally got round to setting up a personal package repo, at packages.garethrushgrove.com. More interesting than the repo is probably the tool chain I used, oh and the rather nice bootstrap based styling.

The source code for everything is on GitHub although not much documentation exists yet. In the middle are a few shell scripts that generate the repo. Around them is a Vagrant box (which makes it easier to build packages for different architectures or distros) and some Rake commands:

<code>bundle exec rake -T
rake recipes:build[recipe]  # Build a package from one of the available recipes
rake recipes:list           # List available recipes
rake repo:build             # Build the repository</code>

The recipes commands allow for building new packages based on scripts. A few examples are included which use fpm, but you could use anything. The repo:build command triggers the debian repository to be rebuilt.
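As a rough illustration of what a recipe script might hand to fpm, here is a small Ruby sketch that assembles the command for packaging a directory as a Debian package. The helper name, the package name and the paths are hypothetical and not taken from the repository; only the fpm flags themselves (`-s`, `-t`, `-n`, `-v`, `-C`, `--prefix`) are real.

```ruby
# Hypothetical recipe helper: builds the fpm argument list for turning a
# directory of files into a Debian package. Names and paths are illustrative.
def fpm_command(name:, version:, source_dir:, prefix: "/usr/local")
  [
    "fpm",
    "-s", "dir",         # source type: a plain directory
    "-t", "deb",         # target type: a Debian package
    "-n", name,          # package name
    "-v", version,       # package version
    "-C", source_dir,    # change into this directory before collecting files
    "--prefix", prefix,  # where the files end up once the package is installed
    "."
  ]
end

command = fpm_command(
  name: "example-tool",
  version: "1.0.0",
  source_dir: "build/example-tool"
)

# Print the command rather than running it; a recipe could call system(*command).
puts command.join(" ")
```

Keeping the command as an array rather than a single string means it can be passed straight to `system` without worrying about shell quoting.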

The Vagrant configuration shares various folders between guest and host, which also opens up a few useful features. One is I can just drop any old debian package into the debs folder and run the repo:build command and it will be in my repository. The other useful capability is that the resulting repo is shared back to the host, which means I can then check it into Git and in my case push it up to Heroku.

On the forge

I’ve been spending a bit of time recently pushing a few Puppet modules to the Forge. This is Puppet Labs’ attempt to make a central repository of reusable Puppet modules. I started doing it as a bit of an experiment, to find out what I liked and what worked, and I decided to write up a few opinions.

So far I’ve shipped the following modules:

Quite a few of these started as forks of other modules but have evolved quite a bit towards being more reusable.

I’ve also started sending pull requests for modules that basically do what I want but don’t always play well with others.

Improved tools

It turns out the experience is mainly a pleasurable one, partly down to the much improved tooling around Puppet. Specifically I’m making extensive use of:

  • Rspec Puppet - for writing tests for module behaviour
  • Librarian Puppet - dependency management for modules
  • Puppet spec helper - conventions and helpers for testing modules
  • Travis CI - easy continuous integration for module code
  • Vagrant - manage virtual machines, useful for smoke testing on different distributions

Lots of those tools make testing Puppet modules both easier and useful. Here’s an example of one of the above modules being tested. Note that it’s run across Ruby 1.8.7, 1.9.2 and 1.9.3 and Puppet versions 2.7.17, 2.7.18 and 3.0.1 for a total of 9 builds. Handily the Redis module mentioned also had a test suite. The pull request includes changes to that, and Travis automatically tested the pull request for the module’s author.
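The build count is just every Ruby version paired with every Puppet version. A tiny Ruby sketch of that arithmetic (this is not the Travis configuration itself, and the variable names are mine):

```ruby
# Versions taken from the paragraph above; variable names are illustrative.
ruby_versions   = %w[1.8.7 1.9.2 1.9.3]
puppet_versions = %w[2.7.17 2.7.18 3.0.1]

# Pair every Ruby version with every Puppet version.
build_matrix = ruby_versions.product(puppet_versions)

build_matrix.each do |ruby, puppet|
  puts "Ruby #{ruby} with Puppet #{puppet}"
end

puts "#{build_matrix.size} builds"
```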

Antipatterns

Using modules from the Forge really forces you to think about reusability. The pull request mentioned above for the Redis module for instance replaced an explicit mention of the build-essential package with the “puppetlabs/gcc” class from the Forge. This makes the module less self contained, but without that change the module is incompatible with any other module that also uses that common package. I also went back and replaced explicit references to wget and build-essential in my Riemann module.

As a rule of thumb, a specific module should only include resources that are unique to the software the module manages. Anything else should be in another module with a dependency in the Modulefile.

This can feel a little much when you’re replacing a simple Package resource with a whole new module but it has two advantages I care about. As well as the ability to use the module with other third party modules more easily it also makes it more likely that the module will work cross platform.
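The incompatibility comes from Puppet refusing to compile a catalog in which the same resource is declared twice. The following is a toy model of that rule in plain Ruby, purely for illustration. It is not Puppet code and not how Puppet is implemented, and the `declare` helper is made up for the example.

```ruby
# Toy model of a catalog in which each resource may only be declared once.
# Plain Ruby for illustration; this is not how Puppet is implemented.
def declare(catalog, resource, module_name)
  if catalog.key?(resource)
    raise ArgumentError, "#{resource} is already declared by #{catalog[resource]}"
  end
  catalog[resource] = module_name
end

catalog = {}
declare(catalog, "Package[build-essential]", "redis")

begin
  # A second module declaring the same package directly conflicts with the first.
  declare(catalog, "Package[build-essential]", "riemann")
rescue ArgumentError => e
  puts "Conflict: #{e.message}"
end
```

Moving the shared package into its own module means it is declared in exactly one place, and everything else just depends on it.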

What’s missing?

I’d like to see a few things improved when it comes to the Forge.

  • I’d like to be able to publish a new version of a module without having to use the web interface. The current workflow involves running a build command, then uploading the generated artifact via a web form after logging in.
  • I’d like to see best practice module development guides front and centre on the Forge. Lots of modules won’t work with other modules and I think that’s fixable.
  • Integration with puppet-lint would be nice, giving some indication of whether the authors care about the Puppet styleguide.
  • A command line search interface would be useful. And turns out to exist. Thanks @a1cy for the heads up.
  • The Forge tracks number of downloads, but as a publisher I don’t know how often my modules have been downloaded.
  • And finally I’d like to see more people using it.

Shipping

Last week we shipped GOV.UK. Over the last year we’ve built a team to build a website. Now we’re busy building a culture too. I’ve got so much that needs writing up about everything we’ve been up to. Hopefully I’ll make a start in the next week or so.

Tale Of A Grok Pattern

I’m all of a sudden adding lots more code to GitHub. Here’s the latest project, grok patterns for logstash. At the moment this repo only contains one new pattern but I’m hoping to add more, and maybe even for others to add more too.

First, a bit of background. Logstash is the excellent, open source, log aggregation and processing framework. It takes inputs from various configurable places, processes them with filters and then outputs the results. So maybe you’ll take inputs from various application log files and output them into an Elasticsearch index for easy searching, or output the same inputs to graphite and statsd to get graphs of rates. One of the most powerful filters in logstash is the grok filter. It takes a grok pattern and parses out information contained in the text into fields that can be more easily used by outputs. This post serves hopefully as both an explanation of why and an example of how you might do that.

The problem

Rails logs are horrible, that is until you install the excellent lograge output formatter. That gives you lines like:

GET /jobs/833552.json format=json action=jobs#show status=200 duration=58.33 view=40.43 db=15.26

This contains loads of useful information that’s easily parsable by a developer. We have the HTTP status code, the rails controller and information about response time too. A grok filter lets us teach logstash about that information too. The working grok filter for filtering this line looks like this:

The solution

LOGRAGE %{WORD:method}%{SPACE}%{DATA}%{SPACE}action=%{WORD:controller}#%{WORD:action}%{SPACE}status=%{INT:status}%{SPACE}duration=%{NUMBER:duration}%{SPACE}view=%{NUMBER:view}(%{SPACE}db=%{NUMBER:db})?%{GREEDYDATA}
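If grok syntax is unfamiliar, it may help to see roughly the same thing as a plain Ruby regular expression with named captures. This is my approximation for illustration only: the character classes are simpler than grok’s real WORD, INT and NUMBER definitions, and the `parse_lograge` helper is not part of the repository.

```ruby
# Rough Ruby translation of the LOGRAGE grok pattern above. The character
# classes only approximate grok's WORD, INT and NUMBER definitions.
def parse_lograge(line)
  pattern = /
    (?<method>\w+)\s*
    .*?\s*
    action=(?<controller>\w+)\#(?<action>\w+)\s*
    status=(?<status>[+-]?\d+)\s*
    duration=(?<duration>\d+(?:\.\d+)?)\s*
    view=(?<view>\d+(?:\.\d+)?)
    (?:\s*db=(?<db>\d+(?:\.\d+)?))?
    .*
  /x
  match = pattern.match(line)
  # Returns a hash of field name to captured text, or nil when nothing matches.
  match && match.named_captures
end

line = 'GET /jobs/833552.json format=json action=jobs#show status=200 duration=58.33 view=40.43 db=15.26'
fields = parse_lograge(line)
puts fields.inspect
```

As in the grok pattern, the db group is optional, so a line without a db time still matches and simply has no value for that field.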

That was worked out pretty much with a bit of trial and error and use of the logstash java binary, using stdin and stdout inputs and outputs. It works, but getting there wasn’t that much fun, and proving it works outside a running logstash setup was tricky. Enter rspec and the grok implementation in pure Ruby. The project above contains an Rspec matcher for use when testing grok filters for logstash. I’ll probably extract that into a gem at some point but you’ll get the idea. Now we can write tests like these:

the lograge grok pattern
  with a standard lograge log line
    should have the correct http method value
    should have the correct value for the request duration
    should have the correct value for the request view time
    should have the correct controller and action
    should have the correct value for db time
  without the db time
    should have the correct value for the request view time
  with a post request
    should have the correct http method value

Finished in 0.01472 seconds
7 examples, 0 failures

The tests themselves are just basic Rspec with most of the work done in the custom matcher. This not only means I can be a bit more confident that my grok pattern works, it also provides a much nicer framework for writing more patterns for other log formats. Parsing rules like this are one area where test driven development is a huge boon in my experience. And with tests comes continuous integration, in this case via Travis.
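To give a flavour of the test driven approach without the custom matcher, here is a minimal table of cases in plain Ruby. It uses a deliberately simplified key=value parser as a stand-in for the grok pattern, so it is a sketch of the idea rather than the Rspec code in the repository, and `parse_pairs` is a made up helper.

```ruby
# Stand-in parser: collects key=value pairs from a lograge-style line.
# A simplification used only to give the checks something to run against.
def parse_pairs(line)
  line.scan(/(\w+)=(\S+)/).to_h
end

with_db    = 'GET /jobs/1.json format=json action=jobs#show status=200 duration=58.33 view=40.43 db=15.26'
without_db = 'GET /jobs/1.json format=json action=jobs#show status=200 duration=58.33 view=40.43'

# Each case: description, input line, field to inspect, expected value.
cases = [
  ["standard line has the request duration", with_db,    "duration", "58.33"],
  ["standard line has the db time",          with_db,    "db",       "15.26"],
  ["line without db time has the view time", without_db, "view",     "40.43"],
  ["line without db time has no db field",   without_db, "db",       nil]
]

# Keep only the cases whose parsed value differs from the expectation.
failures = cases.reject do |_description, line, field, expected|
  parse_pairs(line)[field] == expected
end

failures.each { |description, *_rest| puts "FAILED: #{description}" }
puts "#{cases.size} examples, #{failures.size} failures"
```

Adding a new log format then mostly means adding rows to the table first and adjusting the parsing rule until they pass.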

I’ll hopefully find myself writing more patterns and tests for them, and if anyone wants to send pull requests so we can start collecting working grok patterns together, so much the better.