Information Security Reading List

I read quite a bit (probably a book a week or so) and one of the topics I’ve been reading on for a while is information security. In a recent conversation someone asked for some book suggestions, so I thought I’d write that up in a blog post rather than an email.

Most of this list isn’t particularly technical. It’s not a developers list of software engineering tomes. If you’re a developer or operator then I’d recommend reading some of the more policy or journalistic pieces as well for context. And if you’re just interested in the topic but nor particularly technical I’d skip the security engineering suggestions.

Note that I make no claims about this being a particularly balanced list, it’s biased towards what I find interesting to read. Hopefully you’ll find it interesting too.

Journalism

Understanding why Information Security is important tends to require some context. The following books provide that, with detailed real-world stories of criminal and government activities.

  • The Dark Net - Jamie Bartlett - an excellent personal tale of investigating the hidden side of the internet.
  • Spam Nation - Brian Krebs - everything you wanted to know about how and why Spam works.
  • Countdown to Zero Day - Kim Zetter - a detailed and fast paced description of the Stuxnet attack, and it’s implications.
  • Future Crimes - Marc Goodman - a focus on the criminal possibilities of the modern internet and the internet of things.
  • Worm - Mark Bowden - similar to the excellent tale of Stuxnet above, this is the story of Conficker and how it was discovered

Policy and context

These books are focused more on government policy and nation state threats, and the debate about the rules of war and the internet.

  • Cyber War - Richard Clarke - probably the best description of what cyber war is and isn’t, and some of the geopolitical problems emerging.
  • Cyber War Will Not Take Place - Thomas Rid - a good counter to the above book, with lots more detailed discussion of policy and definition.
  • Inside Cyber Warfare - Jeffrey Carr - really just a run through of current threats, especially organised crime.

Security engineering

  • Security Engineering - Ross Anderson - highly technical and quite epic, but definitely the best security engineering book around.
  • Threat Modelling - Adam Shostack - details descriptions of how and why to conduct threat moddelling, with lots of examples.
  • Data Driven Security - Jay Jacobs and Bob Rudis - nice examples, including code samples, of applying data and statistics tools and practices to security problems.
  • Cloud Security and Privacy - Tim Mather, Subra Kumaraswamy, Shahed Latif - a good book to read for anyone working in AWS, Azure or similar. Good discussion of concerns and compliance approaches in third party environments.
  • The Tangled Web - Michal Zalewski - everything you ever wanted to know about the browser security model
  • Silence on the Wire - Michal Zalewski - described as a field guide to passive reconnaissance and indirect attacks. Good for starting to think about non-obvious security threats

On my reading list

I’ve not read these books yet so can’t recommend them as such, but they both look good additions to the list above.

  • Data and Goliath - Bruce Schneier - a look at the large scale data collection programmes of governments and their implications for everyone.
  • Black Code - Ronald J. Deibert - the story of the Citizen Lab and it’s front line cyber researchers

Acceptance testing MirageOS installs

I’m pretty interested in MirageOS at the moment. Partly because I find the idea behind unikernels interesting and partly because I keep bumping into the nice folks OCaml Labs in Cambridge.

In order to write and build your MirageOS unikernel application you need an OCaml development environment. Although this is documented I wanted something a little more repeatable. I also found and reported a few bugs in the documentation which got me thinking about acceptance testing. I’m not (yet) an OCaml programmer, but infrastructure automation and testing I can do.

Into Puppet

I started out writing a Puppet module to install and manage everything, which is now available on GitHub and on the Forge.

This lets you do something like the following, and have a fully working MirageOS setup on Ubuntu 12.04 or 14.04.

class { 'mirageos':
  user      => 'vagrant',
  opam_root => '/home/vagrant/.opam',
}

Given time, inclination or pull requests I’ll add support for other operating systems in the future.

But how do you know it works?

The module has a small unit test suite, but it’s nice to know test the actual running of Puppet and installation of the software. For this I’ve used Test Kitchen and ServerSpec. This allows for spinning up 2 virtual machines (one for each supported operating system), applying the Puppet manifest and then making some assertions:

The above is simply checking whether certain packages are installed, the PPA is setup correctly and whether mirage and opam can be executed cleanly.

Can it produce a working unikernel?

The above tells us whether the installation worked, but not whether the resulting software allows us to build MirageOS unikernels. For this I used Bats running in the same Test Kitchen setup.

The above configures and builds a simple HTTP server unikernel, and then checks that when run it returns the expected response on the correct port.

Conclusion

I like the separation of concerns above. I can use the Puppet code without the test code, or even swap the Puppet code out for a shell script if I wanted. I could also run the serverspec tests anywhere I want to check state, which is the reason for separating those tests from the one’s building and running a unikernel. Overall the tool chain for ad-hoc infrastructure testing (quick mention of Infrataster too) is really quite powerful and approachable. I’d love to see more software ship with a user-facing test suite for people to verify their installation works.

Automating windows development environments

My job at Puppet Labs has given me an excuse to take a closer look at the advancements in Windows automation, in particular Chocolatey and BoxStarter. The following is very much a work in progress but it’s hopefully useful for a few things:

  • If like me you’ve mainly been doing non-Windows development for a while it’s interesting to see what is possible
  • If you’re starting out with infrastructure development on Windows the following could be a good starting place
  • if you’re an experienced Windows pro then you can let me know of any improvements

All that’s needed is to run the following from a CMD or Powershell prompt on a new Windows machine (you can also visit the URL in Internet Explorer if you prefer).

START http://boxstarter.org/package/nr/url?https://gist.githubusercontent.com/garethr/a1838aa68355a0766de4/raw/d92b41ee9dcad68c079d24c64bac7d1d27cf37c7/garethr.ps1

This launches BoxStarter, which executes the following code:

This takes a while as it runs Windows update and repeatedly reboots the machine. But once completed you’ll have the listed software installed and configured on a newly up-to-date Windows machine.

Docker, Puppet and shared volumes

During one of the openspace sessions at Devopsdays we talked about docker and configuration management, and one of the things we touched on was using dockers shared volumes support. This is easier to explain with an example.

First, lets create a docker image to run puppet. I’m also installing r10k for managing third party modules.

Docker

FROM ubuntu:trusty

RUN apt-get update -q
RUN apt-get install -qy wget
RUN wget http://apt.puppetlabs.com/puppetlabs-release-trusty.deb
RUN dpkg -i puppetlabs-release-trusty.deb
RUN apt-get update

RUN apt-get install -y puppet ruby1.9.3 build-essential git-core
RUN echo "gem: --no-ri --no-rdoc" > ~/.gemrc
RUN gem install r10k

Lets build that and tag it locally. Feel free to use whatever name you like here.

docker build -t garethr/puppet .

Lets now use that image as a base for another image.

FROM garethr/puppet

RUN mkdir /etc/shared
ADD Puppetfile /
RUN r10k puppetfile check
RUN r10k puppetfile install
ADD init.pp /
CMD ["puppet", "apply", "--modulepath=/modules", "/init.pp","--verbose", "--show_diff"]

This image will be used to create containers that we intend to run. Here we’re including a Puppetfile (a list of module dependencies) and then running r10k to download those dependencies. Finally we add a simple puppetfile (this would likely be an entire manifests directory in most cases). The final line means that when we run a container based on this image it will run puppet and then exit.

Again lets build the image and tag it.

docker build -t garethr/puppetshared .

Just as a demo, here’s a sample Puppetfile which includes the puppetlabs stdlib module.

Puppet

mod 'puppetlabs/stdlib'

And again as an example here’s a simple puppet init.pp file. All we’re doing is creating a file at a specific location.

file { '/etc/shared/client':
  ensure => directory,
}

file { '/etc/shared/client/apache.conf':
  ensure  => present,
  content => "not a real config file",
}

Fig

Fig is a tool to declare container types in a text file, and then run and manage them from a simple CLI. We could do all this with straigh docker calls too.

master:
  image: garethr/puppetshared
  volumes:
    - /etc/shared:/etc/shared:rw

client:
  image: ubuntu:trusty
  volumes:
    - /etc/shared/client:/etc/:ro
  command: '/bin/sh -c "while true; do echo hello world; sleep 1; done"'

The important part of the above is the volumes lines. What we’re doing here is:

  • Sharing the /etc/shared directory on the host with the container called master. The container will be able to write to the host filesystem.
  • Sharing a subdirectory of of /etc/shared with the client container. The client can only read this information.

Note the client container here isn’t running Puppet. Here it’s just running sleep in a loop to simulate a long running process like your custom application.

Let’s run the master. Note that this will run puppet and then exit. But with the above manifest it will create a config file on the host.

fig run master

Then run the client. This won’t exit and should just print hello world to stdout.

fig run client

Docker 1.3 adds the handy exec command, which allows for one-off commands to be executed within a running container. Lets use that to see our new config file.

docker exec puppetshared_client_run_1 cat /etc/apache.conf

This should output the contents of the file we created by running the master container.

Why?

This is obviously a very simple example but I think it’s interesting for a few reasons.

  • We have completely separated our code (in the container) from the configuration
  • We get to use familiar tools for managing the configuration in a familiar way

It also raises a few problems:

  • The host needs to know what types of container are going to run on it, in order to have the correct configuration. If you’re using Puppet module then this is simple enough to solve.
  • The host ends up with all of the configuration for all the containers in one place, you could also do things with encrypting the data and having the relevant keys in one image and not others. Given how if you’re on the host you own the container anyway this isn’t as odd as it sounds.
  • We’re just demonstrating files here, but if we change our manifest and rerun the puppet container then we change the config files. But depending on the application it won’t pick that up unless we restart it or create a new container.

Given enough time I may try build a reference implementation using this approach, anyone with ideas about that let me know.

This post was inspired by a conversation with Kelsey and John, thanks guys.

Using Puppet with key/value config stores

I like the central idea behind storing configuration in something like Etcd rather than lots of files on lots of disks, but a few challenges still remain. Things that spring to mind are:

  • Are all your passwords now available to all of your nodes?
  • How do I know when configuration changed and who changed it?

I’ll leave the first of those for today (although have a look at Conjur as one approach to this). For the second, I’m quite fond of plain text, pull requests and a well tested deployment pipeline. Before Etcd (or Consul or similar) you would probably have values in Hiera or Data Bags or similar and inject them into files on hosts using your configuration management tool of choice. So lets just do the same with our new-fangled distributed configuration store.

key_value_config { '/foo':
  ensure   => present,
  provider => etcd,
  value    => 'bar',
}

Say you wanted to switch over to using Consul instead? Just switch the provider.

key_value_config { '/foo':
  ensure   => present,
  provider => consul,
  value    => 'bar',
}

You’d probably move all of that out into something like hiera, and then generate the above resources, but you get the idea.

etcd_values:
  foo: bar

The above is implemented in a very simple proof of concept Puppet module. Anyone with any feedback please do let me know.

Leaving GDS never easy

The following is the email I sent to lots of my colleagues at the Government Digital Service last week.

So, after 3 rather exciting years I’ve decided to leave GDS.

That’s surprisingly difficult to write if I’m honest.

I was part of the team that built and shipped the beta of GOV.UK. Since then I’ve worked across half of what has become GDS, equally helping and frustrating (then hopefully helping) lots of you. I’ve done a great deal and learnt even more, as well as collected arcane knowledge about government infosec and the dark arts of procurement along the way. I’ve done, and helped others do, work I wouldn’t even have thought possible (or maybe likely is the right word?) when I started.

So why leave? For all the other things I do I’m basically a tool builder. So I’m going off to work for Puppet Labs to build infrastructure tools that don’t suck. That sort of pretty specific work is something that should be done outside government in my opinion. I’m a pretty firm believer in “government should only do what only government can do” (design principle number 2 for those that haven’t memorised them yet). And if I’m honest, focusing on something smaller than fixing a country’s civic infrastructure is appealing for now.

I’ll let you in on a secret; I didn’t join what became GDS because of the GOV.UK project. I joined to work with friends I’d not yet had the chance to work with and to see the inside of a growing organisation from the start. I remember Tom Loosemore promising me we’d be 200 people in 3 years! As far as anyone knows we’re 650+ people. That’s about a person a day for 2 years. I’m absolutely not saying that came without a cost, but for me being part of that that was part of the point - so I can be a little accepting with hindsight.

For me, apart from all the short term things (side-note: this job now has me thinking £10million is a small amount of money and 2 years is a small amount of time) there is one big mission:

Make government a sustainable part of the UK tech community

That means in 10 years time experienced tech people from across the country, as well as people straight from university, choosing to work for government. Not just for some abstract and personal reason (though that’s fine too), but because it’s a a genuinely interesting place to work. That one’s on us.

Using OWASP ZAP from the command line

I’m a big fan of OWASP ZAP or the Zed Attack Proxy. It’s suprisingly user friendly and nicely pulls of it’s aim of being useful to developers as well as more hardcore penetration testers.

One of the features I’m particularly fond of is the aforementioned proxy. Basically it can act as a transparent HTTP proxy, recording the traffic, and then analyse that to conduct various active security tests; looking for XSS issues or directory traversal vulnerabilities for instance. The simplest way of seeding the ZAP with something to analyse is using the simple inbuilt spider.

So far, so good. Unfortunately ZAP isn’t designed to be used from the command line. It’s either a thick client, or it’s a proxy with a simple API. Enter Zapr.

Zapr is a pretty simple wrapper around the ZAP API (using the owasp_zap library under the hood). All it does is:

  • Launch the proxy in headless mode
  • Trigger the spider
  • Launch various attacks against the collected URLs
  • Print out the results

This is fairly limited, in that a spider isn’t going to work particularly well for a mor interactive application, but it’s a farily good starting point. I may add different seed methods in the future (or would happily accept pull requests). Usage wise it’s as simple as:

zapr --summary http://localhost:3000/

That will print you out something like the following, assuming it finds an issue.

+-----------------------------------+----------+----------------------------------------+
| Alert                             | Risk     | URL                                    |
+-----------------------------------+----------+----------------------------------------+
| Cross Site Scripting (Reflected)  | High     |http://localhost:3000/forgot_password   |
+-----------------------------------+----------+----------------------------------------+

The above alert is taken from a simple example, using the RailsGoat vulnerable web application as a scape goat. You can see the resulting output from Travis running the tests.

Zapr is a bit of a proof of concept so it’s not particularly robust or well tested. Depending on usage and interest I may tidy it up and extend it, or I may leave it as a useful experiment and try and finally get ZAP support into Gauntlt, only time will tell.

Consul, DNS and Dnsmasq

While at Craft I decided to have a quick look at Consul, a new service discovery framework with a few intersting features. One of the main selling points is a DNS interface with a nice API. The Introduction shows how to use this via the dig command line tool, but how do you use a custom internal DNS server without modifying all your applications? One answer to this question is Dnsmasq.

I’m not explaining Consul here, the above mentioned introduction does a good job of stepping through the setup. The following assumes you have installed and started consul.

Installation and configuration

I’m running these examples on an Ubuntu 14.04 machine, but dnsmasq should be available and packaged for lots of different operating systems.

apt-get install dnsmasq

Once installed we can create a very simple configuration.

echo "server=/consul/127.0.0.1#8600" > /etc/dnsmasq.d/10-consul

All we’re doing here is specifying that DNS requests for consul services are to be dealt with by the DNS server at 127.0.0.1 on port 8600. Unless you’ve changed the consul defaults this should work.

Just in case you prefer Puppet their is already a handy dnsmasq module. The resulting puppet code then looks like this.

include dnsmasq
dnsmasq::conf { 'consul':
  ensure  => present,
  content => 'server=/consul/127.0.0.1#8600',
}

Usage

The examples from the main documentation specify a custom DNS server for dig like so:

dig @127.0.0.1 -p 8600 web.service.consul

With Dnsmasq installed and configured as above you should just be able to do the following:

dig web.service.consul

And now any of your existing applications will be able to use your consul instance for service discovery via DNS.

Testing Vagrant runs with Cucumber

I’ve been a big fan of Vagrant since it’s initial release and still find myself using it for various tasks.

Recently I’ve been using it to test collections of Puppet modules. For a single host vagrant-serverspec is excellent. Simply install the plugin, add a provisioner and write your serverspec tests. The serverspec provisioner looks like the following:

config.vm.provision :serverspec do |spec|
  spec.pattern = '*_spec.rb'
end

But I also found myself wanting to test behaviour from the host (serverspec tests are run on the guest), and also wanted to write tests that checked the behaviour of a multi-box setup. I started by simply writing some Cucumber tests which I ran locally, but I decided I wanted this integrated with vagrant. Enter vagrant-cucumber-host. This implements a new vagrant provisioner which runs a set of cucumber features locally.

config.vm.provision :cucumber do |cucumber|
  cucumber.features = []
end

Just drop your features in the features folder and run vagrant provision. If you just want to run the cucumber features, without any of the other provisioners running you can use:

vagrant provision --provision-with cucumber

Another advantage of writing this as a vagrant plugin is that it uses the Ruby bundled with vagrant, meaning you just install the plugin rather than faff about with a local Ruby install.

A couple of other vagrant plugins that I’ve used to make the testing setup easier are vagrant-hostsupdater and vagrant-hosts. Both help with managing hosts files, which makes writing tests without knowing the IP addresses easier.

Buy vs Build your Monitoring System

At the excellent London Devops meetup last week I asked what was apparently a controversial question:

should you just use software as a service monitoring products rather than integrate lots of open source tools?

This got a few people worked up and I promised a blog post.

Note that I wrote a post listing lots of open source monitoring tools not that long ago. And I’ve been to both the Monitorama events about open source monitoring. And have a bunch of Puppet modules for open source monitoring tools. I’m a fan of both open source and of open source monitoring. Please don’t read this as an attack on either, and particularly on the work of awesome people working on great open source monitoring products.

Some assumptions

  1. No one product exists that does everything. I think this is true for SaaS as much as for open source.
  2. Lets work with about 200 hosts. This is a somewhat arbitrary number I know, some people will have more and others less.
  3. If it saves money we’ll pay yearly, rather than monthly or hourly.
  4. We could probably get some volume discounts from some of the suppliers, but we’ll use list prices for this post.

Show me the money

So what would it cost to get up and running with a state of the art software as a service monitoring system? In order to do this we need to choose our software. For this post that means I’m going to pick products I’ve used (sometimes only a bit) and like. This isn’t a comprehensive study of all the alternatives I’m afraid - though feel free to write your own alternative blog posts.

  • New Relic provides a crazy amount of data about the running of both your servers and your applications. This includes application performance data, errors, low level metrics and even rolled up method or database query performance. $149 per host per month for our 200 hosts gives us $29,800 per month.

  • Librato Metrics provides a fantastic way of storing arbitrary time series data. We’re already storing lots of data in New Relic but Metrics provides us with less opinionated software so we can use it for anything, for instance number of logins or searches or other business level metrics. We’ll go for a plan with 200 data sources, 100 metrics each and at 10 second resolution for a cost of $3,860 per month.

  • Pagerduty is all about the alerts side of monitoring. Most of the other SaaS tools we’ve chosen integrate with it so we can make sure we get actionable emails and SMS messages to the right people at the right time. Our plan costs $18 per person per month, so lets say we have 30 people at a cost of $540 per month.

  • Papertrail is all about logs. Simple setup your servers with syslog and Papertrail will collect, analyze and store all your log messages. You get a browser based interface, search tools and the ability to setup alerts. We like lots of logs so we’ll have a plan for 2 weeks of search, 1 year archive and 100GB month of log traffic. That all costs $575 per month.

  • Sentry is all about exceptions. We could be simply logging these and sending them to Papertrail but Sentry provides tools for tracking and rolling up occurences. We’ll go for a plan with 90 days of history and 200 events per minute at a cost of $199 a month.

  • Pingdom used to provide a very simple external check service, but now they have added more complex multistage checks as well as real user monitoring to the basic ping. We’ll choose the plan with 250 checks, 20 Real User Monitoring sites and 500 SMS alerts for $107 a month.

How much!

In total that all comes to $35,080 (£20,922) per month, or $420,960 (£251,062) per year.

Now the first reaction of lots of people will be that’s a lot of money and it is. But remember open source isn’t free either. We need to pay for:

  • The servers we run our monitoring software on
  • The people to operate those servers
  • The people to install and configure our monitoring software
  • The office space and other costs of employing people (like management and hiring)

I think people with the ability to build software tend to forget they are expensive, whether as a contractor or as a full time member of staff. And people without management experience tend to forget costs like insurance, rent, management overhead, recruitment, etc.

And probably more important than these for some people we need to consider:

  • The time taken to build a good open source monitoring system

The time needed to put together a good monitoring stack based on for instance logstash, kibana, riemann, sensu, graphite and collectd isn’t small. And don’t forget the number of other moving parts like redis, rabbitmq and elasticsearch that need installing configuring and maintaining. That probably means compromising in the short term or shipping later. In a small team how core is building your monitoring stack to what you do as a business?

But I can’t use SaaS

For some people, using a software as a service product just isn’t going to cut it. Here’s a list of reasons I can think of:

  • Regulation constrains where your data can be stored, for instance it’s not allowed out of the country
  • Sheer size of infrastructure, although you may be able to get a volume discount it might not be enough

I think everything else is a cost/benefit issue or personal preference (or bias). Happy to add more to that list, but I don’t think it’s a very long list.

Conclusions

I’ve purposefully not talked about the quality of the tools here, just the cost. I’ve also not mentioned that it’s likely not an all or nothing decision, lots of people will mix SaaS products and open source tools.

Whether taking a SaaS approach will be quicker, cheaper or better will depend on your specific business context. But try and make that about the organisation and not about the technology.

If you’ve never used the current crop of SaaS monitoring tools (and not just the one’s mentioned above) then I think you’re missing out. Even if you stick with a mainly open source monitoring stack you might look at your tools a bit differently after you’ve experimented with some of the commercial competition.