Jabber, Erlang, Debugging. Things I'm playing with at the moment

I’m busy experimenting with various blogging approaches at the moment, hence the short links I’ve been posting recently. Another type of post I thought I’d try is the list of interesting things. I find this sort of thing strangely cathartic: if nothing else, by writing down the things I’m thinking about I won’t forget to spend time playing with them.

  • I still need to play around some more with Jabber/XMPP. I tried a little while ago to install a server locally on my Mac with only some success. I’ve now got a load of Linux virtual images handy which might make that easier. What I would really like to see is a dirt simple XMPP server that you can use for local development. Something like Morbid. I’m just not sure I have the time to build one.
  • Which brings me on to virtualisation. I’m more convinced than ever that sandboxing different local environments is a good idea. I now have a stack of VMware images set up for configuring.
  • It turns out Yahoo used Erlang under the hood for some part of the new Delicious. I’m still pretty interested in actually doing something with Erlang, though quite what I’m not yet sure. I’ve been kicking round an idea for ages involving logging which might fit the bill.
  • I talked at barcampbrighton recently about debugging tools for Django (and more broadly any development environment) that you need once you have a reasonable sized team. I’ve been busy packaging some of the tools I mentioned. I’ll turn them into an article somewhere or a series of blog posts, and hopefully get some code released on Google Code some time soon.
  • The Sphinx documentation system used by Django for the 1.0 documentation is pretty nifty. I’m investigating it for use at work at the moment.
  • I’ve been a huge fan of Textile since I used Textpattern years ago for an earlier incarnation of this site. But recently I’ve had a few niggling issues around extensibility and slight differences in implementation. So I’m pondering using reStructuredText for my writing duties. It appears to be more powerful, more flexible and inherently extensible.

Google on Testing

Interesting testing and coverage reporting write-up by the Google engineers behind Update Engine. This is the sort of thing we keep discussing in the office.

Headless VMware Fusion

The latest version of VMware Fusion lets you run virtual machines in headless mode, which is pretty handy if you’re using a Linux VM to mirror your live environment. The strange thing is that it’s not enabled by default. To enable it you need to run the following on your console:

<code>defaults write com.vmware.fusion fluxCapacitor -bool YES</code>

Django Admin Options

Working on a decent sized Django project at work means I’ve found myself delving into Django’s admin interface more than a few times. Although it’s always possible to just use a custom template and do everything yourself, it’s nearly always easier and often quicker to use the generated admin views. One of the problems with that is, even with the recent 1.0 release, some of the options are not that well documented outside the source code or in posts buried on mailing lists.

I’ll assume a little bit of familiarity with the new-forms-admin way of doing things which is now the default in Django 1.0. If you’re just getting started with building Django sites then you might want to first have a look at a tutorial or two. It’s quite different to the examples found in the original Django book or older online tutorials but it’s also much more powerful and flexible with a better separation of concerns.

We’ll start off with a very simple model in models.py which defines an Article class with a couple of fields.

pre. from datetime import datetime
from django.db import models
class Article(models.Model):
    title = models.CharField(max_length=200)
    content = models.TextField()
    # default to the current date and time when no date is given
    publish_date = models.DateTimeField(default=datetime.today)

Django 1.0 introduced the concept of admin autodiscovery. By placing your admin declarations in admin.py in an application (most likely next to models.py and views.py) you can tell Django to find these automatically. To enable auto loading of admin modules you can add the following to your urls.py.

pre. from django.contrib import admin
admin.autodiscover()

This will load the module admin.py for each of the apps in the installed apps list. Now let’s add an admin class in your admin.py to go with the above models.py. We’ll call it ArticleAdmin:

pre. from django.contrib import admin
from models import Article
class ArticleAdmin(admin.ModelAdmin):
    pass
admin.site.register(Article, ArticleAdmin)

The important line is the last one, in which we register the admin for the Article class. This will display the relevant admin views in the Django admin for that model - allowing us to add new articles, list existing ones and delete old ones. But by default the admin is quite sparse.

Once we have a few articles in the system we’ll find it hard to find them again. Let’s add a few more lines to our admin.py file:

pre. class ArticleAdmin(admin.ModelAdmin):
    list_display = ('title', 'publish_date')
    ordering = ['-publish_date']
    list_per_page = 25
    search_fields = ['title', 'content']
    date_hierarchy = 'publish_date'

Let’s step through each of these statements and see what we’ve done:

  • setting list_display for the title and publish_date means these two fields will appear in the changelist. This is the view you get when you hit Articles in the admin and allows you to find the article you are looking for.
  • ordering is self explanatory, in that we choose to order the items in the changelist by the publish_date rather than the auto generated numeric id.
  • list_per_page is another straightforward option, setting the maximum number of articles to show in the changelist before the list starts paging over multiple pages.
  • search_fields adds a simple search to the changelist, the fields specified set which fields to search; title and content in this case.
  • date_hierarchy is great when you have a date associated with an object. This outputs a separate filter list which displays the years by which to filter. The option you pass to this setting is the field name which stores the date.

The simple example above hopefully demonstrates the ease with which the admin can be configured. Knowing about these capabilities already built into Django can save you quite a bit of time when it comes to producing production ready admin interfaces. For all but the more complex systems this should suffice. Below is a table of the Django admin options I’ve been using. If anyone has any more let me know and I’ll add them here, along with a brief description.

|_. Option |_. Description |
| model | Set the model for which this is the admin |
| form | Set the form class if one has been created |
| list_display | Set which fields should appear in the changelist view |
| list_filter | Set which fields should be used to provide a filter in the changelist view |
| raw_id_fields | Useful when you have a Foreign Key on another model with lots of records. This changes the default interface from a select box to a custom widget |
| ordering | Specify the order of the objects in the changelist |
| fieldsets | Fieldsets allow for control over the changeform view, setting which fields to display and whether to separate them out into individual fieldsets. Worth investigating |
| save_on_top | If you have a long form it’s useful to be able to display the save buttons at the top as well as the bottom |
| date_hierarchy | Add date based filtering to the changelist view |
| radio_fields | Another alternative widget for Foreign Key fields, this time using radio buttons. Useful for fixed small lists of objects |
| list_per_page | How many objects to list per page on the changelist view |
| search_fields | Enable search for the model and specify which fields to search |
| prepopulated_fields | Some fields might be prepopulated based on the user entering text into another field. This is often used to prepopulate slugs based on the title of an object |
| filter_horizontal | The default widget for many to many fields is the rather shoddy multiple select box. Filter horizontal enhances this with some super javascript, making it much more usable. Never use many to many fields without this or filter_vertical |
| filter_vertical | Does exactly the same as filter_horizontal, except the filter lists appear one above the other rather than side to side. Useful for thinner admin views |
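To give a feel for the shape of the values some of these options take, here is a rough sketch. It deliberately avoids importing Django so it runs on its own; in a real project these attributes would sit on the ArticleAdmin class. The slug and tags fields are hypothetical (the Article model above doesn’t have them), and fieldset_fields is just a helper I made up to show how the fieldsets structure nests.

```python
# Option values only, with no Django import, so the sketch runs on its own.
# In a project these attributes would sit on an admin.ModelAdmin subclass.
# 'slug' and 'tags' are hypothetical fields the Article model above lacks.
class ArticleAdminOptions(object):
    list_display = ('title', 'publish_date')
    list_filter = ('publish_date',)
    search_fields = ['title', 'content']
    date_hierarchy = 'publish_date'
    save_on_top = True
    # Fill in the slug from the title as the user types.
    prepopulated_fields = {'slug': ('title',)}
    # Friendlier widget for a many to many field.
    filter_horizontal = ('tags',)
    # Each fieldset is a (heading, options) pair; None means no heading.
    fieldsets = (
        (None, {'fields': ('title', 'slug', 'content')}),
        ('Publishing', {'fields': ('publish_date', 'tags'),
                        'classes': ('collapse',)}),
    )


def fieldset_fields(fieldsets):
    """Return every field name mentioned in a fieldsets definition, in order."""
    names = []
    for heading, options in fieldsets:
        names.extend(options['fields'])
    return names


print(fieldset_fields(ArticleAdminOptions.fieldsets))
```

The helper is only there to make the nesting visible: a fieldsets value is a sequence of pairs, and the field names live under the 'fields' key of each options dictionary.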

As you can see you can customise the default admin views a great deal even without creating your own templates and defining custom admin views. The best part is still that as well as being useful for demonstrations and prototypes these interfaces are useful on a live production site. Quite an achievement I think.


Open Microblogging looks pretty interesting. An open standard built upon other open standards for the purpose of passing information between micro blogging services like Twitter or Facebook.


Imified looks like an interesting way of getting started with using instant messaging bots in your applications.

Using Python and Stompserver to Get Started With Message Queues

Message Queues are cool. It’s official. Now, banks and financial institutions have been using big Enterprise Java message systems for years. But it’s only really over the last year or two that the web community at large have got interested. Wonder what all the interest is in Erlang, Scala or Haskell? Distributed systems and a lack of shared state - hopefully leading to some sort of scalability nirvana - that’s what.

Matt Biddulph of Dopplr has spoken at varying levels of technical detail on the subject over the last year or so, at barcamps and more recently at dconstruct. But you still don’t find that many people actually starting to use any of this stuff. Looking around the internet I couldn’t find that many examples of how to get started. There are some pretty mature standards, good libraries and server interoperability, but few tutorials aimed at people who don’t know all about it.

The first problem is looking for a simple use case that most developers will have experienced problems with. The example I like to give is sending email. If you have a simple form on your site that sends email you probably just submit the request to the backend, it sends the email and then renders the success page back to the user.

The problem here comes with scale. How many connections can your mailserver sustain? How many emails can you send from it before you start looking like you’ve been turned into a spam factory? At what point does the time taken for the mail server to respond to the web server cause the web server to time out, or respond so slowly that the user has left or pressed refresh? If you’re sending lots of emails you need to think about this sort of stuff. For your average site this might not be a problem, but for the newer breed of applications or social networks this might bite you sooner than you think.

You can gain more control over this process by introducing a message queue. Submitting the form simply adds a work task to the queue. A listener reads from the queue and sends the email. The advantage comes when you realise that by removing the rendering of the page from the same process as sending the email you can throttle the system without affecting page rendering time.
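Before bringing in a real queue server, the idea can be sketched with nothing but Python’s standard library. This is only an in-process stand-in, not how you’d do it in production: handle_form_submission, email_worker and the example.com addresses are names I’ve invented for illustration, and the "sending" is faked by appending to a list.

```python
import queue
import threading
import time

# Stand-in for the message queue; a real setup would use a separate server.
email_queue = queue.Queue()
sent = []  # records what the fake mail sender has "sent"


def handle_form_submission(address, subject, body):
    """What the web request does: enqueue the task and return straight away."""
    email_queue.put({'to': address, 'subject': subject, 'body': body})
    return 'success page'


def email_worker(delay=0.01):
    """What the listener does: take tasks off the queue at a throttled rate."""
    while True:
        task = email_queue.get()
        if task is None:  # sentinel telling the worker to stop
            email_queue.task_done()
            break
        sent.append(task['to'])  # a real worker would send the email here
        time.sleep(delay)  # throttle so the mail server is not flooded
        email_queue.task_done()


worker = threading.Thread(target=email_worker)
worker.start()

for n in range(3):
    handle_form_submission('user%d@example.com' % n, 'Hello', 'Thanks for signing up')

email_queue.put(None)
email_queue.join()  # wait until the worker has processed everything
worker.join()
print(sent)
```

The point to notice is that handle_form_submission returns immediately however slow the worker is. Changing the delay alters how fast email goes out, not how fast the page renders.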

So onto a simple working example. I’ve decided to use Python as that’s my language of choice at the moment. It’s also easy to read in a pseudo code sort of way. Writing these examples using equivalent libraries in Ruby or PHP should be straightforward enough. As for the message queue itself I’ve opted for stompserver which is available as a Ruby gem. So assuming you have Ruby and gem installed (good instructions for this on the Rails wiki) you can just run:

<code>sudo gem install stompserver</code>

Starting the queue is as simple as running:


This will get you up and running quickly. Stompserver has a number of arguments you can pass in to use different ports or backends but for the purposes of getting started it’s enough to just run it. This ease of use is the thing I love about stompserver. ActiveMQ is something of a tricky beast to set up, though you might want to use that in a production environment.

So now we have the server up and running we can get on with talking to it. I used the Python stomp.py library to deal with the heavy lifting. All the other modules are in the standard library. There are equivalents for PHP and Ruby available as well.
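It helps to know roughly what the library is doing on your behalf. As I understand the STOMP 1.0 protocol, every frame sent over the socket is plain text: a command, some header lines, a blank line, the body and a terminating NUL byte. The build_frame function below is my own illustration of that layout, not part of stomp.py, which builds frames for you.

```python
def build_frame(command, headers, body=''):
    """Build a STOMP 1.0 frame: command, header lines, blank line, body, NUL."""
    lines = [command]
    for key, value in headers.items():
        lines.append('%s:%s' % (key, value))
    return '\n'.join(lines) + '\n\n' + body + '\x00'


# A SEND frame addressed to the test queue used in this post.
frame = build_frame('SEND', {'destination': '/queue/test'}, 'test message 1')
print(repr(frame))
```

Because the format is this simple you can even poke at a server by hand with telnet, which is handy when debugging.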

The first script is a listener. Its job is to listen for activity on the queue and then act upon it. You pass the script an argument of the name of the queue to listen to.

<code>./stomp_listen.py /queue/test</code>

This example simply prints the messages from the queue to the console, but in reality the on_message handler would be where you act upon the message received. In our email example above it would be where you parse out the email address, subject line and message and actually send the email.
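As a rough idea of what that might look like for the email example, here is a sketch of a listener that turns each message into an email object. I’m assuming the message body is JSON with to, subject and body keys, which is my own choice of format rather than anything stompserver requires. The build_email helper, the EmailListener class and the noreply address are likewise made up for illustration, and the emails are collected in a list rather than actually sent.

```python
import json
from email.message import EmailMessage


def build_email(message, sender='noreply@example.com'):
    """Turn a queued JSON message body into an EmailMessage ready to send."""
    task = json.loads(message)
    email = EmailMessage()
    email['From'] = sender
    email['To'] = task['to']
    email['Subject'] = task['subject']
    email.set_content(task['body'])
    return email


class EmailListener(object):
    """Listener using the on_error and on_message handler names from this post."""

    def __init__(self):
        self.outbox = []  # a real listener would hand these to smtplib

    def on_error(self, headers, message):
        print('received an error %s' % message)

    def on_message(self, headers, message):
        self.outbox.append(build_email(message))


# Simulate the queue delivering one message to the listener.
listener = EmailListener()
listener.on_message({}, json.dumps({
    'to': 'user@example.com',
    'subject': 'Welcome',
    'body': 'Thanks for signing up',
}))
print(listener.outbox[0]['To'], listener.outbox[0]['Subject'])
```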

Stompserver currently exposes a queue for monitoring the queue server at /queue/monitor. You can use this script to subscribe to that queue and get information about the current state of the server. It will tell you which queues currently have items in them and if these are currently being processed.

You can run multiple instances of this script subscribing to a single queue. This is one of the real advantages of message based systems: two listeners should clear a queue in roughly half the time. This sort of horizontal scaling is hugely useful as you grow a site or application.

pre. #!/usr/bin/python
import time
import sys
import logging
import socket
import stomp
# the stomp module uses logging so to stop it complaining
# we initialise the logger to log to the console
logging.basicConfig()
# first argument is the queue path
queue = sys.argv[1]
# defaults for local stompserver instance
hosts = [('localhost', 61613)]
# we want the script to keep running
def run_server():
    while 1:
        time.sleep(20)
class listener(object):
    """define the event handlers"""
    # if we receive an error from the server
    def on_error(self, headers, message):
        print('received an error %s' % message)
    # if we receive a message from the server
    def on_message(self, headers, message):
        print('received a message %s' % message)
# do we have a connection to the server?
connected = False
while not connected:
    # try and connect to the stomp server
    # sometimes this takes a few goes so we try until we succeed
    try:
        conn = stomp.Connection(host_and_ports=hosts)
        # register our event handler above
        conn.add_listener(listener())
        conn.start()
        conn.connect()
        # subscribe to the named queue
        conn.subscribe(destination=queue, ack='auto')
        connected = True
    except socket.error:
        pass
# we have a connection so keep the script running
if connected:
    run_server()
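The claim about running several listeners against one queue is easy to try out without a server. The sketch below uses threads and the standard library queue as a stand-in for separate listener processes and stompserver; the listen function and the listener names are invented for the example. Each message ends up with exactly one listener, which is the behaviour you want from a work queue.

```python
import queue
import threading

work = queue.Queue()
handled = {}  # listener name -> list of messages it processed
lock = threading.Lock()


def listen(name):
    """Pull messages off the shared queue until told to stop."""
    while True:
        message = work.get()
        if message is None:  # sentinel: one is sent per listener
            work.task_done()
            break
        with lock:
            handled.setdefault(name, []).append(message)
        work.task_done()


listeners = [threading.Thread(target=listen, args=('listener-%d' % n,))
             for n in range(2)]
for thread in listeners:
    thread.start()

for n in range(10):
    work.put('message %d' % n)
for _ in listeners:
    work.put(None)

work.join()
for thread in listeners:
    thread.join()

all_handled = sorted(m for messages in handled.values() for m in messages)
print(len(all_handled))
```

How the ten messages are split between the two listeners varies from run to run, but none are lost and none are handled twice.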

The second script allows us to send messages to the queue:

<code>./stomp_send.py /queue/test "test message 1"</code>

The script takes a couple of arguments, the first one is the name of the queue, the second is the message you want to send.

pre. #!/usr/bin/python
import time
import sys
import logging
import socket
import stomp
# the stomp module uses logging so to stop it complaining
# we initialise the logger to log to the console
logging.basicConfig()
# first argument is the queue
queue = sys.argv[1]
# second argument is the message to send
message = sys.argv[2]
# defaults for local stompserver instance
hosts = [('localhost', 61613)]
# do we have a connection to the server?
connected = False
while not connected:
    try:
        # connect to the stompserver
        conn = stomp.Connection(host_and_ports=hosts)
        conn.start()
        conn.connect()
        # send the message
        conn.send(message, destination=queue)
        # disconnect from the stomp server
        conn.disconnect()
        connected = True
    except socket.error:
        pass

Both these scripts are pretty simple examples. In the real world you would probably want to make them a little more robust and user friendly. Both could probably do with checking they have the relevant arguments and providing help information if you run them without. I’d also probably move the hosts into a config file as they’re currently hardcoded into the scripts. I’ve also not tested them with other stomp compatible servers like ActiveMQ. In theory they should work fine assuming stomp.py works as advertised.
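For the argument checking, something along these lines would probably do for the send script. It’s a sketch using argparse from the standard library, and rather than a config file it exposes the host and port as optional flags. The --host and --port flag names and the parse_args helper are my own invention, not something the scripts above already support.

```python
import argparse


def parse_args(argv=None):
    """Parse the send script's arguments, printing help if they are missing."""
    parser = argparse.ArgumentParser(
        description='Send a message to a queue on a stomp server.')
    parser.add_argument('queue', help='queue path, for example /queue/test')
    parser.add_argument('message', help='text of the message to send')
    # Hypothetical flags replacing the hardcoded hosts list.
    parser.add_argument('--host', default='localhost',
                        help='stomp server hostname')
    parser.add_argument('--port', type=int, default=61613,
                        help='stomp server port')
    return parser.parse_args(argv)


# Passing a list here stands in for the real command line.
args = parse_args(['/queue/test', 'test message 1'])
hosts = [(args.host, args.port)]
print(args.queue, args.message, hosts)
```

In the script itself you would call parse_args() with no arguments so it reads the real command line, and running it without a queue or message then prints a usage message and exits.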

Overall, it’s surprisingly easy to get started with message queues. If you’ve been hearing about the advantages of distributed message based architectures but assumed you had to be Matt Biddulph to use them, think again.

Django Powered

A short break from blogging ends with a new site design. As with all these things there will no doubt be a few kinks still to work out and I’ll be adding to the design a little over the coming months. The main reason for all this change? A move to a custom CMS built using Django. This was something of an excuse to play around with Django outside work and I’m pretty happy with the results. I have all the bits of WordPress I actually used, plus a few bits I didn’t have before. More importantly I have something I want to hack on. WordPress is a kick ass blogging tool, but keeping it updated or adding new features never seemed much like fun.

It seems like it’s the week for Django related site launches in the UK. We released the new Capital Radio site to the world last week and Nat beat me by a day or so with her new Django powered site. Look for more in the future too, I would wager.

I have a whole range of posts brewing about what I found out along the way, both building a personal blog and building a large site with a big team. Keeping up with the bleeding edge of Django ahead of the 1.0 release took some doing (I’m running the latest Trunk release here at the moment). Deciding to use a combination of Spawning and Nginx for serving is a nice break from Apache as well. But that is all for later when I have a little more time.

Of Hacking, Continuous Integration and Django

I’ve not written anything here for a good few weeks, my tweeting has slowed down some and I’m behind on my feed reading. I’m going to blame the new job and the daily commute from Cambridge to London I think. I’ve definitely not been any less busy than usual:


We had an internal hackday on Thursday and Friday last week where lots of us over at GCap/Global downed tools and built cool stuff for a couple of days. This is exactly the sort of reason I took the job in London - for the opportunity to build interesting things quickly with other smart people. I got to play around with an event driven, music orientated, API hack and a more useful but less sexy documentation hack. Yes, I said documentation hack. I might be able to release the latter all being well but I need to finish it off first and kick the tyres on some internal projects.


As a development team we’re using Django for everything which is proving to be huge fun. It’s a decent size project and we’re pushing Django (and in particular new-forms admin) in interesting ways. Having worked previously with PHP, ASP.NET and Rails I’m loving lots of bits of Django. I mentioned the template system before but there are lots of other things to appreciate. Some of this just comes from working with people like Simon and Rob who know Django pretty well. Some of it just from being able to write Python every day.

Continuous Integration

When not working on Django, or writing HTML, CSS or JavaScript (now mainly using jQuery), I’ve been busy pushing the benefits of Continuous Integration. As someone who is actually pretty bad at writing unit tests I like the process of working with Cruise Control as a gatekeeper. I also like automation in general so have been busy with Ant scripts and some Twill scripts as well for more functional testing. Django’s test suite is pretty nifty and easy to use. The official documentation is a pretty good starting point but I’m still on the lookout for some more in depth best practices articles.

So, between writing lots of code I’ve had less time to write words other than internal documentation. I’m finding the change pretty refreshing at the moment but want to keep up with writing every now and again. Who knows how that will work out? I have a feeling that I might blog more geeky code related stuff but time will tell how that plays out.

Where are the Rock Star Web Project Managers?

In maybe a more constructive manner than yesterday I started wondering where the rock star web project managers hang out. I think we’re all aware of something of a celebrity culture within web circles. There is a hardcore of people whose blogs, books and conference appearances we’ve all seen several times over. And in the main I think this has had a positive effect on everyone involved. People like Jeremy, Molly and Simon have at different times acted as pretty useful barometers and yard sticks for lots of people. But these people are invariably designers and developers - not product managers or project managers.

The only person I can think of who has talked a little about the topic of project management is Meri with her new book Principles of Project Management. The topic occasionally comes up in conversation, or is mentioned by the designers and developers noted above. And there are lots of blogs (more often by developers it seems) about Agile, XP and Scrum. But what about the practice of web product management? You can point to countless blogs written by designers and developers at the likes of last.fm, flickr and Yahoo. But where are the managers?

So, my question to you is: do you know of any great web product or project managers that blog about the discipline?