Testing WSGI Application with Urltest

I found myself wanting something to make writing high level, functional tests for WSGI applications easier and quicker. If I liked the term I'd call it a domain specific language for testing URLs. Basically I found myself writing a lot of tests like:

    def test_404_handler(self):
        response = self.app.get('/does-not-exist', expect_errors=True)
        self.assertEquals("404 Not Found", response.status)

Testing more than a few URLs like this got boring quickly. What I wanted was a shorthand syntax for defining this sort of simple test and then running them all individually. So was born Urltest. It uses the rather nifty Webtest module and hooks into unittest from the standard library. Your test script then looks a little like:

    #!/usr/bin/env python
    from example_app import application
    from urltest import verify_urls

    if __name__ == "__main__":
        urls = (
            {'url': "/", 'code': 200},
            {'url': "/bob", 'code': 200},
            {'url': "/jim", 'code': 404},
            {'url': "/jim", 'method': "POST", 'code': 405},
        )
        verify_urls(urls, application)
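The trick that makes each URL run as its own test is generating a unittest test method per URL dictionary. Here is a minimal sketch of that idea, with a toy `get_status` callable standing in for the real WebTest-backed request; this is an illustration of the technique, not Urltest's actual code:

```python
import unittest

def make_test(spec, get_status):
    """Build one test function for a single URL specification."""
    def test(self):
        self.assertEqual(spec['code'], get_status(spec['url']))
    return test

def build_suite(urls, get_status):
    """Create a TestCase subclass with one generated method per URL,
    so each URL check runs (and fails) individually."""
    attrs = {}
    for i, spec in enumerate(urls):
        attrs['test_%d' % i] = make_test(spec, get_status)
    case = type('UrlTests', (unittest.TestCase,), attrs)
    return unittest.TestLoader().loadTestsFromTestCase(case)

# a toy stand-in for making a real request through WebTest
def fake_status(url):
    return 200 if url == '/' else 404

urls = ({'url': '/', 'code': 200}, {'url': '/jim', 'code': 404})
suite = build_suite(urls, fake_status)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The generated suite reports two separate tests rather than one lump, which is the whole point of the shorthand.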

Let me know if you use it, as at the moment this is works-for-me-ware, although it's reasonably well tested and commented.

Another Glue Python Framework - MNML

Although I'm still a big fan of Django, for some problems I'm finding more and more cases where I prefer less code and more freedom. My biggest issue for some types of problems is Django's assumption that you'll be using a relational database, or a database at all. Django wasn't the reason I started using webapp for App Engine stuff, but in doing so I found that webapp often did all that I needed.

So when a small, non-App Engine project cropped up I started looking at the different options available and played with a few of them.

  • I played with Pylons but again got lost in code. I’ll probably play with Pylons more in the future and for bigger, team based, projects it looks like a good mix of component parts and shared conventions.
  • Web.py - I’d used web.py before I started with Django (I even wrote a very basic PHP clone) and although I still like some things about it, it felt like more code than was required for what I wanted.
  • Juno is similar in design to Sinatra but again it wasn’t really what I was after this time. I prefer separating my routing from my code and I’m not sure I like that it comes with its own templating engine.
  • Newf was more like it. Basically a hugely stripped down WSGI framework which provides the very basic building blocks. Something to build on perhaps?
  • MNML (by an ex-colleague Brad Wright) is built atop Newf, adding a few more features and cleaning up some of the interfaces. My only problems here were that I prefer regex based routes and wanted individual methods for each HTTP verb. The former was a specific design decision Brad had made in order to be able to reverse routes; the latter was on the todo list.

So, I set about forking MNML to create my own branch. I added extra comments as I was making my way through the code, wrote a few tests to check things worked and allowed for pluggable routing mechanisms. MNML applications look a bit like the following:

    from mnml import RegexBasedApplication, RequestHandler, \
        HttpResponse, development_server

    class HelloWorld(RequestHandler):
        def GET(self):
            return HttpResponse("Hello World")

    routes = (
        (r'^/$', HelloWorld),
    )
    application = RegexBasedApplication(routes)

    if __name__ == '__main__':
        development_server(application)
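To give a feel for what regex based routing with per-verb handler methods involves, here is a stripped-down sketch of the dispatch idea. The names and behaviour here are illustrative assumptions, not MNML's actual code:

```python
import re

class Hello:
    def GET(self):
        return 'Hello World'

def dispatch(routes, path, method):
    """Match the path against each regex route in turn and call the
    handler method named after the HTTP verb."""
    for pattern, handler_class in routes:
        match = re.match(pattern, path)
        if match:
            handler = handler_class()
            verb = getattr(handler, method, None)
            if verb is None:
                return '405 Method Not Allowed'
            # captured groups become positional arguments to the verb method
            return verb(*match.groups())
    return '404 Not Found'

routes = ((r'^/$', Hello),)
```

Missing routes naturally fall through to a 404, and a missing verb method on a matched handler gives a 405, which covers the two failure cases from the Urltest example earlier.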

If you want to use the token based routing you would substitute in something like the following:

    routes = (
        ('/', Foo),
        ('/myview/:stuff/', Bar)
    )
    application = TokenBasedApplication(routes)

The best bit is that it’s only about 350 lines of code, a great deal of which is accounted for by comments. It’s also really quite fast - especially using something like spawning to run the WSGI application. The other thing I like is the ease with which you can add WSGI middleware into the mix.
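On that last point, wrapping a WSGI application in middleware is just function composition, so it works the same for MNML as for anything else. A minimal sketch, with made-up names and a trivial stand-in application:

```python
class HeaderMiddleware(object):
    """WSGI middleware that adds a header to every response."""
    def __init__(self, app, header, value):
        self.app = app
        self.header = header
        self.value = value

    def __call__(self, environ, start_response):
        def custom_start_response(status, headers, exc_info=None):
            # append our header before handing off to the real start_response
            return start_response(status,
                                  list(headers) + [(self.header, self.value)],
                                  exc_info)
        return self.app(environ, custom_start_response)

# a trivial WSGI application to wrap, standing in for an MNML one
def demo_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return ['Hello World']

application = HeaderMiddleware(demo_app, 'X-Framework', 'MNML')
```

Because both sides speak plain WSGI, middleware like this stacks: you can wrap the wrapped application again without either layer knowing about the other.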

So, if you have a small scale problem where simple and fast beats everything else then have a look and let me know what you think. It will take less time to read the code and tests than it will to read the introductory chapter of whichever larger framework you choose to look at.

Beyond Basic Web Development

I gave a talk at the recent barcamp North East on web development tools. Specifically I wanted to talk about the fact that an awful lot of people just use the basic stack of tools they are familiar with. So Microsoft people will just use C#, MSSQL and IIS, and lots of people just use PHP, MySQL and Apache. I’m not saying there is anything wrong with those tools, but if they are all you have in your tool box you’re limiting how well designed your software can be.

I’d knocked the presentation together in my hotel room pretty quickly before heading down to barcamp, and the lack of an internet connection meant I didn’t have links and didn’t cover a few tools I should have. It did seem to have the desired result in any case, as several people spoke to me afterwards about wanting to use one of the tools I mentioned for something specific.

We also had a good discussion afterwards and people mentioned a few other tools.

Now it’s possible to spend too much time playing with small tools that are likely to be peripheral to the bulk of your application. But the number of stories I’ve heard of people writing their own message queuing systems in PHP, using PHPMyAdmin as an application admin interface, or ignoring the fact that their fancy new application only supports a few people at once suggests plenty of us could do with a broader toolbox.

Universal Internet Explorer 6 CSS and different types of sites

Andy Clarke, as only he can, has started something of a slagging match with his proposals for a single, central IE6 stylesheet. My first impression was that this is basically a much better version of the browser defaults.

Between backslapping and shouts of heresy there are a few good comments floating around on the post so far (I’d expect more). But most of them seem to assume only two types of website exist:

  1. Websites for clients. You know, business to business or business to consumer sorts of things. Online stores, radio station websites, newspapers, company brochures, etc.
  2. Personal blogs

Obviously that covers only a fraction of web pages, but it covers a much larger proportion of web sites. What do I mean by that?

  1. Admin interfaces. Available all over the place and often used by a very small number of people.
  2. Intranets. If your company default browser isn’t IE6 then don’t spend more time than you have to supporting it.
  3. Internal application interfaces. Everything from holiday to payroll to IT helpdesk requests is often built as web apps. Again, if the audience using them isn’t using IE6 then don’t waste your time.

Most people (ok so this might be my opinion) seem to work within small to medium sized agency style places. Smallish companies need to rely more on employees as part of their marketing effort, younger people tend to be more militant, and people working for smaller organisations (like Andy) tend to benefit more from a little celebrity. But that ignores an army of people who work in-house or in other types of company that just happen to build top notch web sites or applications. I’ve now worked in everything from small agencies, via medium sized agencies, to freelance and in-house. They are all different and all place different types of time pressure on the people involved.

Another argument to be had here is summed up by Sion (http://twitter.com/sionnnn/status/1872141409)

if u write code with all browsers in mind from the offset it doesn’t take any longer though? hows that commercially unsound?

And this is pretty much how I used to work. But this means ignoring whole swathes of the unsupported parts of CSS2 and in particular CSS3. CSS3 doesn’t really allow you to make designs that you couldn’t make before (ok, that’s a little unfair, maybe) but it does allow you to do what you did before much more efficiently. Multi-column layout stuff, multiple backgrounds, rounded corners, opacity, RGBa. Smaller stylesheets are easier and quicker to write, test and maintain. And sometimes the time saved is worth more than the additional overhead of writing and maintaining CSS for IE6. Maybe not for a decent sized consumer project with a reasonable team and a few hundred thousand budget. Maybe not for a local council or government site. But for a surprisingly large range of other types of projects this might be a worthwhile approach.

Something else I think that comes through in many of the comments is that it’s often seen as the web designer’s job to fight for things. So we fight to ensure time is spent on making something accessible. We fight to make sure we use valid code wherever possible. We fight for sensible fallbacks when javascript isn’t available. But we also get accused of not being able to see both sides and at times being unrealistic. Worth considering with regard to people’s initial reactions to this, mine included.

The only thing I would say to anyone looking to use this is don’t create your own version. Use the one from Google Code. As Andy suggests, do suggest improvements, ideally by making the changes yourself, presumably along with a solid test case. The time saving benefits basically disappear if we all have to maintain our own versions, because that means we all have to test our own versions.

I’m not saying ignore IE6, and neither is Andy (I don’t think). I’m saying pick your battles. I’ve built websites to support everything under the sun. Sometimes it’s been absolutely the right thing to do, sometimes in hindsight it’s probably been a little bit of wasted time. I’m sure you can think of similar times from the sites you’ve been involved in.

Back in Toon - Thinking Digital

After what seems like longer than a year I’ve finally managed to make it back up to Newcastle. It’s the Thinking Digital conference again this year and so far it’s been a hoot. A mix of practical, inspirational and just odd speakers (and acts) suits me pretty well. Lots of twitter activity too.

The highlight for me so far has been Dan Lyons talking about the future of media and print businesses. Basically a somewhat rambling attempt to describe where newspapers and print publications find themselves in this day and age (in trouble) and where they might head for salvation (either dumbing down or becoming more exclusive/expensive). Throw in interesting stories and anecdotes, the odd joke and some personal thoughts on the future for journalists and everyone’s happy.

An honorable mention goes to Johnny Chung Lee as well for showing off drag and drop between a laptop and a table. Oh, and everything else has been pretty interesting as well and it’s not even the end of day one yet.

I’m around until next week too, which means I should be able to get along to barcamp - assuming my mind hasn’t melted with all the stuff to take in.

Keeping Up With The Zeldmans - (Self) Education for Web Professionals

So, it was the Bamboo Juice conference last Friday at the rather impressive Eden project in Cornwall. Along with Jeremy, Dom, Paul and Relly I presented to the crowd of mainly local first time web conference goers.

It was a great event, and felt a lot like the first Highland Fling in that it was the first big event in an area that’s actually quite a distance from the bright lights of London (it took 8 hours to get back to Cambridge). The party afterwards, held in one of the biomes at the Eden project itself, was also pretty impressive.

I’ll hopefully write up some of my notes from the presentation soon, but for the moment I’ve uploaded the slides to Slideshare. The whole thing was videoed as well, so look out for those in the future.

Simple issue tracking

Another project I hacked together on the train, running on App Engine I’m afraid. Anyone getting bored of my new-project-each-week posts, please stop reading now.

GitBug interface

Issue or bug tracking is just one of those things we all deal with and probably have opinions on. Lots of open source software exists (BugZilla, Trac) to do the job and various companies have commercial products (Lighthouse, Sifter, Fixx). So why did I go and create another one?

Certainly not because I think I can do a better job for the majority. This is very much a personal pet project and I don’t very much fancy competing with teams of smart people building good products.

Like most software projects my issue tracker was designed to scratch a personal itch. I now have 28 repos over on GitHub (how did that happen?). Some of these are public projects I’d like other people’s input on, and for me that means having a public list of issues. But these aren’t active projects being worked on by a team of people - it’s mainly just me (with Brad and Simon occasionally correcting my spelling and grammar) hacking on the train to work and back. I don’t need collaboration features. I don’t want to be limited to a small number of projects. I’d rather not pay lots of money, or worry too much about hosting.

So that’s where GitBug comes in. Bug tracking for people with minimal needs.

The only real feature beyond a very minimal bug tracker is the ability to close bugs via a GitHub webhook. Oh, and lots of JSON and RSS feeds for everything, but that’s just the way we should build things now anyway.
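The webhook side is simple enough to sketch: GitHub POSTs a JSON payload describing each push, and you scan the commit messages for a closing keyword. The "Fixes #n" convention and the field names below are illustrative assumptions, not GitBug's actual implementation:

```python
import json
import re

CLOSE_RE = re.compile(r'(?:fixes|closes)\s+#(\d+)', re.IGNORECASE)

def bugs_closed_by(payload):
    """Return the bug ids mentioned with a closing keyword in any
    commit message of a push payload."""
    data = json.loads(payload)
    ids = set()
    for commit in data.get('commits', []):
        for match in CLOSE_RE.finditer(commit.get('message', '')):
            ids.add(int(match.group(1)))
    return sorted(ids)

# an example payload of the rough shape GitHub sends on push
payload = json.dumps({'commits': [
    {'message': 'Fixes #3, tidy up templates'},
    {'message': 'closes #12 and refactor'},
]})
```

A handler wired to the webhook URL would then mark each returned id as closed.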

There are areas of the interface that could do with some streamlining and a couple of open bugs that don’t yet cause me enough pain to fix. I have however opened it up for anyone with a Google account to use, and a few people have started adding their own projects and issues already. I’m hoping that for the most part, by keeping it as simple as possible, I can avoid having to do any work on it and work on bugs in other projects instead, of which I’m sure there are many.

Support for Rev=Canonical

There has been lots of talk recently about URL shortening. Services like TinyURL have been around for a good while, offering shortened versions of URLs like tinyurl.com/dd7w2m which are easier to put in a tweet or an email. The problem with this is that not only does the shorter version mask any information about the destination, but if TinyURL or one of the other shortening services goes away, or loses control of its domain name, a large number of links are going to stop working the way they should.

Kellan in particular has been proposing some simple steps that might get us out of this hole. You can read more about the ideas behind using Rev=Canonical and try out the future (maybe) of these services at revcanonical.appspot.com.

The nicest thing from my point of view about this idea is how simple it is to implement. This blog is running a custom Django based blogging engine called Train.

The posts on this site exist at URLs like the following: morethanseven.net/2009/04/04/mixing-it-programming-language-choice/. With only a small view function, a change to a template and the addition of a URL, this blog should now work with Kellan’s new URL shortener.

I decided to use the ids for the articles on the blog as the key for the short versions. So if you were to visit morethanseven.net/284 you would get the article above. In the end I decided to issue a redirect from the short version to the long version, rather than serve duplicate content with the canonical link; not sure which way is best though.

The markup for each article contains the required link in the head of the document, along these lines (the original snippet was eaten in conversion; a rev=canonical link for the article above looks like this):

    <link rev="canonical" href="http://morethanseven.net/284" />
And the django view looks something like this:

    from django.http import HttpResponsePermanentRedirect
    from django.shortcuts import get_object_or_404

    from models import Article  # wherever the blog's Article model lives

    def tiny(request, id):
        "Provide tiny urls based on ids for articles"
        # get the article or throw a 404
        article = get_object_or_404(Article, id=id, status='live')
        url = article.get_absolute_url()
        # redirect to the relevant full url
        return HttpResponsePermanentRedirect(url)

All in all, incredibly simple to implement, especially in something like Rails or Django which make this sort of wiring up of URLs to views easy. So what’s stopping you adding this to your site or current project? If enough people just do it we can make the web a slightly better place in reasonably short order.

Mixing it Up - Programming Language Choice

So the Register article about Twitter seems to have kicked over yet another Ruby/Rails doesn’t scale debate - mainly it seems from people who haven’t read any of the back story or the real meat of the story. For anyone catching up I’d suggest reading this recent interview with three of the Twitter developers. Ikai Lan made some particularly good points about people who don’t RTFM, and the comments are well worth reading too. Tony Arcieri, of Reia fame, took another approach and wondered why none of the open source message queues ever got a look in.

What it really all comes down to is that Twitter are using more than one language to write their systems in. What I don’t understand is why this is a shock to anyone, or why it’s a bad thing. Google appear to use Java and Python for most things. Yahoo! use Java and PHP (and C and Perl I think?). Microsoft use VB and C/C++/C# and probably F#. At work we use .NET and Python for different things.

Big companies have been using multiple languages and platforms, for good and bad reasons, forever. Sometimes it’s about legacy systems, but often it’s about using the right tool for the job. I think lots of people jumping in on the debate do so from a point of view that everyone uses one language for everything, mainly because most personal projects or small agency style jobs do just that. Why overcomplicate smaller projects with the need for people to know more than one language? In a small general purpose team it’s also going to make recruiting and getting people up to speed much harder.

But for startups doing interesting things it’s potentially both more efficient and more interesting to use multiple platforms. I think Dopplr might be mainly Ruby with a smattering of Erlang, all built around an ActiveMQ message queue, and Matt has talked about that architecture at various conferences without being called out on it.

So for your next personal project why not pick a handy message queue (personally I like RabbitMQ and StompServer), and at least two complementary languages, and see how it changes the architecture of what you build? Mix PHP with an Erlang backend or go for Twitter’s Ruby/Scala mix. It might very well be overkill for that blog or todo list application you had in mind, but it just might teach you more about picking the right tools for the job when you come across non-trivial problems.

Google Search New Features? Timeline Search and Wonder Wheel

Looks like I’m being experimented on. I just got a strange Web button appearing on a search today and decided to click it. It revealed a host of new Google features (at least I’ve never seen them before), including various filters and visualisation tools.

The entertainingly named Wonder Wheel was my first click. It’s a visualisation which shows related search terms. I’d love to have access to that data via an API as well.

Example of Wonder Wheel search on Google

The timeline view, and time based filters, look more useful for most search activities. Just getting hold of recent content on a particular search term is the sort of thing blog search engines have tended to do well in the past and that Twitter search has been showing off.

I hope I keep these features during whatever testing is going on and that they all go live soon. It’s a pretty big improvement if you ask me.