Append slashes to URLs in Django

Quick Django pop quiz. Can anyone spot the deliberate mistake in the following url definition? We’re trying to define a view called log_viewer and instructing a specific url pattern to render it.

pre. urlpatterns = patterns('',
    (r'^log/?$', log_viewer),
)

In this case our regex matches /log or /log/ using the /? optional pattern. This is because even if we only link to one format we know people will probably visit both, either by entering the URL manually or by linking from an external source.

As far as HTTP is concerned though /log and /log/ are separate URLs, even if they display the same content. The main reason this matters for public facing websites is that our friendly search engine spiders are likely to index both separately, leading to splitting the page rank as well as accusations of duplicate content which might see further erosion of rankings.

The solution is generally to issue a 301 redirect from one format to the other. This tells search engines and people alike that the canonical location for the requested content is elsewhere. You could specify the redirect manually, but this is going to get irritating quickly once you have a few more definitions.

pre. urlpatterns = patterns('',
    (r'^log$', redirect_to, {'url': '/log/'}),
    (r'^log/$', log_viewer),
)

Handily Django provides a mechanism to do exactly what we want to do by setting APPEND_SLASH to True in your settings file. Even better it’s switched on by default. So if you don’t know much about the intricacies of HTTP you still get the correct behavior. That is unless you specify your URL patterns in the format above.

You see, APPEND_SLASH only kicks in if the requested URL doesn’t match any specified pattern. If no match is found and the URL has no trailing slash, Django appends one and checks for a match again, redirecting to the new URL if it finds one. Because the pattern above also matches the URL without the trailing slash (/log), the desired behavior is never triggered, and the view is rendered at both URLs. So although we want to catch /log and /log/ on the front end, our urls.py definition should actually be:

pre. urlpatterns = patterns('',
    (r'^log/$', log_viewer),
)
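To make the difference concrete, here is a small standalone sketch of the lookup logic described above. It is a simplified model written for this post, not Django's actual code: the resolve function, its return values and the pattern lists are all made up for illustration.

```python
import re


def resolve(path, patterns, append_slash=True):
    """Toy model of URL resolution with an APPEND_SLASH-style fallback.

    `path` has no leading slash, matching the patterns used in this post.
    Returns a (status, target) tuple. This is an illustration only.
    """
    # First pass: does any pattern match the path exactly as requested?
    for regex, view_name in patterns:
        if re.search(regex, path):
            return 200, view_name

    # Fallback: nothing matched, so try again with a trailing slash and
    # redirect if that version of the URL would match.
    if append_slash and not path.endswith('/'):
        slashed = path + '/'
        for regex, _view_name in patterns:
            if re.search(regex, slashed):
                return 301, '/' + slashed

    return 404, None


# The pattern with the optional slash swallows both URLs ...
loose = [(r'^log/?$', 'log_viewer')]
# ... while the strict pattern leaves /log to the redirect fallback.
strict = [(r'^log/$', 'log_viewer')]

print(resolve('log', loose))
print(resolve('log', strict))
```

With the loose pattern the first pass always succeeds, so the fallback that would issue the redirect is never reached.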

Django has lots of useful bits of magic for doing the right thing, but unless you know what they actually do you either end up recreating functionality yourself, or find features don’t work in quite the way you thought. It’s a good argument for keeping frameworks small whenever possible, and for developers to at least know their way around the code of their respective framework.

New Version of Radiant CMS Out Today

I see from a newly tagged release on GitHub that a new version of Radiant (0.7) is out. Radiant is a really nice CMS for smaller projects; it’s used on the official Ruby site and I used it here at one point.

Go check out the blog post for the full list of the changes.

PDB and AppEngine

It turns out App Engine breaks the default behaviour of the Python debugger PDB by sending STDOUT to the browser. But with a little bit of Python you can put it back in.

pre. import sys
import pdb
# Point the standard streams back at the originals Python keeps in sys.__stdin__ etc.
for attr in ('stdin', 'stdout', 'stderr'):
    setattr(sys, attr, getattr(sys, '__%s__' % attr))
pdb.set_trace()

XMPP and offline processing coming to Google App Engine

Three weeks ago I pondered whether XMPP and offline processing were coming to Google App Engine. It was a hunch based on the upcoming release of Jaiku on App Engine. I reasoned you couldn’t really do it without XMPP and offline processing APIs. Looks like I was right.

Today Joe Gregorio announced on the App Engine Blog an update to the roadmap for the next six months, including:

  • Support for running scheduled tasks
  • Task queues for performing background processing
  • Ability to receive and process incoming email
  • Support for sending and receiving XMPP (Jabber) messages

Colour me excited. This could be the point where we start seeing more and more interesting IM interfaces. And this ticks off several of my must-haves for App Engine.

Hosting Images on App Engine

I like adding images to the occasional blog post but don’t do it as often as I want to. The reason being I can hardly ever be bothered to resize images. It means opening up a memory hogging application, fiddling around for a few minutes and then saving it out somewhere, then uploading said image to my server. All of those things bore me.

App Engine to the rescue. As an excuse to play around with the image API as well as add more pictures to this here blog I decided to build myself a small image hosting application. I really only care about the two things noted above: resizing and uploading. If I can do that in one go I’m happy.

A couple of train rides (all the best code gets written on train journeys, fact) later and I’ve deployed the first version of Image Host which looks a little something like this:

Image Hosting interface on Google App Engine

(And yes that image is an image of image-host hosted on image-host. If you’re thinking what image? then it means this experiment isn’t working at the moment.)

The application is designed so that as many people as you want can upload and manage their own set of images, all within one instance. The only reason I don’t just open it up to any Google user account is that it would eat into my quotas if it got popular, and I have no interest in policing whatever dubious images anyone might upload.

I’ve been playing with App Engine quite extensively of late. I really appreciate the SDK and the limited, but well designed and thought out APIs and testing stubs. At the moment I’m happy with where it’s at. It’s obviously early days but in my mind all I really want are APIs for offline processing, some sort of message queuing facility, XMPP support and a payment model for raising the limitations. And several of those are already slated for the near future.

If anyone wants to host their own version you can grab the code from GitHub. You’ll need your own App Engine account and your own snappy name but that’s all. If you really don’t want to do that, have a good reason for needing image hosting and ask nicely, I might add more people to my version as well.

There are a few bugs I’m going to fix when I get a moment, a few documented limitations I know about and I’m still fleshing out the test suite to cover all the functionality. But all in all a worthwhile few hours spent hacking. I’ll probably get round to writing up some pointers for testing App Engine code as well, as there are a host of gotchas and the only real documentation at present is comments in the code.

PEP 374

The Python core developers are currently discussing whether to move away from SVN to a distributed version control system. It’s a worthwhile read for anyone involved in this sort of decision in any capacity. It features hands on examples of each of the contenders (Bazaar, Git and Mercurial), some interesting observations about all of them as well as some benchmarks against a mature codebase. Lots of conversations keep cropping up on Twitter about why bother switching to Git or other distributed systems - this is a good place to start whether you use Python or not.

Sinatra Simple Example

I’ve been playing around with the Sinatra Ruby web development framework recently and building a larger than usual Hello World example. It describes itself as a DSL for quickly creating web-applications in Ruby with minimal effort (what is it about Ruby people and their obsession with calling everything a DSL?). In reality it’s a great little web framework. It deals with a minimal set of the things you really need to do as part of any application: URL handling and routing, HTTP request and response handling, etc. It reminds me of web.py in its minimalist approach, which is definitely a good thing.

The following example is the hello world given on the site:

pre. require 'rubygems'
require 'sinatra'
get '/' do
  'Hello world!'
end

Which isn’t a million miles away from a web.py example:

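pre. import web
# Map URL patterns to the names of handler classes.
urls = (
    '/', 'hello'
)
app = web.application(urls, globals())
class hello:
    # The '/' pattern captures nothing, so GET takes no extra arguments.
    def GET(self):
        return 'Hello World'
if __name__ == "__main__":
    app.run()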

The only real difference is the separate mapping of URLs to views in web.py, which is closer to how Rails or Django do things.
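To show that difference without either framework installed, here is a rough sketch of the two routing styles in plain Python. Everything in it (the get decorator, dispatch_inline, dispatch_table) is invented for illustration and is not the real Sinatra or web.py machinery.

```python
import re

# Style 1: the route is declared right next to the handler
# (roughly how the Sinatra example reads).
inline_routes = []


def get(pattern):
    """Register the decorated function as the handler for `pattern`."""
    def register(handler):
        inline_routes.append((re.compile('^%s$' % pattern), handler))
        return handler
    return register


@get('/')
def index():
    return 'Hello world!'


def dispatch_inline(path):
    """Call the first inline handler whose pattern matches `path`."""
    for regex, handler in inline_routes:
        if regex.match(path):
            return handler()
    return None


# Style 2: a separate table maps URL patterns to handler class names
# (roughly how the web.py example reads).
class hello(object):
    def GET(self):
        return 'Hello World'


urls = ('/', 'hello')


def dispatch_table(path, urls, namespace):
    """Look `path` up in a flat (pattern, name, pattern, name, ...) tuple."""
    for pattern, name in zip(urls[::2], urls[1::2]):
        if re.match('^%s$' % pattern, path):
            # Resolve the class by name, instantiate it and call GET.
            return namespace[name]().GET()
    return None


print(dispatch_inline('/'))
print(dispatch_table('/', urls, {'hello': hello}))
```

In the first style the URL lives with the code that serves it; in the second you can read every URL of the application in one place.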

There is quite a bit of documentation already for Sinatra, including the start of a book. The code (as with all good code these days) is on GitHub for your forking pleasure.

As for what I’ve been up to, I have a more advanced Hello World example up on GitHub. I’m wanting to get a running application that demonstrates all the basic features (except HAML and SASS support). So far I’ve got a simple bit of Rack middleware, several views demonstrating different URL handling techniques, basic erb templates, before methods, configuration settings, error handling, decent unit test coverage, and a Rakefile with tasks for documentation, code coverage, etc. I’ve also got configuration files for running the application with Thin and using God. I’m going to add some simple database connectivity in at some point (either DataMapper or Sequel, I haven’t decided yet) and play around with writing spec tests and a Capistrano recipe file. All in all it’s a nice way of learning something new at the same time as producing something that might be useful. Once I have the rest of the bits and pieces I might even write a full tutorial.

I think the sweet spot for these sorts of mini frameworks is small services or little applications that just sit there and run. Integrity and IRCLogger both use Sinatra, for instance, and I think it’s used for the GitHub WebHooks as well. It’s exactly the sort of thing Google App Engine is useful for in fact, and Sinatra would likely be a closer fit than Rails if Google ever feel like adding Ruby support. Although it does depress me a little that the top four items in the public issue tracker are “I want my own language”.

Jsonpickle

Jsonpickle is a Python library for serializing any arbitrary object graph into JSON. The advantage over something like simplejson is the arbitrary part: simplejson throws errors when you try to serialize some types of objects. I also prefer the jsonpickle API (encode, decode) over simplejson (dump, dumps, load, loads).
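A quick sketch of the limitation, using the standard library json module (which grew out of simplejson) so it runs without extra installs. The Point class and both helper functions are made up for the example, and the jsonpickle part assumes the library is installed, so it is wrapped in a function rather than run here.

```python
import json


class Point(object):
    """A plain object of the kind a JSON encoder does not know about."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def try_plain_json(obj):
    """Return the JSON string, or a description of the error raised."""
    try:
        return json.dumps(obj)
    except TypeError as exc:
        return 'TypeError: %s' % exc


def roundtrip_with_jsonpickle(obj):
    """Encode and decode with jsonpickle (third-party, imported lazily)."""
    import jsonpickle
    return jsonpickle.decode(jsonpickle.encode(obj))


# Built-in types are fine, an arbitrary object is not.
print(try_plain_json({'x': 1}))
print(try_plain_json(Point(1, 2)))
```

The second call reports a TypeError instead of returning JSON, which is exactly the gap jsonpickle’s encode and decode are meant to fill.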

Git Issue Tracking

TicGit looks great. I love command line apps and have been looking for something like this for a while. It’s described as a:

Git based distributed ticketing system, including a command line client and web viewer

pre. #>ti list
   # TicId  Title                     State Date  Assgn   Tags
*  1 9ebd07 add attachment to ticket  open  03/22 schacon attach,feature
   2 6ca8be download attached file    open  03/22 schacon attach,feature
   3 bec8e9 add a milestone           resol 03/22 schacon feature,milestone,ne
   4 9b83ea general tag management    open  03/22 schacon feature,tags
   5 94f24e show expanded comments    open  03/22 schacon feature,ticket
   6 f3dd9b remove a ticket           open  03/22 schacon feature,ticket
   7 e1629e improved cli support      open  03/22 schacon cli,feature

Perfect for my pet projects or working with like-minded folk.

In Defence of Apache Ant

I’m a big fan of the Ant build tool. There, I said it. Nearly everyone else I end up talking to about build scripts (more people than you’d think, but OK, it’s hardly the most exciting topic of conversation) either hates it or treats it with disdain.

I’ve been using it for a few years on and off, in several jobs and for personal projects as well. I’ve used it while writing Python, .NET and PHP. It might be somewhat unfashionable (it’s written in Java and you write your commands in XML) but, for me at least, it’s incredibly handy to have around.

Ant is a build tool. It lets you define tasks in a config file (called build.xml) and then execute them by running the ant command line application. It supports dependencies between tasks as well as defining properties that can be used by multiple tasks. It supports a lot of Java specific stuff as well but also has the ability to simply execute commands on the host OS.

As a really simple example, a few tasks I use on more than one project involve simple backups and deployment.

First I set up a few properties including details of where my site files live and the SSH access details for the remote site.

pre.

The first example task simply runs a backup of everything in the target directory using scp.

pre.

If I make local changes and want to push them to the live site I have another simple task which shells out to rsync.

pre.
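For illustration only, a build.xml along these lines might look like the following sketch. The property names, host and paths are hypothetical placeholders, and having deploy depend on backup is just one possible arrangement.

```xml
<project name="site" default="backup" basedir=".">
  <!-- Hypothetical values: replace with your own paths and SSH details. -->
  <property name="local.dir" value="site"/>
  <property name="backup.dir" value="backup"/>
  <property name="remote.host" value="user@example.com"/>
  <property name="remote.dir" value="/var/www/site"/>

  <!-- Copy everything in the remote directory down to a local backup. -->
  <target name="backup" description="Back up the live site with scp">
    <mkdir dir="${backup.dir}"/>
    <exec executable="scp" failonerror="true">
      <arg value="-r"/>
      <arg value="${remote.host}:${remote.dir}"/>
      <arg value="${backup.dir}"/>
    </exec>
  </target>

  <!-- Push local changes to the live site, taking a backup first. -->
  <target name="deploy" depends="backup" description="Deploy with rsync">
    <exec executable="rsync" failonerror="true">
      <arg value="-avz"/>
      <arg value="${local.dir}/"/>
      <arg value="${remote.host}:${remote.dir}/"/>
    </exec>
  </target>
</project>
```

With a file like this in place, running ant backup or ant deploy from the project directory executes the matching target.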

I know some people hate this style of writing separate arguments as individual elements. Yes it’s excessively XML but it makes everything incredibly clear to anyone who might sneak a look. And build scripts change little compared to project code so the verbosity never bothers me overly. If you really want you can put everything on one line, but I find that harder to follow and maintain.

pre.

For bigger projects I tend to create more complex backup and deployment tasks, or more often than not add in various dependencies. But you hopefully get the idea. Even for simple commands like this that would be a single line bash script I tend to use ant. I find by putting things together into a build script I’m more likely to add useful functionality to it later, and to remember and therefore run the commands more often.

A good reference for finding out more than is in the manual is the Apache Ant Wiki. More than anything it features real examples that you can learn from, which with Ant is definitely the best way to discover new tricks.

I know there are a number of other tools in languages I like more. On occasion I use Rake, Fabric and Capistrano. I’ve looked at Vellum and good old make. I know others who swear by just writing simple bash scripts or using straight Ruby, PHP or Python (or not writing build scripts at all and doing everything by hand). But I like having my build scripts separate and simple. It might not be pretty or fashionable, but Ant does almost perfectly what I want it to do.