Resourceful Vs Hackable Search URLs

Given a spare moment I often end up pondering URL design, and something that keeps coming up is hackable search URLs. But first, a very quick primer on the idea of resourceful design.

REST is a series of architectural principles more than a defined architecture. The Resource Oriented Architecture builds on those ideas with a series of concrete guidelines, set down by Leonard Richardson and Sam Ruby, for designing RESTful systems. The simple version is that you try to design your system around resources represented by URLs.

I’d thoroughly recommend reading RESTful Web Services whenever you get a moment, as this subject is covered in detail.

Flickr isn’t a truly resourceful design but it does have many of the hallmarks. For example, the URL that describes me is:

<code>http://flickr.com/people/garethr</code>

When it comes to searching on flickr we have:

<code>http://flickr.com/search/?w=all&q=pubstandards&m=text</code>

The pattern of using a query string argument named q to pass a search string is pretty common. One of the guidelines from the ROA discusses query strings:

Query string parameters are appropriate if they are inputs to a Resource which is an algorithm. Otherwise, these values should be moved into the URI.

Search is definitely algorithmic. Now you could maybe argue that a global search should be done on the root of a site, with specific resource searches on the resource in question, eg. /people/?q=. This would likely work fine but would require some behind the scenes complexity, as well as probably not being as obvious to the end user. Global searches are in many cases much more common than restricted searches, and even in resourceful designs the root of the site (ie. the home page) might not act as a list of available site resources. A notable exception would have to be the excellent BBC Programmes site, which is basically one big semantic catalogue.

But we have another kind of URL that’s cropping up for search results, one that treats the URL much more like a fundamental part of the user interface for search. An example from a site I use all the time is The Accessible UK Train Timetable which allows for URLs like the following:

<code>http://traintimes.org.uk/newcastle/london</code>

You can basically squash all the search parameters from the form into the URL, meaning you can easily bookmark search results. Note, however, that the actual content is likely to change. The above example, for instance, uses the current time to get a list of trains from Newcastle to London; in an hour's time the results will be different.
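To make the comparison explicit, the hackable path is really just another way of writing the query string a search form would produce - something like this (the second URL is a made up equivalent for illustration, not the site's actual search endpoint):

<code>http://traintimes.org.uk/newcastle/london
http://traintimes.org.uk/search/?from=newcastle&to=london</code>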

Another good example would be the new Yahoo! UK TV listings which has URLs like these:

<code>http://uk.tv.yahoo.com/listings/bbc-two/2008-04-17/</code>

<code>http://uk.tv.yahoo.com/listings/itv1/2008-04-17/21-00/</code>

Again this is really a search query, or at least specifying the time and date is. In some ways it’s the return of the command line - allowing searches to be run very quickly from a textual interface.

Now, both these approaches treat URLs with the respect they deserve. But they do have the potential to clash somewhere in the middle if care isn't taken. The Accessible Train Times site is a single purpose site which just does searches, while BBC Programmes does feature a search engine, but it's just the global BBC search, which takes you off site. And if that wasn't enough potential competition, a question Simon raised at The Highland Fling about URL design and the search engine optimisation crowd got me thinking too. From being a somewhat niche area of interest, URLs might just become a sought-after part of a good website design - fought over by the varying disciplines of modern web design and development.

DSLs for HTML and CSS - The Future, or Just Plain Wrong?

After my previous post about Django and the web standards community a number of the comments picked up on the fact I mentioned haml under the title Other Craziness. Ok, so I was being a little over-poetic but I decided this warranted a closer look.

Haml is a markup language that’s used to cleanly and simply describe the XHTML of any web document without the use of inline code. Haml functions as a replacement for inline page templating systems such as PHP, ASP, and ERB, the templating language used in most Ruby on Rails applications.

A quick example should help. The following haml code…

<code>%div.special#primary Hello, World!</code>

…is compiled to the following HTML:

<code><div class="special" id="primary">
  Hello, World!
</div></code>

Depending on your application this could happen at runtime or as part of a build step. And although haml is primarily associated with Rails, it's also available as a command line utility, so you could in theory use it with any framework or language.
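For what it's worth, using the command line tool is as simple as pointing it at a template (the file names here are just examples):

<code>haml hello.haml hello.html</code>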

My initial take on this was to call haml an abstraction of HTML but Nathan Weizenbaum, one of haml’s developers, put me straight:

Haml doesn’t really abstract HTML. Not in the same sense that, say, Rails helpers do. Since Haml has a one-to-one mapping to HTML, I view it more as an alternate syntax for HTML than an abstraction.

Lots more examples for anyone interested can be found on the haml documentation site.
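Sass, haml's sister project, takes the same indentation based approach to CSS. From memory the syntax looks something like this (the exact property syntax has varied between versions), with nesting standing in for repeated selectors:

<code>#primary
  color: #333
  .special
    font-weight: bold</code>

…which compiles to something along the lines of:

<code>#primary { color: #333; }
#primary .special { font-weight: bold; }</code>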

After some research and some playing around with the command line version of the haml engine I decided to see what Twitter thought about the situation. Little did I realise what I was letting myself in for:

Tom Morris kicked things off:

bq.:http://twitter.com/tommorris/statuses/784959480 I’m not sure why everyone insists on clumsily reinventing HTML every few weeks (eg. wiki syntaxes, of which there are hundreds)

Brad Wright thought:

bq.:http://twitter.com/intranation/statuses/784963589 Sass is just stupid, since you’re basically writing exactly the same CSS just in shitty YAML style.

And followed with:

What’s the point of abstracting HTML? It’s not that hard

And Mark Norman Francis chipped in with:

EVIL. And not in a good, kittenish way.

A few people echoed Ross Bruniges' sentiment that haml and sass are just:

bq.:http://twitter.com/rossbruniges/statuses/784954916 front-end code for those who don’t really care all that much about it and would rather create databases

I have to admit to this being my initial reaction on hearing about and looking at haml, hence the remark from the previous post. But that’s not to say everyone was negative.

Mike Stenhouse stepped in and said:

bq.:http://twitter.com/mikesten/statuses/784952890 Love haml - it’s all I use these days. More readable, dynamic and hackable. Took me a while to come around to it though…

Some of the comments were about how the use of haml might alter the dynamic of a team, to either positive or negative effect - depending on your point of view.

Mark Ng saw it as a cunning way of getting rid of the front-end guy.

bq.:http://twitter.com/markng/statuses/784952935 at first, they look elegant. Then it becomes obvious how they remove designers from the process of making markup.

Whereas Olly Hodgson saw it as perhaps a route to getting dyed-in-the-wool back-end developers writing decent markup.

bq.:http://twitter.com/OllyHodgson/statuses/784957324 They look interesting. With proper training it might be a good way to get back-end programmers creating decent HTML (shock horror!)

At present haml is very much pitched at the Rails community from whence it came. Many of the examples demonstrate benefits compared to ERB, and haml is of course written in Ruby and available as a Rails plugin. Being perceived as part of that community has obvious benefits but also some subtle costs, in particular regarding those people that don’t like Rails very much.

I'm not really convinced of the benefits, in all fairness. The something-else-to-learn barrier only gets magnified when working within a team environment. You now have to train new recruits of whatever skill level in another syntax. One that they might be able to write quickly enough, but can they understand it from the briefest of glances at a template? HTML might not be great here but it is familiar to everyone. There is also the problem of the programmer's abstraction: what if I can't get the markup I want out the other side of the black box? Yes, it's open source so I can hack the box open, but that causes even more problems. And while I quite like meaningful whitespace (for instance in Python), in templates which fail if it's not quite right I see a major problem for anyone whose text editor is not their best friend.

I am however interested to see whether the problems people have with haml are with haml in particular or with the overall approach of alternative syntaxes for HTML and CSS. Are DSLs (Domain Specific Languages) needed for CSS and HTML? And if so, is this a possible avenue for innovation on top of slow moving standards?

Git, Ditz and Microformats

I finally got round to making use of Git, the distributed source code management tool much loved by Open Source projects like the Linux Kernel and now Ruby on Rails. It had been on the long list of things to have a look at for quite a while, but for the majority of my personal projects SVN is just fine. The reason that led me to finally run sudo port install git-core was the new Ditz command line issue tracking software released on Gitorious by William Morgan.

Now, a command line issue tracker is maybe kind of niche and a little geeky. But most desktop or web based issue trackers just seem to wind me up and I like small tools that fit in with my command line centric workflow. I managed to clone my own copy of Ditz to have a play with. I ran into a few initial problems but nothing a good tutorial and a helpful mailing list couldn’t fix.

I’m not just playing around either; Ditz comes with the nice ability to generate a set of static HTML files representing your issue database. I’ve committed and pushed a few modifications and enhancements onto my clone - mainly validation, semantic markup improvements and a smattering of microformats (work in progress) in the templates and a couple of markup generation helpers in the Ruby code. I plan on working up the interface a little as well but like any good web designer I’m starting with the underlying markup structure.

A few other enhancements I’d love to see in Ditz or a clone I could pull from include the ability to use SQLite rather than the default YAML for larger projects, automatic generation of RSS feeds alongside the HTML (although I’ve started implementing hAtom) and maybe automated deployment to a remote host over FTP or SSH.

The distributed nature of Git appears to be pushed as the main advantage to those developers taking a look. But I think this is, in many cases, not that important compared to the familiarity that comes from having used SVN or similar for a while. What I think just might be its real strength is that Git is a social source control system. For Open Source projects like Ditz this is a potentially game changing move. The ease with which anyone can contribute and take things in strange directions without affecting the overall effort is fantastic. It removes barriers to entry for contributors and means making small quick changes is much easier and more immediate. With both Gitorious and Github wrapping Git in a shiny web based interface that focuses on these social aspects, it will be interesting to see if this leads to more or faster Open Source collaboration efforts.
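As a rough sketch of that workflow (the repository URLs below are made up for illustration), contributing to a project like Ditz boils down to:

<code>git clone git://example.org/ditz/mainline.git
cd mainline
# hack away, then record the change locally
git commit -a -m "Semantic markup tweaks"
# publish to your own public clone for anyone to pull from
git remote add mine git@example.org:garethr/ditz-clone.git
git push mine master</code>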

A First Class Web Citizen

I'm just back from another great few days in Scotland for The Highland Fling. The biggest difference for me this year was that I was lucky enough to be one of the speakers. Amongst such internet luminaries as Simon Willison, Norm, Chris Mills, Christian Heilman, Aral Balkan and Paul Boag I presented on a few web building blocks (HTTP, URLs) and on some suggestions for API design. The presentation is available on slideshare for anyone who would like a peek.

I think the topic went down well and it triggered some good discussion around the nascent issue of API design. One thing is clear - we need some good resources filled with examples on the subject. More and more people are going to be extending various bits of software with an API and it would be nice to think they will all be a pleasure to use. In the meantime, if you have any features or guidelines you'd like to see in APIs then suggest them over here.

The rest of the conference was great. I particularly enjoyed Norm's little history lesson, and Aral has me at least thinking about installing all the Flash and Flex tool chain. The highlight for me was probably the format. After each presentation we got a little grilling by Paul. He came up with a few thorny questions for each of us as well as fielding questions from the audience. Hopefully my stint on the sofa made sense to a few people; certainly everyone else threw out a few interesting titbits that probably wouldn't have been talked about otherwise. Also, Chris Mills nearly managed to make Paul cry with laughter by striking something of a rock star pose for most of his interview.

I’ve just uploaded a handful of photos from the event and surrounding geek gatherings too. All in all the event was great (again). With the close venue, interviews mixed in and general friendly atmosphere The Highland Fling had an intimate feel often missing from events. Huge credit goes to Alan and hopefully we’ll all be back next year.

Barcamp NorthEast tickets available

Quick post to say BarCampNorthEast tickets are now available. We’re starting off releasing 50 tickets and we’ll see how that goes. We’re really not sure how quickly these will disappear so get them while they’re hot. More tickets will be available later as well.

barcampnortheast.eventwax.com/barcampnortheast/register

The event is looking like it's going to be great. I'm really happy we managed to find a venue that would let us do the whole sleeping over thing. And with Thinking Digital the week before and rumours of a geekdinner on the Friday night it's going to be a busy few days in Newcastle.

We’re still on the lookout for a few sponsors as well so if anyone is interested in sponsorship opportunities, or just has a few questions feel free to drop me a line.

Why the webstandards world appears to be choosing Django

I've been noticing an interesting trend recently, not one I have any empirical evidence for mind, but one I thought interesting nonetheless. Parts of the webstandards world all appear to be playing with Django. Part of this has been the odd mention down the pub, at barcamps or at SXSW this year. But the main source of information on the topic has been twitter. To name but a few, I've seen tweets from Steve, Ross and Aral recently, and Stuart and Cyril literally won't shut up about it.

What's interesting is that this didn't happen with Rails, not in the corner of the pub that generally talks more about markup, javascript and CSS anyway. I've worked on a couple of Rails projects both personally and commercially, and I've just launched a little pet project built with Django called doesyourapi. What follows is, to my mind, a few reasons why I think this trend exists and also why I think it will continue, at least for the time being.

People

You can’t ignore the personal touch, and in Simon Willison and Stuart Langridge we already have two people who bridge the Python/Django community and the web standards crowd, at least in the UK. Personal technology choices at least are often driven by personal correspondence.

Templating

Django's templating introduces a very simple syntax and nothing else. Rails lets you have the full power of Ruby to do with as you will within your views. Rails also makes heavy use of helpers, further adding to the complexity of views. Now, I have mixed views here, based on my own skills more than anything. I know I'd feel much more comfortable throwing someone with good markup skills at a project using Django than Rails. For the most part with Django you use the HTML you're used to; Rails often wants you to change this to helpers - in much the same way as ASP.NET does, in fact. I think some of this comes from the Rails obsession with the don't repeat yourself philosophy. Sometimes this leads to programmatic complexity which makes working with templates more akin to programming, even if it means less duplication. I've yet to work on a particularly complex Django project, so maybe this simplicity might become a limitation to work around? Always a possibility.
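A trivial, made up example shows what I mean. The same loop written first as a Django template and then as ERB might look something like this:

<code>{% for person in people %}
  <li>{{ person.name }}</li>
{% endfor %}</code>

<code><% @people.each do |person| %>
  <li><%= h person.name %></li>
<% end %></code>

Neither is exactly hard, but the Django version is just markup plus a handful of tags, which is part of the appeal for markup-minded folk.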

Default craziness

Some of the bits and pieces that come bundled with Rails are just plain wrong, the Javascript helpers being one example. The abuse of HTTP by default in some of the scaffolding code being another. Oh, and the markup coming out of various helpers as well. In trying to help the application developer Rails gets in the way of the professional webstandards types. Django does next to none of this for you. Programmers coming from Rails might see this as missing features. Frontend types prefer this clean slate approach because it means you don’t have to fight the backend (sometimes including people) for control of the output. Note that you can work around much of this (in the same way as you can work around ASP.NET if you have to), it’s just nicer to not have to.

Other craziness

Rails people love Ruby. After all, it's better than Java (it's also a pretty loveable programming language too). But, like computer science departments everywhere, many Rails people also dislike or simply put up with HTML, CSS and Javascript. If they can find a way of not having to write these and can write something else instead, they will (Rails people are also obsessed with domain specific languages). Hence we have the likes of HAML and SASS. The problem is that us frontend loving folk quite like writing CSS (well, sometimes) and absolutely love writing HTML. Most of the time for good reasons too - just look at microformats for an example. Frontend developers tend to like using a mix of tools; predominantly backend developers not so much, it seems.

Personally I find it interesting. You could quite easily flip many of these arguments around to support why so many people are using Rails. For two frameworks with similar goals and uses it's interesting to see the early philosophical differences playing out in the real world. It might be interesting to see what happens with frameworks like Merb as well, which seems set out to avoid many of these perceived issues with Rails. So, has anyone else noticed anything similar? Or even the complete opposite?

BarCamp NorthEast

BarCampNorthEast is go. A few people have finally got together and sorted out the long promised barcamp in Newcastle upon Tyne.

We're going to be holding the event in the middle of Newcastle, at The Art Works, on the weekend of the 24th/25th of May. That's a whole two months away, ample time for everyone to make arrangements hopefully. From the early discussions we were always set upon going the whole hog and having a two day event. The venue is big enough for people to sleep over as well if they want, which is great. It keeps the cost down for anyone visiting as well as meaning we can play Werewolf all night. We're piggybacking on the Thinking Digital conference as well, so hopefully some of the people from that will stick around for the barcamp.

We have the requisite Barcamp wiki page as well as a listing on Upcoming. Feel free to indicate your interest on either of these. Registration isn’t open just yet, but we should be opening that up next week on Tuesday 1st April at 11:00am. Look for the link on here, on twitter, on the upcoming and barcamp wiki pages and anywhere else I can think of to shout about it.

Feel free to contact me with any questions. If you're not sure what all the fuss is about then have a look at the barcamp site. And if you've never been outside London before then maybe this is your chance. Newcastle is only two and a half hours away by a train with free wifi. We're also on the lookout for a few sponsors; if you're interested drop me a line.

Testing Websites with Twill

I’ve been playing with Twill a little recently. It’s a Python based DSL used for functional testing of websites. From the official website:

Twill is a simple language that allows users to browse the Web from a command-line interface. With twill, you can navigate through Web sites that use forms, cookies, and most standard Web features.

A simple example might make things clearer. You'll need to install twill first - the instructions are available on the site. We can write tests directly into the shell so we'll start there. For our first test we'll write one that will hopefully fail - a test to check whether this website is down.

<code>twill-sh
>> go http://morethanseven.net
>> code 404</code>

First we fire up the twill shell then enter two simple commands. The first command, go, sends the browser to the specified URL. The second command is an assertion, in this case a check on the HTTP status code. If this website is available then it should return the HTTP code 200, if it’s unavailable then it will probably return a 404 Not Found. This test will hopefully fail, indicating that this website is up and running. In reality you’re more likely to test for the 200 status code and fail on anything else but for this example it’s useful to see what a failing test looks like.
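The passing version of the same check is just as short:

<code>>> go http://morethanseven.net
>> code 200</code>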

Although pretty powerful, the twill scripting language is nice and small. I've listed most of the commands below just to give you an idea of the sort of things you can get up to. You can type help at the twill shell to get more information on the available commands and the individual commands themselves.

  • twill-sh
  • twill-sh {filename}
  • twill-sh {filename1} {filename2}
  • twill-sh -u {url} {filename}
  • twill-sh -n {filename}
  • go {url}
  • show
  • save_html {file-name}
  • showlinks
  • showforms
  • follow {url|url-name}
  • back
  • reload
  • showhistory
  • echo {variable}
  • formvalue {form} {input} {value}
  • submit {input}
  • show_cookies
  • save_cookies {file-name}
  • load_cookies {file-name}
  • clear_cookies
  • code {code}
  • url {text}
  • title {text}
  • find {text}
  • setglobal {name} {value}
  • setlocal {name} {value}
  • debug http 1
  • redirect_output {file}
  • redirect_error {file}
  • agent {ie5|ie55|ie6|opera7|konq32|saf11|aol9}
  • add_auth {realm} {url} {user} {password}
  • runfile {file-name}
  • extend_with check_links
  • check_links
  • check_links {regex}
  • twill-fork -n {number} -p {processes} {script}

You can also store tests in individual files as well as run a batch of tests at once. I have a couple of tests that I can run against any URL which might be a useful starting point for anyone else starting to look at testing their sites or applications. You can download these tests here.

If you unpack the zip archive and then open the folder in a terminal or console, you can run all of the tests like so. Note that we're passing the starting URL into the scripts, which makes using the same scripts for multiple sites easier.

<code>twill-sh * -u http://morethanseven.net</code>

The tests included do a few things, from checking for the presence of several required markup elements and an XHTML doctype to checking that all the links on the page are working.

All these examples are pretty simple and non-site-specific. For more complex form based applications you can write application browsers which fill out forms, create user sessions and do everything a user might do. Twill is also particularly useful when it comes to testing RESTful webservices with all the URLs and HTTP status codes floating about.
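As a sketch of what such an application browser might look like (the URL and form field name here are invented), a script that fills in a search form and checks the results could be as simple as:

<code>go http://example.com/
showforms
formvalue 1 q pubstandards
submit
code 200
find pubstandards</code>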

Own your endpoints

If URLs are people too then you better make sure you control your URLs.

Although some people use blog hosting services like Blogger, the majority of serious bloggers, web designers or companies generally use their own domain name to host their site. Controlling your own domain name is increasingly important when that URL is a representation of you on the internet.

With all these social networks we're starting to have pieces of us scattered all over the place. I've joked previously about the utility of domain names over real names for interpersonal communication, but this breaks down a little when not all the URLs that represent me are owned by me. I control morethanseven.net/photos but I only have some influence over flickr.com/photos/garethr. With external hosts you also need to be aware of cybersquatting. I own morethanseven.net so no one else can use it. However, with services that give you your own URL as part of registration, people can cybersquat in hundreds of new places that you might not even know about.

It’s not just web pages you have to worry about. Feeds are another example of URLs you just might want to keep control over. This last one is also something I see lots of people handing off to others - specifically Feedburner. Now I’m a big fan of feedburner and use it for the feeds on this site. But I don’t use the feedburner URL, anyone subscribing to the feeds here does so using morethanseven.net/feed/. If I decide to stop using feedburner I can, without having to upset the few people who subscribe to it by moving my feed address. It’s the same with email addresses; I love gmail but rarely actually give my gmail email address out.
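For anyone wondering how that trick works, the usual approach (this is an illustrative Apache mod_rewrite snippet with a made up FeedBurner address, rather than my exact configuration) is to redirect your own feed URL to FeedBurner for everyone except the FeedBurner crawler itself:

<code>RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} !FeedBurner
RewriteRule ^feed/?$ http://feeds.feedburner.com/example [R=302,L]</code>

Subscribers only ever see morethanseven.net/feed/, so if I stop using the service I just delete the rule.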

So, start looking after your domain names a little more carefully. They are pieces of you scattered around the internet, and losing control of them is going to become increasingly socially painful.

Accepted for Xtech

So, while sat with a few people at the WaSP panel at SXSW (of which more later when I'm fully caught up) I got a nice email from the folks at Xtech accepting my presentation idea. The abstract is below. If any of that sounds interesting or up your street I'd love to hear other people's experiences or ideas on the subject.

Design Strategies for a Distributed Web

From language frameworks to APIs

Everyone is making use of mature and stable web application or javascript frameworks these days. That means we've stopped reinventing things such as routing and object relational mapping, but we're all still building very similar components again and again. Frameworks might allow us to solve lots of fine grained problems, but APIs could let us solve common coarse grained problems quicker.

Building blocks for your applications

There are already a few examples in the wild of APIs designed to be used as part of your application development process. Amazon has been leading the way in providing remote services such as S3, EC2 and SimpleDB. There are also options when it comes to hosting these services yourself; the Mint analytics software and the CouchDB database service are both good examples.

Quality engineering for free

The real value of outsourcing a discrete portion of your application to a third party API lies in quality. You could always use local storage and your programming language of choice to deal with a large volume of file read and write operations. But do you really think you'll beat Amazon for reliability, scalability and speed?

Functionality vs Data

It’s not just high quality functionality that we could leverage from other providers. We’re all fed up with entering and re-entering our personal data into each new service. With advancements like OAuth and Microformats and lots of focus on data portability at the moment we might just be able to share data too.

Change the client as well as the server

Sometimes it's not enough to just change the server. The rise of specialised browsers such as Joost and Songbird allows for functionality that would be impossible otherwise. Site specific browsers, along with advancements such as local storage, may prove just as important here.

Problems

It's not all in place just yet. The reliability of your application is likely to be important, and making use of a distributed set of APIs could leave you in the unenviable position of being less stable than your least stable partner. The issue of lock-in could also rear its head, without a vibrant ecosystem of different providers that is.

The Future

The use of third party commercial APIs has the potential to change the development landscape - bringing high quality middleware to the web. It could be the original web services dream realised. But without critical mass and an active market it could also be a new Achilles heel for successful startups.