Google Search New Features? Timeline Search and Wonder Wheel

Looks like I’m being experimented on. I just got a strange Web button appearing on a search today and decided to click it. It revealed a host of new Google features (at least I’ve never seen em before), including various filters and visualisation tools.

The entertainingly named Wonder Wheel was my first click. It’s a visualisation which shows related search terms. I’d love to have access to that data via an API as well.

Example of Wonder Wheel search on Google

The timeline view and time-based filters look more useful for most search activities. Just getting hold of recent content on a particular search term is the sort of thing blog search engines have tended to do well in the past and that Twitter search has been showing off.

I hope I keep these features during whatever testing is going on and that they all go live soon. It’s a pretty big improvement if you ask me.

GitHub Links to Lines of Code

Just saw this and thought it was cool. You can link to a specific line, or set of lines on GitHub. All you need to do is append something like #L17-24 to specify highlighting lines 17 to 24.
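If you want to generate these links from a script, a minimal sketch might look like the following. The helper name `github_line_link` and the example repository URL are made up for illustration, and the anchor follows the `#L17-24` form quoted above.

```python
def github_line_link(blob_url, first_line, last_line=None):
    """Return blob_url with a line anchor such as #L17 or #L17-24 appended."""
    if first_line < 1:
        raise ValueError("line numbers start at 1")
    if last_line is None or last_line == first_line:
        # A single highlighted line.
        return "%s#L%d" % (blob_url, first_line)
    if last_line < first_line:
        raise ValueError("last_line must not be before first_line")
    # A highlighted range of lines.
    return "%s#L%d-%d" % (blob_url, first_line, last_line)


# Hypothetical repository and file, for illustration only.
print(github_line_link("https://github.com/example/project/blob/master/app.py", 17, 24))
```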

Hacker Posts

I’ve been playing again with App Engine, and going back to an on/off pet project that I’ve built variations of for a while.

Hacker Posts

It’s basically a pretty straightforward aggregation platform, taking content from a number of feeds and creating relationships between the items. It’s mainly an experiment in creating a decent size site on App Engine - it can be surprising how many urls you can get out of a good corpus of data:

  • 400 items, each with a unique url
  • each item has a variable number of tags generated, at the moment that amounts to about 1000 unique urls
  • each month and year of content gets a url so that’s another 4 or so
  • each day for which we have items we also get a url, so that’s another 90 or so

So starting off with 1500 or so pages isn’t bad. The site also grows over time as new items are posted or I add more feeds. Everything else I’ve done with App Engine has been more application focused so seeing how a content site performs is interesting in and of itself.
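For the record, the rough sum behind that figure, using the approximate counts from the list above (the dictionary keys are just labels I’ve picked for this sketch):

```python
# Approximate URL counts taken from the list above.
url_counts = {
    "items": 400,
    "tags": 1000,
    "months and years": 4,
    "days": 90,
}

# Add up every kind of page the site exposes.
total = sum(url_counts.values())
print(total)  # 1494, which is the "1500 or so" pages
```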

The data in question I used for this first experiment was the feeds I could find from the top posters on Hacker News. Hence the name Hacker Posts. That left me with 35 feeds:

I have a couple of other communities or events that I’d like to do the same thing around, as well as a few features I want to add to the underlying software. The nice thing with App Engine is that rolling new instances out is as simple as running a command.

Webapp custom filters

The webapp framework which ships with App Engine uses the Django templating system by default, but without Django apps it doesn’t support the same mechanism for loading template tags and filters. This is how to do it though, using a few webapp.template methods.

Simple WSGI Middleware (for App Engine)

WSGI is the Web Server Gateway Interface. It is a specification for web servers and application servers to communicate with web applications (though it can also be used for more than that). It is a Python standard, described in detail in PEP 333.

For Ruby people, WSGI is Python’s Rack. In fact it was one of the inspirations behind Rack. Rack describes itself as:

Rack provides an minimal interface between webservers supporting Ruby and Ruby frameworks.

Which I think is a clearer explanation, except in WSGI’s case we replace Ruby with Python.

As well as being able to write WSGI middleware for Django or Pylons we can also write WSGI middleware for App Engine applications - which is what I spent some time doing today. For the most part I found the examples and documentation interesting but overkill for what I needed to do. Specifically I wanted a piece of middleware which modified the response, adding extra content into it. Most of the examples I found either didn’t focus on middleware or were full-blown examples, which made them hard to follow.

So for anyone looking for a simple example of WSGI middleware which adds content into the response, here goes. I used the WebOb framework because it provides a nicer interface to the request and response objects and it’s included in the standard App Engine SDK. The following sample middleware simply adds Hello World to the end of every response.

pre. from webob import Request
class SimpleMiddleware(object):
    "Example middleware that appends a message to every response"
    def __init__(self, app):
        self.app = app
    def __call__(self, environ, start_response):
        # deal with webob request and response objects
        # due to a nicer interface
        req = Request(environ)
        resp = req.get_response(self.app)
        # add a string to the end of the body
        body = resp.body + "Hello World"
        # set the body to the new copy
        resp.body = body
        return resp(environ, start_response)

In reality you might want to append something to a specific place in the response, or introduce conditionals. This is easy enough to do by parsing the initial value of resp.body in the example above.
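As a rough sketch of that idea, here is one way the conditional version could look. To keep it runnable without the App Engine SDK, it uses plain WSGI rather than WebOb and is written for Python 3, where bodies are bytes. The class name `InsertBeforeBodyEnd` and the `call_app` helper are made up for illustration. It assumes the wrapped application does not use the legacy `write()` callable, that the body is small enough to buffer in memory, and that the closing tag is written in lowercase as `</body>`.

```python
class InsertBeforeBodyEnd(object):
    """Insert a snippet before </body> in 200 text/html responses only."""

    def __init__(self, app, snippet=b"<p>Hello World</p>"):
        self.app = app
        self.snippet = snippet

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold on to the status and headers until the body is known.
            captured["status"] = status
            captured["headers"] = list(headers)
            # WSGI expects a write() callable here; this sketch ignores it.
            return lambda data: None

        result = self.app(environ, capture)
        try:
            body = b"".join(result)
        finally:
            if hasattr(result, "close"):
                result.close()

        status, headers = captured["status"], captured["headers"]
        lowered = dict((key.lower(), value) for key, value in headers)
        is_html = lowered.get("content-type", "").startswith("text/html")

        if status.startswith("200") and is_html:
            # Split on the last </body>; marker is empty if it is missing.
            head, marker, tail = body.rpartition(b"</body>")
            if marker:
                body = head + self.snippet + marker + tail
                # The body grew, so any old Content-Length is now wrong.
                headers = [(k, v) for k, v in headers
                           if k.lower() != "content-length"]
                headers.append(("Content-Length", str(len(body))))

        start_response(status, headers)
        return [body]


def call_app(app):
    """Call a WSGI app with an empty environ and return what it produced."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"], seen["headers"] = status, dict(headers)

    body = b"".join(app({}, start_response))
    return seen["status"], seen["headers"], body


def html_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html><body><h1>Hi</h1></body></html>"]


def text_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"plain text"]


print(call_app(InsertBeforeBodyEnd(html_app))[2])
print(call_app(InsertBeforeBodyEnd(text_app))[2])
```

The HTML response gets the snippet just before its closing body tag, while the plain text response passes through untouched.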

To use the middleware in your application you simply wrap your current WSGIApplication instance with the middleware class.

pre. application = webapp.WSGIApplication(ROUTES, debug=settings.DEBUG)
# add simple middleware
application = SimpleMiddleware(application)
run_wsgi_app(application)

WSGI middleware is both a useful place for common functionality to live in your App Engine application and a handy tool for anyone working across multiple Python frameworks to share code.

XMPP and Queues in App Engine via Jaiku? Not quite yet

So JaikuEngine, the open source, App Engine based version of Jaiku, is now available for everyone to look at. I found the repo a couple of days ago but it was restricted to project members. The main reason I want to hunt through the code is to have a look at what I’m guessing will be APIs available in a soon to be released version of App Engine - with specific interest in anything to do with XMPP, queues and offline processing.

Well it looks like I’m out of luck for the moment at least. In the settings file I found the following two snippets though:

pre. #
# XMPP / IM
#
# Enabling IM will require a bit more than just making this True, please
# read the docs at http://code.google.com/p/jaikuengine/wiki/im_support
IM_ENABLED = False
# This is the id (JID) of the IM bot that you will use to communicate with
# users of the IM interface
IM_BOT = '[email protected]'
# Turn on test mode for IM
IM_TEST_ONLY = False
# JIDs to allow when testing live XMPP so you don't spam all your users
IM_TEST_JIDS = []

And another for queues:

pre. #
# Task Queue
#
# Enabling the queue will allow you to process posts with larger numbers
# of followers but will require you to set up a cron job that will continuously
# ping a special url to make sure the queue gets processed
QUEUE_ENABLED = True
# The secret to use for your cron job that processes your queue
QUEUE_VENDOR_SECRET = 'SECRET'

The only problem appears to be that the page referenced, code.google.com/p/jaikuengine/wiki/im_support, currently says:

TODO (termie): describe how to get IM working with Jaiku Engine

So termie, or anyone else on the inside, I’d love to know how to get this up and running.

I’ll be having a better look through the code when I get a chance. This was just the first thing I jumped on before heading out the door. I love Open Source.

App Engine Remote API calls

Not sure how I missed this but apparently App Engine (as of 1.1.9) supports remote access to your live datastore. This means you can create administration applications more easily by running them locally, rather than within the limitations of the live platform. You can even run a local Python prompt with access to your live datastore, which is pretty neat.

Services Vs Applications: Does Rails Encourage SOA Better Than Django?

Building larger applications tends to mean splitting your codebase up somehow into manageable chunks. I’m quite interested in what I see as different approaches in the Rails and Django communities:

Django tends to recommend building Reusable Apps and we have sites like Django Pluggables to catalog what’s available. You then grab a few of these applications from the web or write your own, add them and run them all together as part of a single application. Pinax is probably the poster child for this approach. The 0.5.1 release for instance appears to have 41 individual reusable apps, many written by other people and projects.

The Rails community tends to talk more about RESTful service orientated architectures, with things like ActiveResource making this sort of thing easier. So rather than your manageable chunks being within your application they’re separate instances in their own right.

I’d be interested in hearing from more people about their experiences, in particular if you’ve gone against the grain so to speak.

RewiredState

RewiredState was awesome. 100 or so geeks plus a smattering of government types gathered in the shiny new Guardian offices in Kings Cross on Saturday to hack (the Government).

Some events like this are more productive than others, and the end of day demos included some really impressive stuff. See for yourself on the projects page.

My own little project even won a prize (an invisible bottle of Champagne no less). My complaint was that if you want to report an issue with any of the more than 7000 government websites you have to do it per site. All of the sites do their own thing, which might be a nice contact form, maybe an email address or in some cases a postal address hidden on a page that’s not linked to from anywhere.

My solution was pretty simple - a centralised issue reporting and navigating tool. So you go along to the site you have an issue with and hit a bookmarklet (or a badge on the site if the site in question has been nice enough to add one). You fill in a very simple form which appears before your eyes and everything is tracked on a nice shiny website.

The advantage of all that is transparency. You can see which sites people have issues with, and also ideally which issues get addressed or at least acknowledged by the support staff for the site in question. The hack had comments so others could follow up on individual issues. It would be simple enough to have league tables and the like as well, or add tagging for a little bit of categorisation - I did only have a few hours though.

screenshot of FeedbackGov website

After a few nice comments I’m going to have to finish it off and get it up somewhere I think. All I really need to do is clean up the bookmarklet code (which was a hack in more ways than one) and add a bit of sanity checking to data entered into the system.

A massive well done to everyone involved is in order as well, especially James, who I remember talking to about the idea ages ago. Congratulations all - and hopefully we’ll be back next year.

Content to Markup ratio bookmarklet

Stoyan Stefanov just released an excellent little bookmarklet to calculate a content to markup ratio.

It’s interesting browsing around a few sites and comparing ratios: