If computer scientists managed birth certificates

I’ve been pondering authority in microformats and, as a result, have been taking a closer look at microID.

The situation I want to find a solution for is this: if there are lots of published hcards for a given person, they might differ, either because some are out of date or, more sinisterly, because some are fraudulent. If I do a search for contact details, let’s say using the Technorati hcard search engine, how do I know which one is to be trusted?

Say I publish an hcard on my site using the following markup, including a microID whose hash is formed from the included email and url details:

<code><div 
   class="vcard microid-mailto+http:sha1:a52b4f86cfa4190b53634d99198c0f56970c8d83">
   <h2 class="fn">gareth rushgrove</h2>
   <a href="http://morethanseven.net" class="url" rel="me">
      morethanseven.net
   </a>
   <a href="mailto:[email protected]" class="email">
      [email protected]
   </a>
</div></code>

If the hcard search engine reads that hcard on the morethanseven.net domain it could check the hash in the vcard against its own hash of the specified email address and url (checking that the url is the same as the one on which the vcard lives). “Ah ha”, it can say. This looks like an authoritative hcard.
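As a rough sketch of that check (not the search engine’s actual code), here is how it might look in Python. It assumes my reading of the MicroID draft is right: hash the email URI and the page URL separately with SHA-1, concatenate the two hex digests, and hash the result. The function names and the example.com address are made up for illustration, and the `microid-` prefix used in the class attribute is left off.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 hex digest of a string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def microid(email_uri: str, page_url: str) -> str:
    """Hash each URI, join the two hex digests, then hash the result."""
    digest = sha1_hex(sha1_hex(email_uri) + sha1_hex(page_url))
    return f"mailto+http:sha1:{digest}"


def looks_authoritative(claimed: str, email_uri: str, card_url: str, found_on: str) -> bool:
    """Hypothetical check a reader of the hcard could make.

    The url in the card must be the place the card was actually found,
    and the recomputed MicroID must match the one the card claims.
    """
    return card_url == found_on and microid(email_uri, card_url) == claimed


# Placeholder details, not a real address.
claimed = microid("mailto:someone@example.com", "http://example.com")
print(looks_authoritative(claimed, "mailto:someone@example.com",
                          "http://example.com", "http://example.com"))
```

Copying the same card to another domain fails the check, because the url in the card no longer matches where it was found.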

This does however raise a couple of issues. In order for the search engine or external reader to be able to create its hash you have to include the email in your hcard. I’m also not sure it does solve the issue of evil me:

<code><div 
   class="vcard microid-mailto+http:sha1:40b46e5c349453bddd0dbd022fb33488494699cb">
   <h2 class="fn">evil gareth rushgrove</h2>
   <a href="http://spammer.com" class="url" rel="me">
      spammer.com
   </a>
   <a href="mailto:[email protected]" class="email">
      [email protected]
   </a>
</div></code>

Assuming the above is published on spammer.com then I’m sunk. It’s another authoritative hcard with the same name, in this case my name. Of course therein lies another problem: when we name our children we tend to just do it on a whim, and we even copy names we like from other people! We also change or subvert these names when we don’t like them, with nicknames and the like. Domain names are unique, so they avoid this problem.

So, a possible solution does present itself: using the uniqueness of domain names combined with the fact that most people aren’t evil spammers. Let’s postulate that a search engine finds lots of hcards for a given name. Most of these hcards probably point back to the url of the entity they are describing (so most hcards on the web describing me point back to this site). We can then check whether one of the hcards is from that domain, and whether it has a valid microID. In other words we use the lazy web to identify the most likely location of an authoritative hcard.
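To make the idea concrete, here is a small Python sketch of that voting step. It is only an illustration: the card dictionaries, their keys (`found_on`, `url`, `valid_microid`) and the function name are all assumptions, and it takes for granted that something else has already parsed the hcards and checked each microID.

```python
from collections import Counter
from urllib.parse import urlparse


def pick_authoritative(cards):
    """Return the card most likely to be authoritative, or None.

    Each card is a dict with hypothetical keys:
      found_on      - the page the hcard was published on
      url           - the url the hcard itself points at
      valid_microid - True if the microID check passed
    """
    # Every card votes for the domain its url points back to.
    votes = Counter(urlparse(card["url"]).netloc for card in cards if card["url"])
    if not votes:
        return None
    likely_home, _ = votes.most_common(1)[0]

    # Only trust a card that lives on that domain and has a valid microID.
    for card in cards:
        if urlparse(card["found_on"]).netloc == likely_home and card["valid_microid"]:
            return card
    return None
```

The evil card on spammer.com still has a valid microID, but it only gets its own vote, so it loses out as long as most other hcards point at the real site.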

Authoritative hCards - Authority in microformats

I’ve been pondering lots of potential uses for microformats of late and came upon one thing I thought worth asking people about: namely the issue of authority in microformatted data.

Technorati provide a very interesting Microformats search engine that allows you to search for a person and get back their contact details via hcards that it knows about. For instance, try doing a search for me and you get seven results (ironic given the domain name); try with one of those famous people and you get lots more. This poses a problem: what if the various hcards disagree?

You can always look at the publication date but this might not be enough. I could publish an hcard with incorrect data for anyone right now on this site and it would be more recent than the other, possibly correct, information already available on the web. Note that this isn’t a problem with Microformats per se, more an ever present problem with the web. It’s just that Microformats make the matter more of a potential issue.

Is there a need for the ability to provide an authoritative hCard, or other Microformat instance? One that can always be trusted? If I publish an hcard of my details I’d like to think the latest version of that would be the most trustworthy. But how do you make a claim of ownership over a piece of data on the web (microformatted or not)? And for some of this data, do you need a standard way of deciding who has access to it and who does not?

I’m pretty interested in the issue of identity at the moment, spurred on by the likes of OpenID and MicroID. I’m wondering if there isn’t some clever crossover here? Sites like ClaimID are doing some things in this space already so we’ll have to see.

Working example of Microformats as an API

Drew posted a while ago asking Can your website be your API? The simple idea is that you just might be able to live without a dedicated API in favour of good use of microformats.

It turns out Tantek has also been on the case with his presentation Can your website be your API? - Using semantic XHTML to show what you mean, and Glenn has spoken on the subject too at WebDD and BarCamp.

This has interested me for a while and I’ve finally gotten around to writing some code. In fact I wrote the bulk of it during The Highland Fling and then the rest before the Refresh event, after I got to the venue an hour early.

I used Drew’s hKit for the microformat parsing so everything that follows is PHP5 only I’m afraid. This also means I don’t have a working example on this server at the moment but you can grab the code and try it out yourself. It also makes use of a couple of PEAR modules: PEAR Validate and PEAR HTTP_Request.

At the moment I’ve worked on a simple hCard example. It demonstrates the idea of a website exposing methods based on the presence of microformats, plus using that exposed method to extract the information we’re looking for.

First we create objects for the parser and the website:

<code>$parser = new hKit;
$website = new Website('http://morethanseven.net',$parser);</code>

Then we’ll ask what methods the website exposes:

<code>$website->expose();</code>

In this example we get back an array of the available methods, in this case getContacts. Then, just as an example, let’s print out some contact details:

<code>foreach ($website->getContacts() as $contact) {
   echo $contact->getFN();
}</code>

This is a very early release, more of a proof of concept, and as such there is no documentation and only a portion of hCard is supported. Hell, there’s not even a name beginning with h! Having said that I have a plan for a little project to kick the tyres and so I’ll be adding to it and hopefully have a proper release at some point.

The plan is to introduce methods like getReviews, getEvents, etc. which allow for the extraction of the relevant details. The part I find most interesting however is the idea of the expose method - asking a web page if it has an API, and if it does then what information you can extract automagically. If you’re interested let me know what you think.
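For anyone who doesn’t have PHP5 and hKit to hand, here is a rough Python sketch of the same expose idea. It is not a port of my code and it doesn’t use hKit: the class names, the `METHODS` table and `get_contacts` (standing in for getContacts) are all assumptions for illustration, and the parsing is far cruder than a real microformats parser. It takes an HTML string rather than fetching a URL.

```python
from html.parser import HTMLParser

# Root class name -> the method a page containing it would expose.
# Only hCard is handled here; vevent or hreview entries could be added later.
METHODS = {"vcard": "get_contacts"}


class ClassCollector(HTMLParser):
    """Records every class name on the page and the text of each fn element."""

    def __init__(self):
        super().__init__()
        self.classes = set()
        self.names = []
        self._in_fn = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.classes.update(classes)
        if "fn" in classes:
            self._in_fn = True
            self.names.append("")

    def handle_data(self, data):
        if self._in_fn:
            self.names[-1] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current fn,
        # so markup nested inside an fn element is not handled.
        self._in_fn = False


class Website:
    def __init__(self, html):
        self._page = ClassCollector()
        self._page.feed(html)

    def expose(self):
        """List the methods this page supports, based on the classes found."""
        return sorted(name for cls, name in METHODS.items() if cls in self._page.classes)

    def get_contacts(self):
        return [name.strip() for name in self._page.names]


site = Website('<div class="vcard"><h2 class="fn">gareth rushgrove</h2></div>')
if "get_contacts" in site.expose():
    for contact in site.get_contacts():
        print(contact)
```

The point is the order of operations: ask the page what it can do first, then only call the methods it says it has.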

Download example

Software I Installed on my N800

Ok, so I’ve been shopping again but I’ll try and make this post a little more useful than “oh, look at my shiny new toy”.

I decided that what I was really missing to help with my twitter addiction and growing interest in the mobile web was a shiny new Nokia N800 Internet Tablet.

Nokia N800 Internet Tablet

I can’t really write a fair review at the moment as I’ve only just got my mitts on it, so that will have to wait for later. All I can think at the moment is “Oh, shiny”. What I will do is list the software I installed on it from the start, partly to demonstrate what you can get on it, partly to link to things I found useful and mostly to remind me if and when I need to do it again.

And just for Phil, it already has GCC-3.4-base on it so there’s no need to install that.

Oh, and for the naysayers thinking “iPhone”: I don’t care. From all I’ve read the iPhone is going to be a closed system and this is a full blown Linux OS that is designed expressly to be hackable. Plus the N800 makes people swoon now, not in six months or a year when the iPhone hits the shelves.

Even More Events

I occasionally get some stick for wandering the country attending as many web related get-togethers as I can manage, but in fairness there are so many good events to get along to at the moment it would be rude not to. So here’s yet another run down of some upcoming events that people really should try and get along to:

The Highland Fling

Yes! It’s The Highland Fling in only a couple of weeks and it’s still looking as great as when I bought my tickets. A nice, small, event with a host of good speakers (and good guys) and it’s not south of me for the first time ever. Props to Alan for getting this off the ground.

Refresh Edinburgh

If that wasn’t enough, it’s worth staying around an extra day in Edinburgh for a Refresh get-together. Again there’s a good lineup of speakers and, I would imagine, much interesting discussion and debate.

BarCamp NorthEast

With all the BarCamp goings on around BarCamp London lots of other UK BarCamps have started to get organised, including one around these parts. Heather, Alistair, Meri and myself (ably supported with some soon to be revealed logo design by Elly) have plans. Watch this space.

BarCamp Sheffield

While we’re getting our act together another northern BarCamp is about to get going. BarCamp Sheffield is happening in only a few weeks’ time. I’m hoping to make it along, maybe just for the Saturday, but if anyone else is going along let me know.

HackDay

If that wasn’t already enough (and I haven’t even mentioned @media) there was something of a big announcement from Matt over at the BBC today. Matt, and Tom from Yahoo, have put together what looks like being the event of the year over the weekend of June 16th and 17th. Hackday is going to involve 400 geeks (you and me and all our friends, in other words) spending 48 hours hacking on APIs and the like, then watching some suitably bemused band.

Most of these have details on Upcoming, which I’m increasingly thinking could be incredibly useful if only a few more people used it. So after telling you to go to all these events (which would cost money) I’ll finish by telling you all to sign up for Upcoming (which costs nothing) and add me as a friend.

Semantic Web acronym links primer

One thing you hit pretty rapidly when you start having a look into all this Semantic Web malarkey is the number of rather silly acronyms and abbreviations. In fairness it’s true of pretty much every technical or academic discipline I’ve come across and you can ask the people I work with what I think about that - I won’t ramble on here.

And don’t think this is all because I’m not technical enough for ya; I have code on my blog and a faintly scary collection of technical tomes. I just think acronyms tend to breed elitism and make the world less penetrable, especially when they are often described in relation to other acronyms. So, after pointing out a problem, here’s a stab at a solution: a web designer’s guide (with the relevant links) to my understanding of the different things involved. I’m not an expert on this yet mind, so if I’m wrong and someone more knowledgeable can provide a better description then please comment. If the descriptions from the relevant links I have are good enough I’ll just use those as well.

Most people will probably have come across XML so hopefully this is an easy one. XML is the eXtensible Markup Language. For me that means it’s a set of rules for defining your own markup language; from simple data exchange formats to whole programming languages. The W3C says (which I think nicely highlights my problem from above)

Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML (ISO 8879)

XSLT is another quite common tool. The W3C again defines it in relation to another acronym but this time I don’t mind as much, and we’ve got a definition of XML here anyway.

[XSLT is] a language for transforming XML documents into other XML documents

This is a pretty straightforward description; the only real problem for the outsider is wondering why you would want to do that!

RDF, W3Schools says, is really where the Semantic Web stuff starts. The W3C start off with:

The Resource Description Framework (RDF) integrates a variety of application…

Ah. It integrates applications… What! But wait:

The RDF specifications provide a lightweight ontology system to support the exchange of knowledge on the Web

This is pretty good actually, as long as you’re happy with the word ontology. There is a good article from Tim Bray that takes a stab at the question. I think, simply put, RDF is a common set of rules for defining your own metadata (information about information) so it’s interoperable with everyone else doing the same thing.
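If it helps to see that without any acronyms, RDF statements boil down to three-part facts: a thing, a property of that thing, and a value. Here is a tiny Python sketch of the idea using plain tuples. This is just an illustration of the shape of the data, not an RDF library; the `describe` helper is made up, and the property names borrow the FOAF vocabulary.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each statement is (subject, predicate, object).
triples = [
    ("#gareth", RDF_TYPE, FOAF + "Person"),
    ("#gareth", FOAF + "name", "Gareth Rushgrove"),
    ("#gareth", FOAF + "weblog", "http://morethanseven.net"),
]


def describe(subject, statements):
    """Gather everything said about one subject into a dictionary."""
    return {predicate: obj for s, predicate, obj in statements if s == subject}


print(describe("#gareth", triples)[FOAF + "name"])
```

Because the property names are full URLs that anyone can reuse, two people describing different things with the same vocabulary end up with data that fits together.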

But how are we going to get all that RDF into our web pages (Semantic or otherwise)? RDFa is one possibility (not going there right now) which involves extending HTML (all varieties) with some additional attributes with the purpose of being able to embed RDF in the document. Or from the W3C:

RDFa is a syntax that expresses this structured data using a set of elements and attributes that embed RDF in HTML

GRDDL is probably the worst offender here in terms of someone spending too much time coming up with the acronym. It stands for Gleaning Resource Descriptions from Dialects of Languages. From looking it appears to be a standard way of saying:

Here be some RDF. If you want the RDF for this document please use this XSL transformation.

eRDF, or embedded RDF, is a simpler alternative to RDFa which involves no additional custom attributes but does not aim to be able to represent all the possible RDF constructs.

Of course there are many more, but at least that’s a start. Let me know what you think.

I’m not a Werewolf!

After the numerous games of Werewolf at Barcamp London recently a few of us got chatting about the idea of custom Werewolf cards using Flickr. Well I’ve just finished a very simple example of this and thought I’d post it up.

Definitely a Werewolf

Head over to morethanseven.net/presents/werewolf for a set of printable cards based on Flickr Machine Tags.

In the future I may enhance this with a nifty interface which lets visitors select the number of each card they want - and I spoke briefly with Stefan from Moo about whether it’s possible to link into printing them on those lovely Moo cards.

If you want anyone to appear just add the relevant werewolf machine tag!

Parsing eRDF

Now we’ve got some eRDF in our pages we need to extract it in preparation for doing something with it.

With XSL

First up we want to try and extract the eRDF in our page into an RDF document. Ian Davis has already created a nice XSL document to do just that, and I’ve implemented a nice service wrapper to extract the RDF from a given URL. Try pointing it at morethanseven.net or iandavis.com for an example of it in action. The next step here is to extend it to allow extracting a simple vcard from the RDF in a similar manner to Brian Suda’s Microformats extractor.

With Javascript

Dan Webb has recently written up his Sumo! microformats parser and boy is it really rather fancy. At the moment he’s only got profiles for hCard, hCalendar and hResume but writing profiles is relatively simple. According to one of Dan’s comments he’s working on adding support for rel and rev style microformats like rel-tag and XFN.

Although billed as a Microformats parser, in reality what Dan’s created is pretty generic. You can use it to parse out any information marked up with any semantic class names - just like our eRDF.

I’ve set up a very basic Sumo! profile for FOAF and some simple tests to extract eRDF data from a test page. This is very much a proof of concept of extracting eRDF using Javascript but would be relatively simple to extend for the full FOAF spec once Dan or someone else extends Sumo! to support rel and rev.

On a related note the combination of Sumo! and the Firebug javascript console is just perfect. Anyone who hasn’t yet downloaded the latest version should get over there quick.

Firebug

Microformats and eRDF sitting in a tree

Following on from my previous post on eRDF I’ve started to play around with it. Anyone bored enough to have read the source of this site today will have seen a couple of behind the scenes changes - specifically I’ve added a dash of FOAF.

The FOAF, or Friend of a Friend, project is:

creating a Web of machine-readable pages describing people, the links between them and the things they create and do.

I sort of see it as a bigger and more complicated older brother to XFN and hCard.

First things first: unlike Microformats, using eRDF needs a bit of setup beyond just adding classes and attributes - specifically you need to add a profile to the head element of your document and then add some namespace links into the head like so:

<code><head profile="http://purl.org/NET/erdf/profile">
  <link rel="schema.rdfs" href="http://www.w3.org/2000/01/rdf-schema#" />
  <link rel="schema.foaf" href="http://xmlns.com/foaf/0.1/" />
</head></code>

Then you’re onto the more familiar ground (if you’re used to Microformats) of adding semantic attributes. I already had the following snippet marked up in hCard; I just needed to add a few more classes (-foaf-Person, foaf-weblog and foaf-name) to existing elements.

<code><div id="iHead">
  <h2 class="vcard -foaf-Person" id="gareth">
  <a href="/" class="url org foaf-weblog">Morethanseven</a>
  <span>is where <a href="mailto:&#103;&#097;&#114;&#101;&#116;&#104;&#064;&#109;&#111;&#114;&#101;&#116;&#104;&#097;&#110;&#115;&#101;&#118;&#101;&#110;&#046;&#110;&#101;&#116;" 
    class="email fn foaf-name">Gareth Rushgrove</a> plays with the web
    </span></h2>
</div></code>

The full FOAF Specification details a huge range of other elements - everything from foaf:Project for making associations between yourself and projects you have worked on, to foaf:OnlineGamingAccount which is pretty self explanatory.

When parsed out, that gives you a FOAF document a little something like:

<code><?xml version="1.0" encoding="utf-8" standalone="no"?>
<rdf:RDF xmlns:admin="http://webns.net/mvcb/" 
  xmlns:doap="http://usefulinc.com/ns/doap#" 
  xmlns:dc="http://purl.org/dc/elements/1.1/" 
  xmlns:foaf="http://xmlns.com/foaf/0.1/" 
  xmlns:h="http://www.w3.org/1999/xhtml" 
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="">
    <admin:generatorAgent 
      rdf:resource="http://purl.org/NET/erdf/extract"/>
  </rdf:Description>
  <rdf:Description rdf:about=""/>
  <rdf:Description rdf:about="#gareth">
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
    <ns_13:weblog xmlns:ns_13="http://xmlns.com/foaf/0.1/">
      Morethanseven
    </ns_13:weblog>
    <ns_14:name xmlns:ns_14="http://xmlns.com/foaf/0.1/">
      Gareth Rushgrove
    </ns_14:name>
  </rdf:Description>
  <rdf:Description rdf:about="http://morethanseven.net"/>
  <rdf:Description rdf:about="mailto:[email protected]"/>
</rdf:RDF></code>
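To show roughly how those class names turn into statements, here is a rough Python sketch of the mapping. It is not Ian Davis’s extractor and it covers only a sliver of eRDF: a class of the form prefix-name becomes a property whose value is the element’s text, a leading dash marks a type, and the nearest preceding id is used as the subject. The `ERDFSketch` class and `extract` function are hypothetical, and the known prefixes are passed in by hand instead of being read from the schema links in the head.

```python
from html.parser import HTMLParser


class ERDFSketch(HTMLParser):
    """Collects (subject, property, value) statements from prefix-name classes."""

    def __init__(self, prefixes):
        super().__init__()
        self.prefixes = set(prefixes)
        self.triples = []
        self._subject = ""  # "" stands for the document itself
        self._stack = []    # one frame per open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            # Simplification: the subject is never reset when the element closes.
            self._subject = "#" + attrs["id"]
        props = []
        for name in (attrs.get("class") or "").split():
            prefix, _, local = name.lstrip("-").partition("-")
            if prefix not in self.prefixes or not local:
                continue  # ordinary class name such as vcard or url
            if name.startswith("-"):
                self.triples.append((self._subject, "rdf:type", f"{prefix}:{local}"))
            else:
                props.append(f"{prefix}:{local}")
        self._stack.append({"tag": tag, "subject": self._subject, "props": props, "text": ""})

    def handle_data(self, data):
        # Text belongs to every open element that carries a property.
        for frame in self._stack:
            if frame["props"]:
                frame["text"] += data

    def handle_endtag(self, tag):
        # Pop back to the matching element, closing any unclosed ones on the way.
        while self._stack:
            frame = self._stack.pop()
            for prop in frame["props"]:
                self.triples.append((frame["subject"], prop, frame["text"].strip()))
            if frame["tag"] == tag:
                break


def extract(html, prefixes=("foaf",)):
    parser = ERDFSketch(prefixes)
    parser.feed(html)
    return parser.triples
```

Fed a cut-down version of the snippet above, it produces a foaf:Person type plus foaf:weblog and foaf:name values for #gareth, which is the same information as the RDF document.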

So far I’d agree with Ben. All I’ve done is something people (including me) have been doing already with hCard. But I’ve done it in a way, using eRDF, that doesn’t in any way stop me continuing using Microformats.

Marking things up with Microformats for its own sake was only ever really of interest to markup junkies (including me again). It’s only when you can parse that information out that it gets more interesting. I’ve got a follow up post to this on just that subject involving some XSL, Javascript, PHP and some standing on the shoulders of giants. It will be left to a post after that for me to try to do something a little different.

On a related note; so far I’ve found only scattered and often overly technical and verbose documentation on eRDF and RDF in general. I’ve been adding links to del.icio.us as I find useful resources but I can see how useful Get Semantic has the potential to be if any of this is going to take off in the web standards community.

UK Web Design Meetups Map

I’ve posted a list before of events and meetups I know about around the UK, but I got playing with the Google Maps API and decided to go one stage further and create a hopefully useful place to put all that info in the shape of a quick mashup.

Have a look over on morethanseven.net/presents/meetup for the result. I’ll try and keep this up to date manually for the time being, so if you know of any other get-togethers please leave a comment and I’ll add them to the list. Thanks to the excellent, if a little technical (well, it is Google), API documentation and to Jeremy’s Adactio Austin for inspiration.

Sample map of UK Meetups

I’d like to make this even more flexible, ideally all I should have to do is point the page at the group urls and then parse out the geo-coded microformats data and then display it on the map and page, rather than duplicate this information and maintain it by hand. So if any of the owners of these sites want to geo-code up their sites and let me know that would be great. Some Upcoming integration might be on the cards too.
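As a sketch of what that parsing step might look like, here is some Python that pulls coordinates out of a page marked up with the geo microformat (an element with class geo containing latitude and longitude elements). It is an illustration only: `GeoCollector` and `extract_points` are made-up names, it reads plain text values and ignores the abbr title variant, and fetching the group urls and plotting the points on the map are left out.

```python
from html.parser import HTMLParser


class GeoCollector(HTMLParser):
    """Collects (latitude, longitude) pairs from geo microformat markup."""

    def __init__(self):
        super().__init__()
        self.points = []     # finished (latitude, longitude) pairs
        self._current = {}   # parts of the geo block being read
        self._field = None   # set while inside a latitude or longitude element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "geo" in classes:
            self._current = {}
        for field in ("latitude", "longitude"):
            if field in classes:
                self._field = field

    def handle_data(self, data):
        if not self._field:
            return
        try:
            self._current[self._field] = float(data.strip())
        except ValueError:
            pass  # not a plain number, so skip this value
        self._field = None
        if len(self._current) == 2:
            self.points.append((self._current["latitude"], self._current["longitude"]))
            self._current = {}


def extract_points(html):
    collector = GeoCollector()
    collector.feed(html)
    return collector.points


page = ('<div class="geo"><span class="latitude">54.97</span>, '
        '<span class="longitude">-1.61</span></div>')
print(extract_points(page))
```

With something like this in place the list of meetups would only need to be a list of urls, and each group would own its own location data.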

Hopefully it should be of some use to anyone looking for things going on in their area, or if you find yourself in a strange part of the country for whatever reasons then you’ll know how to find the local geek community without risking life and limb with an odd T-Shirt.