I haven’t written about the XOXO microformat in some time, but some recent discussions caused me to dig into my archives to source a new post. Microformats tend to follow the rule of only formalizing the most common of existing publishing patterns (the 80-20), meaning that some more “edge case” data cannot be represented. Does this mean that this data is useless? Not at all, but it is outside the realm of microformats, at least for now. So we either need to invent something new, or extend what we have.
A Page from Recent History
This is not a new problem. Every formalised standard is going to face those who feel that their bit of metadata should be included. Take, as an example, the RSS 2.0 spec. Core essentials of news feeds are present: title, description, date, etc. Lots of metadata is missing though: author name, comment counts, comment feed URLs, and more. People solved this problem in two very different ways: some extended, and some invented something new.
Extending RSS (or any XML format) is easy: create a namespace, add your elements, publish. If a particular piece of metadata is popular it gets standardised in a spec’ed extension (dc:creator, slash:comments, wfw:commentRss, etc). The benefit of this approach is that all existing parsers can still read your content. If a parser doesn’t need your extra metadata, it can safely ignore it and present just the core content. No new code needs to be written, and no new formats need to be learned for 80% of the applications.
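To make that concrete, here is a minimal sketch in Python. The feed content, URLs, and function names are made up for illustration; only the Dublin Core namespace and dc:creator come from the real extension. The first function knows nothing beyond core RSS 2.0 and still reads the extended feed without complaint, while the second opts in to the extra element.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: core RSS 2.0 plus one namespaced extension element.
FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>A made-up feed for illustration</description>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>Hello</description>
      <dc:creator>Martha</dc:creator>
    </item>
  </channel>
</rss>"""

# ElementTree spells namespaced tags as {namespace-uri}localname.
DC = "{http://purl.org/dc/elements/1.1/}"


def core_items(xml_text):
    """Read only core RSS 2.0 fields; namespaced elements are simply never looked at."""
    root = ET.fromstring(xml_text)
    return [
        {"title": item.findtext("title"), "link": item.findtext("link")}
        for item in root.iter("item")
    ]


def items_with_author(xml_text):
    """Same as core_items, but also pick up dc:creator when it is present."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "author": item.findtext(DC + "creator"),  # None if the element is absent
        }
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    print(core_items(FEED))
    print(items_with_author(FEED))
```

The point is that core_items needed no changes at all to cope with the extended feed.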
There was another group interested in solving this problem: the ATOM group. They threw away all the existing formats (RSS 2.0 and RSS 1.0/RDF) and built something brand new from scratch to accommodate their data needs. What was the result? Feed aggregators everywhere had to write all-new code to handle this new, incompatible, and often more complicated case. Time and effort were wasted both in code and user education (unlearn “What is RSS”, learn “What is ATOM” / “What are feeds”). Once the standard hit a spec’ed form, what happened? People began to use namespaces in ATOM as well, because for all the “better” it was, for some edge cases it just wasn’t “better” enough.
Back to XOXO
It seems the key is to be easily extensible, not to think of everything up front. If microformats are going to make their way into lots of APIs and not just be used for better page scraping (Ma.gnolia does a good job of this), then extensibility is necessary. Fortunately, XOXO provides an easy solution. Check this out:
<ul>
<li class="vcard">
<dl>
<dt>fn</dt>
<dd class="fn">Martha</dd>
<dt>Anniversary</dt>
<dd>2005-02-04</dd>
</dl>
</li>
</ul>
An hCard parser can read that. For a normal use case, no new code is needed. An XOXO parser can read that too, and if it knows about hCard it will likely know what “fn” means. The other data is there, though, and the parser has it. Minimal new code, and all the data can be used. Cool or what?
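Here is a rough sketch of both readings of that markup, in Python with only the standard library. It is not a real XOXO or hCard parser: the class name, the read_card helper, and the small set of hCard properties it recognises are my own assumptions for illustration, and it only handles flat dt/dd pairs like the ones above.

```python
from html.parser import HTMLParser

MARKUP = """<ul>
<li class="vcard">
<dl>
<dt>fn</dt>
<dd class="fn">Martha</dd>
<dt>Anniversary</dt>
<dd>2005-02-04</dd>
</dl>
</li>
</ul>"""

# A small, illustrative subset of hCard property names.
KNOWN_HCARD = {"fn", "org", "tel", "email", "url"}


class DefinitionListReader(HTMLParser):
    """Collects dt/dd pairs (XOXO-style view) and class-marked values (hCard-style view)."""

    def __init__(self):
        super().__init__()
        self.pairs = {}   # every dt -> dd pair, whatever it is called
        self.hcard = {}   # only values whose dd carries a known hCard class
        self._tag = None  # the dt or dd currently being read, if any
        self._classes = []
        self._key = None  # text of the most recent dt
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag
            self._classes = (dict(attrs).get("class") or "").split()
            self._text = []

    def handle_data(self, data):
        if self._tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        text = "".join(self._text).strip()
        if tag == "dt":
            self._key = text
        elif self._key is not None:
            self.pairs[self._key] = text
            for name in self._classes:
                if name in KNOWN_HCARD:
                    self.hcard[name] = text
            self._key = None
        self._tag = None


def read_card(markup):
    """Return (all dt/dd pairs, hCard-only properties) for the given markup."""
    reader = DefinitionListReader()
    reader.feed(markup)
    return reader.pairs, reader.hcard


if __name__ == "__main__":
    pairs, hcard = read_card(MARKUP)
    print(pairs)
    print(hcard)
```

The hCard view only sees fn, while the dt/dd view also carries the Anniversary value along, which is the whole idea.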
I need a feed reader that recognises different reading preferences for different feeds.
Let me back up.
There are two kinds of feeds: feeds of content (blogs) and feeds of notifications (calendars, del.icio.us popular, digg, forums). The first kind I want to read all of; even if I miss checking my feeds for a while, I want to see what the blogs I read said. The second kind only matters while it is current. I won’t care tomorrow what was on del.icio.us popular today, there’ll be 100 new items!
Google Reader does a great job on the first kind of feed, holding content until I read it. Firefox’s LiveBookmarks does a good job on the second kind, showing only current content. What we need is a nice interface for both.
I’ve heard the new Bloglines might… perhaps I’ll check it out.
So if you read any blog besides mine (and surely you do) you’ve by now heard of Yahoo’s Pipes application. Mashups without programming, and a team that’s promising more and better things to come.
One of the immediate uses to the Blogger community occurred to virtually all hackers at once: sorting the feeds. This has never been a problem for me (I screen-scrape my feed via hAtom), but for others the fact that Blogger feeds sort by when they were updated is annoying.
Aditya suggested creating individual pipes, but I wrote a sorting pipe, as did Ramani (who beat me to blogging about it and has a nice how-to written). Ramani discovered an issue that makes this solution a bit buggy for now. It has to do with ATOM being stupid and RSS 2.0 being cool (yes, that’s a partisan statement and not entirely true). Basically the publishing date is not being copied from the ATOM format to the RSS format correctly. Vote on Ramani’s suggestion to get this fixed. I also discovered a less critical issue with the UI that may confuse some less geeky users. Please vote on my suggestion to get that fixed.
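For anyone who would rather see the idea outside of Pipes, this is roughly what the sorting step amounts to. It is a sketch in Python, not what the pipe actually runs: the sample feed and the sorted_titles name are invented, and it assumes each item already carries a correct pubDate, which is exactly the part that is buggy at the moment.

```python
from email.utils import parsedate_to_datetime
import xml.etree.ElementTree as ET

# Made-up feed, listed in "last updated" order rather than publication order.
RSS = """<rss version="2.0"><channel><title>Example blog</title>
<item><title>Older post, recently edited</title><pubDate>Mon, 05 Feb 2007 10:00:00 GMT</pubDate></item>
<item><title>Newest post</title><pubDate>Fri, 09 Feb 2007 08:30:00 GMT</pubDate></item>
<item><title>Middle post</title><pubDate>Wed, 07 Feb 2007 12:00:00 GMT</pubDate></item>
</channel></rss>"""


def sorted_titles(rss_text):
    """Return item titles ordered by pubDate, newest first."""
    root = ET.fromstring(rss_text)
    items = list(root.iter("item"))
    # RSS 2.0 dates use the RFC 822 style, which email.utils can parse.
    items.sort(
        key=lambda item: parsedate_to_datetime(item.findtext("pubDate")),
        reverse=True,
    )
    return [item.findtext("title") for item in items]


if __name__ == "__main__":
    for title in sorted_titles(RSS):
        print(title)
```

If the publishing date gets mangled on the way from ATOM to RSS, a sort like this will faithfully sort on the wrong value, hence the bug report.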
I also wrote a pipe for mixing together Google Calendars (for those of us who track events from more than one) into a nice, sorted feed of upcoming events. The email alerts system provided by Y! is dumb though, at least for this application. I want the next 5 events emailed to me every day… likely gonna have to write my own emailer for that…
Google Calendar’s feeds are not the most useful in the world. They are sorted by edit date instead of when they’re happening, and they’re only available in ATOM.
No longer. Using the extended feed information Google tacks into their calendar feed it is possible to create a clean, sorted feed with the start date as the pubDate. Just go to the Google Calendar Feed Cleaner and enter the URL to your calendar’s feed, select the format you want (XHTML for testing, RSS 2.0, or JSON(P) ), and enter the max items to be in the results (default 5). You will get a nice, clean feed of your Google Calendar data.
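The core of the idea can be sketched in a few lines of Python. This is not the Feed Cleaner's actual code. It assumes the extended information is a gd:when element with a startTime attribute in the http://schemas.google.com/g/2005 namespace, and the sample feed, the upcoming function, and its max_items parameter are all made up for illustration.

```python
from datetime import datetime
from email.utils import format_datetime
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
GD = "{http://schemas.google.com/g/2005}"  # assumed namespace for gd:when

# Made-up calendar feed, ordered by edit date as described above.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom" xmlns:gd="http://schemas.google.com/g/2005">
<entry><title>Dentist</title><updated>2007-02-10T09:00:00Z</updated>
<gd:when startTime="2007-03-02T14:00:00-05:00"/></entry>
<entry><title>Lunch</title><updated>2007-02-01T09:00:00Z</updated>
<gd:when startTime="2007-02-20T12:00:00-05:00"/></entry>
</feed>"""


def upcoming(feed_text, max_items=5):
    """Return events sorted by start time, with the start time as an RSS-style pubDate."""
    root = ET.fromstring(feed_text)
    events = []
    for entry in root.iter(ATOM + "entry"):
        when = entry.find(GD + "when")
        if when is None or when.get("startTime") is None:
            continue  # skip entries without a usable start time
        start = datetime.fromisoformat(when.get("startTime"))
        events.append((start, entry.findtext(ATOM + "title")))
    events.sort(key=lambda event: event[0])
    return [
        {"title": title, "pubDate": format_datetime(start)}
        for start, title in events[:max_items]
    ]


if __name__ == "__main__":
    for event in upcoming(FEED):
        print(event["pubDate"], event["title"])
```

Filtering out events that have already happened is left out here to keep the sketch short.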
For the longest time Blogger has been ATOM-only. This now seems to be a thing of the past. While I can find no official announcement anywhere, RSS 2.0 feeds seem to now be available on all Blogger blogs (unless the blog has not had its index published since the change was made). As you would expect, while the old feed was, say:
http://singpolyma-tech.blogspot.com/atom.xml
They have added to this:
http://singpolyma-tech.blogspot.com/rss.xml
It’s not the nicest implementation of RSS 2.0 I’ve seen, and I’m going to keep using my hAtom2RSS-through-feedburner feed for the comment options, but this is definitely a step forward for Blogger.