Singpolyma

Technical Blog

µsearch: Search ALL the Microblogs!


It’s really great that multiple microblogging platforms are slowly cropping up. One of the problems with federating all our data across the web, however, is that there’s no real way to track what’s going on across all of it. There is no way to search. If all you care about is Twitter, you can use search.twitter.com, and identi.ca and others similarly have their own single-site search, but where is the central search option?

Well, it turns out that it’s really not that hard to build one. Using the powerful libre Xapian full-text indexing library and PubSubHubbub (PSHB), a microblog search engine doesn’t even need a crawler: hubs push new posts to the search engine as they are published.
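To illustrate the push model, here is a minimal Python sketch of turning an Atom notification, as a hub would POST it to a subscriber, into plain records ready for indexing. The function name `entries_from_notification` and the mapping of Atom elements onto field names are assumptions made for this example, not the code actually running the site, and the Xapian indexing step itself is left out.

```python
import xml.etree.ElementTree as ET

# Atom namespace in ElementTree's "{uri}tag" notation.
ATOM = "{http://www.w3.org/2005/Atom}"

def entries_from_notification(payload):
    """Turn an Atom document pushed by a hub into a list of plain dicts."""
    feed = ET.fromstring(payload)
    docs = []
    for entry in feed.iter(ATOM + "entry"):
        # The permalink is conventionally the rel="alternate" link.
        link = entry.find(ATOM + "link[@rel='alternate']")
        docs.append({
            "content": entry.findtext(ATOM + "content", default=""),
            "author": entry.findtext(ATOM + "author/" + ATOM + "name", default=""),
            "bookmark": link.get("href") if link is not None else "",
            "category": [c.get("term") for c in entry.findall(ATOM + "category")],
        })
    return docs

# Hypothetical payload, made up for the example.
sample = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <author><name>alice</name></author>
    <content>Trying out #xapian</content>
    <category term="xapian"/>
    <link rel="alternate" href="https://example.org/notice/1"/>
  </entry>
</feed>"""

docs = entries_from_notification(sample)
```

Each resulting dict could then be handed to whatever indexer is in use. Nothing here ever fetches a page, which is the whole point of not needing a crawler.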

I’ve launched an early version of this over at µsearch.singpolyma.net, or the easier-to-type musearch.singpolyma.net. To seed content, a script polls identi.ca and rstat.us periodically until they implement a PSHB firehose, but any microblog site can get itself added by just submitting its PSHB-enabled feeds in the box on the homepage.
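For the curious, adding a feed boils down to the search engine sending a subscription request to that feed's hub. Below is a rough Python sketch that builds, but does not send, such a request. The `hub.mode`, `hub.topic`, `hub.callback`, and `hub.verify` parameters come from the PubSubHubbub 0.3 draft; the function name and all URLs are placeholders invented for the example, and how the site really does this is not shown here.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_subscribe_request(hub_url, topic_url, callback_url):
    """Build (without sending) a PubSubHubbub subscription request."""
    body = urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the feed we want pushed to us
        "hub.callback": callback_url,  # where the hub should POST new entries
        "hub.verify": "sync",          # required by the 0.3 draft
    }).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

# Placeholder URLs, not real endpoints.
req = build_subscribe_request(
    "https://hub.example/",
    "https://blog.example/feed.atom",
    "https://search.example/callback",
)
```

Passing `req` to `urllib.request.urlopen` would actually send it. The hub then verifies the callback before it starts delivering updates.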

I make no promises at this point about the quality or longevity of the data in the search engine. Please report any issues or features you would like to see.

Of course the source is available under the ISC license. Feel free to submit patches!

Query Syntax

Just typing words will search all the metadata and content of posts for those words. Putting words in quotes will search for them as a phrase. You can also prefix a word or phrase with a field name to search only that one bit of metadata. The currently indexed fields are:

  • content
  • category (like hashtags)
  • in_reply_to
  • bookmark (permalink)
  • author
  • to (mentioned users)
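To make the syntax concrete, here is a small Python sketch that splits a query into field, text, and phrase parts. It assumes the colon-separated `field:term` form that Xapian's query parser conventionally uses; the function `parse_query` is a hypothetical illustration and not the parser the site runs.

```python
import re

# The fields listed above; any other prefix is treated as plain text.
FIELDS = {"content", "category", "in_reply_to", "bookmark", "author", "to"}

# An optional "field:" prefix followed by a quoted phrase or a bare word.
TOKEN = re.compile(r'(?:(\w+):)?(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Split a query into (field, text, is_phrase) tuples.

    field is None for terms that search everything.
    """
    parts = []
    for match in TOKEN.finditer(query):
        field, phrase, word = match.groups()
        if field is not None and field not in FIELDS:
            # Unknown prefix (e.g. the "http" of a bare URL): keep the token as text.
            parts.append((None, match.group(0), False))
        elif phrase is not None:
            parts.append((field, phrase, True))
        else:
            parts.append((field, word, False))
    return parts

parts = parse_query('xapian author:singpolyma category:"free software" "full text"')
```

Here `xapian` and the quoted `"full text"` search everything, while the other two terms are limited to the author and category fields.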

4 Responses

Stephen Paul Weber

@Julien Tumblr/Posterous are sort of on the line. Do they have firehoses? WordPress/Typepad are primarily *not* microblogs, and thus a bit out of scope, I feel.
