Singpolyma

Technical Blog

DiSo Gets Search

Tantek Çelik has purchased socialsearchme.com for this service! Thanks!

Never tweet about something you don’t want to go public.  I’ve been annoying my followers for some time now about my new social search engine.  Tantek then linked to it from his WordCamp San Francisco presentation.  Not that I’m upset at all.  I’m ecstatic that he thought it was worth linking to!  Still, a word to the cautious 😉

So how does this search engine work? What does it do? Basically, it’s an hCard search engine.  Unlike the Yahoo or Technorati Kitchen implementations, however, this search is focused on social networking and profiles.  If DiSo were Facebook, this could be the friend search functionality.  So instead of returning links to pages that contain matching hCards, it returns profiles: names, social networking data (including contacts), etc.
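To make the difference between page results and profile results concrete, here is a minimal sketch in Python.  It is not the engine's actual code: the `Profile` shape, its field names, and the substring matching in `search` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """One person, possibly assembled from several pages (hypothetical shape)."""
    names: set = field(default_factory=set)     # fn / nickname values seen for this person
    urls: set = field(default_factory=set)      # pages known to represent this person
    contacts: set = field(default_factory=set)  # URLs of the person's XFN contacts


def search(profiles, query):
    """Return the profiles where any known name contains the query (case-insensitive)."""
    q = query.casefold()
    return [p for p in profiles if any(q in name.casefold() for name in p.names)]
```

A query returns whole `Profile` objects, so a caller gets the person's names, pages, and contacts in one result rather than a list of pages that happen to contain an hCard.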

One other key difference from pure hCard search is that I am only spidering representative hCards (with some small hacks for well-known sites like Twitter).  This means I don’t spider arbitrary hCard data; I only index profile pages.  I use both XFN parsing and the SGAPI to verify claims that two pages represent the same person, and then associate them.  Data from both pages goes into the index as if it were all on one page.  Only one page needs an hCard, since connections are made through rel=me and XFN.  This way, although my profile is on my main page and my contacts are at singpolyma.net/contacts, the search engine indexes them both.
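The rel=me check described above can be sketched roughly as follows, in Python.  This is an illustration only, not the engine's code: it assumes a claim counts as verified when each page links to the other with rel="me" (the post does not spell out the exact rule), it compares URLs as exact strings, and it leaves out the SGAPI lookup entirely.  The names `RelMeParser`, `rel_me_links`, and `same_person` are made up for the example.

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collect href values of <a> and <link> elements whose rel includes "me"."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        # rel is a space-separated token list, e.g. rel="me nofollow"
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "me" in rel_tokens and href:
            self.links.add(href)


def rel_me_links(html):
    """Return the set of URLs a page claims with rel="me"."""
    parser = RelMeParser()
    parser.feed(html)
    return parser.links


def same_person(url_a, html_a, url_b, html_b):
    """Assumed rule: verified only when both pages claim each other."""
    return url_b in rel_me_links(html_a) and url_a in rel_me_links(html_b)
```

Once two pages pass a check like this, their data can be stored under a single profile, which is why only one of them needs to carry the hCard.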

To find new pages to index, I spider along XFN (and FOAF, since I also ask the SGAPI) to find pages likely to have the sort of data I’m looking for.  Interestingly enough, this means that social networks like Twitter, Pownce, and Digg, which support hCard and XFN, get almost completely indexed.  There are over 100,000 profiles in the index now, and I have only given it one manually: singpolyma.net.
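Spidering outward from a single seed page amounts to a graph traversal.  Here is a minimal Python sketch, assuming a breadth-first order and a caller-supplied `fetch_links` function that returns the URLs a page links to via XFN or FOAF; both the function name and the `limit` parameter are hypothetical, and the real spider may order and throttle its work quite differently.

```python
from collections import deque


def crawl(seed, fetch_links, limit=1000):
    """Breadth-first walk from seed, visiting each URL at most once.

    fetch_links(url) is assumed to return an iterable of URLs that the
    page at url links to with XFN (or FOAF) relationships.
    """
    seen = {seed}
    queue = deque([seed])
    visited = []
    while queue and len(visited) < limit:
        url = queue.popleft()
        visited.append(url)
        for linked in fetch_links(url):
            if linked not in seen:
                seen.add(linked)
                queue.append(linked)
    return visited
```

Because every discovered page contributes its own contacts to the queue, one seed is enough to reach any network whose profiles link to each other, which matches how the index grew from a single manually added page.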

I’m not entirely sure how the data will be useful yet, but I’m really excited about the possibilities.  I firmly believe in making XFN lists, static though they may be, come alive with potential through layers of functionality, be it through plugins, 3rd party services, or bookmarklets.

Speaking of bookmarklets, I have one.  Go to that page, add the bookmarklet, and visit my contacts page (or any other page with lots of XFN data).  Click it and watch that boring list of links and names turn into a more functional social-networking list.

The code has been released under an MIT-style license on my repository.  Front-end is PHP, back-end is Ruby.

DiSo : on our way to fixing your addressbook 😉

18 Responses

justin

i’ve updated my twitter profile to better link my online identities… how long until it reindexes? i want to see if it worked 🙂

justin

What about websites and profiles that don’t let me specify a rel=me link? Is there any other way to create a relationship? Maybe a critical mass of rel=me links pointing to it, or that combined with an un-rel’d link back to a confirmed site?

Stephen Paul Weber

@justin most major social websites support rel=me, and the very few that don’t do support FOAF (which I get through SGAPI). The notable exceptions are Facebook and Myspace. For Facebook, there is nothing that can be done, since the public profile cannot be modified in any way. Perhaps in the future a Facebook app could be combined with OpenID login to let a user claim both profiles and thus associate them. Myspace, I believe, lets you insert arbitrary HTML in some fields, which could be a useful workaround. Is there a specific other service you’re having trouble with?

justin

Can I say that I appreciate Profilactic a lot more right now? It makes a great place to add all my social sites, and it provides rel=”me” links to all of ’em…

A couple more sites came up in the process. Brightkite, Delicious and Get Satisfaction do not support rel=”me” or friend links. Brightkite says it’s in the works, and I’ve filed a Get Satisfaction ticket with all three, so hopefully they’ll join the club soon.

One more question: is a rel=me cycle sufficient? If A claims B, B claims C, and C claims A is that good enough?

justin

Ahh. That must be why BrightKite profiles are working…

I’ve posted issues at Get Satisfaction for a couple more sites, asking that they add XFN support.

justin

Interesting. Pownce’s vcard appears to have trumped the .fn on all my other pages, so my profile no longer shows up in a search for my full name.

In this case, Pownce labels my display name as h1.fn.nickname, which is somewhat accurate. But every other profile, page, or website indexed that has an .fn uses my full name. Thanks, Pownce 🙂

justin

Good to know. I’m a bit irritated with Pownce, since they actually know my full name, but still put in an abbreviation and called it my fn. grrr.