Tantek Çelik has purchased socialsearchme.com for this service! Thanks!
Never tweet about something you don’t want to go public. I’ve been annoying my followers for some time now about my new social search engine. Tantek then linked to it from his WordCamp San Francisco presentation. Not that I’m upset at all. I’m ecstatic that he thought it was worth linking to! Still, a word to the cautious 😉
So how does this search engine work? What does it do? Basically, it’s an hCard search engine. Unlike the Yahoo or Technorati Kitchen implementations, however, this search is focused on social networking and profiles. If DiSo were Facebook, this could be the friend search functionality. So instead of having the results be links to pages that contain matching hCards, the results are profiles with social networking data (including contacts) and names, etc.
One other key difference from pure hCard search is that I only spider representative hCards (with some small hacks for well-known sites like Twitter). This means I don’t spider arbitrary hCard data; instead, I index only profile pages. I use both XFN parsing and the SGAPI to verify claims that two pages represent the same person, and then associate them. Data from both pages goes into the index as if it were all on one page. Only one page needs an hCard, since connections are made through rel=me and XFN. This way, although my profile is on my main page and my contacts are at singpolyma.net/contacts, the search engine indexes them both.
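As an illustration only (this is not the crawler's real code, which is Ruby), here is a minimal Python sketch of grouping pages by reciprocated rel=me claims. It assumes the rel=me links have already been extracted into a dictionary; the function name `verified_identities` and the `example` URLs are hypothetical, and the sketch ignores the SGAPI lookups entirely.

```python
from collections import defaultdict


def verified_identities(rel_me):
    """Group URLs whose rel=me claims are reciprocated.

    rel_me maps each page URL to the set of URLs it links to with rel=me.
    Two pages are merged only when each one claims the other.
    """
    parent = {url: url for url in rel_me}

    def find(url):
        # Union-find lookup with path halving.
        while parent[url] != url:
            parent[url] = parent[parent[url]]
            url = parent[url]
        return url

    for page, targets in rel_me.items():
        for target in targets:
            # A one-way claim is not enough: the target must link back.
            if page in rel_me.get(target, set()):
                parent[find(page)] = find(target)

    groups = defaultdict(set)
    for url in rel_me:
        groups[find(url)].add(url)
    return list(groups.values())


# Hypothetical input: the last page claims a profile that never claims it back.
rel_me = {
    "http://singpolyma.net/": {
        "http://singpolyma.net/contacts",
        "http://profile.example/stephen",
    },
    "http://singpolyma.net/contacts": {"http://singpolyma.net/"},
    "http://profile.example/stephen": {"http://singpolyma.net/"},
    "http://impostor.example/": {"http://singpolyma.net/"},
}

for group in verified_identities(rel_me):
    print(sorted(group))
```

Everything in one group would then be indexed as if it were a single page, while the unreciprocated claim stays on its own.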
To find new pages to index, I spider along XFN (and FOAF, since I also ask the SGAPI) to find pages likely to have the sort of data I’m looking for. Interestingly enough, this means that social networks like Twitter, Pownce, and Digg, which support hCard and XFN, get almost completely indexed. There are over 100,000 profiles in the index now, and I have only given it one manually: singpolyma.net.
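To make the idea of growing an index from a single seed concrete, here is a rough Python sketch of a breadth-first crawl along contact links. It is an assumption about the general shape of such a crawler, not the actual implementation: `crawl`, `fetch_links`, and the in-memory `contacts` graph are hypothetical stand-ins for real XFN/FOAF extraction over the network.

```python
from collections import deque


def crawl(seed, fetch_links, limit=1000):
    """Visit pages breadth-first, following contact links from one seed URL.

    fetch_links(url) is assumed to return the XFN/FOAF links found on url.
    Returns the URLs in the order they were visited, at most `limit` of them.
    """
    seen = {seed}
    queue = deque([seed])
    visited = []
    while queue and len(visited) < limit:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical contact graph standing in for real pages.
contacts = {
    "http://singpolyma.net/": ["http://a.example/", "http://b.example/"],
    "http://a.example/": ["http://singpolyma.net/", "http://c.example/"],
    "http://b.example/": ["http://c.example/"],
}

print(crawl("http://singpolyma.net/", lambda url: contacts.get(url, [])))
```

The `seen` set is what keeps densely linked networks from being queued over and over as the crawl spreads outward.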
I’m not entirely sure how the data will be useful yet, but I’m really excited about the possibilities. I firmly believe in making XFN lists, static though they may be, come alive with potential through layers of functionality, be it through plugins, 3rd party services, or bookmarklets.
Speaking of bookmarklets, I have one. Go to that page, add the bookmarklet, and visit my contacts page (or any other page with lots of XFN data). Click it and watch that boring list of links and names turn into a more functional social-networking list.
The code has been released under an MIT-style license on my repository. Front-end is PHP, back-end is Ruby.
DiSo : on our way to fixing your addressbook 😉
18 Responses
Chris Messina •
Sadness!
http://www.flickr.com/photos/factoryjoe/2778874073/
Still, great to see progress on this. I think I need to improve my blog… 😉
Stephen Paul Weber •
@Chris : http://www.flickr.com/photos/factoryjoe/2778874073/comment72157606840113491/
Tantek Çelik •
Great work Stephen!
I’ve linked to your search prototype from both my WordCamp 2008 presentation and my (mere hours ago completed) An Event Apart San Francisco presentation, and successfully demonstrated it during both sessions as well.
Also, I’ve added a link to your Diso Search Engine on the microformats “search engines” page on the wiki. Feel free to add more details!
http://microformats.org/wiki/search-engines
Well done.
Tantek
justin •
i’ve updated my twitter profile to better link my online identities… how long until it reindexes? i want to see if it worked 🙂
Stephen Paul Weber •
@justin your profile has been updated: http://scrape.singpolyma.net/profile/person.php?id=179395
Many of your profiles are still not showing up because the crawler cannot verify them. You need mutual rel=me (from your main profile out, as you have on bobthecow.info, but also back the other way) so the crawler knows you’re not just claiming someone else’s profile.
justin •
What about websites and profiles that don’t let me specify a rel=me link? Is there any other way to create a relationship? Maybe a critical mass of rel=me links pointing to it, or that combined with an un-rel’d link back to a confirmed site?
Stephen Paul Weber •
@justin most major social websites support rel=me, and the very few that don’t do support FOAF (which I get through SGAPI). The notable exceptions are Facebook and Myspace. For Facebook, there is nothing that can be done, since the public profile cannot be modified in any way. Perhaps in the future a Facebook app could be combined with OpenID login to let a user claim both profiles and thus associate them. Myspace, I believe, lets you insert arbitrary HTML in some fields, which could be a useful workaround. Is there a specific other service you’re having trouble with?
justin •
Facebook and MySpace were actually the two that came to mind 🙂
justin •
Can I say that I appreciate Profilactic a lot more right now? It makes a great place to add all my social sites, and it provides rel=”me” links to all of ’em…
A couple more sites came up in the process. Brightkite, Delicious and Get Satisfaction do not support rel=”me” or friend links. Brightkite says it’s in the works, and I’ve filed a Get Satisfaction ticket with all three, so hopefully they’ll join the club soon.
One more question: is a rel=me cycle sufficient? If A claims B, B claims C, and C claims A is that good enough?
Stephen Paul Weber •
@justin consider adding sites that support rel-me to the wiki and also adding sites that do not.
Yes, a cycle *should* be sufficient, although it may take more time for the crawler to verify such a cycle.
Stephen Paul Weber •
Oh, also, I consider class=”url” in the representative hCard of a page similar to rel-me, so this may be relevant.
justin •
Ahh. That must be why BrightKite profiles are working…
I’ve posted issues at Get Satisfaction for a couple more sites, asking that they add XFN support.
justin •
Interesting. Pownce’s vcard appears to have trumped the .fn on all my other pages, so my profile no longer shows up in a search for my full name.
In this case, Pownce labels my display name as h1.fn.nickname, which is somewhat accurate. But every other profile, page, or website indexed that has an .fn uses my full name. Thanks, Pownce 🙂
Stephen Paul Weber •
@justin actually, the crawler clobbers your fn every time it finds a site that claims to know it… I need to make that logic smarter…
justin •
Good to know. I’m a bit irritated with Pownce, since they actually know my full name, but still put in an abbreviation and called it my fn. grrr.
Anonymous •
It doesn’t find me. Odin Hørthe Omdal, or Velmont. on http://identi.ca/ and http://identoo.com/ — Don’t you search those? Should really be first priority. 🙂
Stephen Paul Weber •
I have added your profile URLs to the queue 🙂