Showing posts with label Wikipedia. Show all posts

Tuesday, March 25, 2008

Wikipedia on iSpecies


I've added snippets from Wikipedia to iSpecies results, in part inspired by FreeBase. This makes use of Wikipedia's XML export format. For example, the URL http://en.wikipedia.org/wiki/Special:Export/Luzon_Montane_Forest_Mouse returns XML, with the wiki markup enclosed in the tags <text xml:space="preserve"></text>. I use some simple regular expressions to strip out some of the markup, including the taxobox, then grab the first 100 words of the article to display on the iSpecies page (together with a link to the original article).
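As a rough illustration of the steps just described (this is not the iSpecies code, only a minimal Python sketch), the snippet extraction might look something like the following. The function names extract_wikitext, strip_templates and snippet are made up for this example, and the template stripper uses a small loop rather than a regular expression so that nested {{...}} templates inside the taxobox are handled.

```python
import re
import xml.etree.ElementTree as ET


def extract_wikitext(export_xml: str) -> str:
    """Return the contents of the first <text> element in Special:Export XML."""
    root = ET.fromstring(export_xml)
    for elem in root.iter():
        # The export XML is namespaced, so compare the local tag name only.
        if elem.tag.rsplit("}", 1)[-1] == "text":
            return elem.text or ""
    return ""


def strip_templates(wikitext: str) -> str:
    """Drop {{...}} templates (the taxobox included), allowing for nesting."""
    out = []
    depth = 0
    i = 0
    while i < len(wikitext):
        if wikitext.startswith("{{", i):
            depth += 1
            i += 2
        elif depth > 0 and wikitext.startswith("}}", i):
            depth -= 1
            i += 2
        else:
            if depth == 0:
                out.append(wikitext[i])
            i += 1
    return "".join(out)


def snippet(wikitext: str, max_words: int = 100) -> str:
    """Strip some common markup and keep the first max_words words."""
    text = strip_templates(wikitext)
    # Remove self-closing and paired <ref> footnotes.
    text = re.sub(r"<ref[^>]*/>", "", text)
    text = re.sub(r"<ref[^>]*>.*?</ref>", "", text, flags=re.DOTALL)
    # [[target|label]] becomes label, [[target]] becomes target.
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)
    # Remove the runs of apostrophes used for bold and italics.
    text = re.sub(r"'{2,}", "", text)
    return " ".join(text.split()[:max_words])
```

Calling snippet(extract_wikitext(xml)) on an export document then gives plain text ready to display. This only covers the most common markup; tables, images and HTML comments would need further rules.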

Because a species may have multiple names, we need to handle redirection. For example, the URL http://en.wikipedia.org/wiki/Special:Export/Apomys_datae returns
<text xml:space="preserve">#Redirect [[Luzon Montane Forest Mouse]]</text>

which tells us that the content is to be found at http://en.wikipedia.org/wiki/Special:Export/Luzon_Montane_Forest_Mouse.
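Following a redirect like this could be sketched as below. Again this is only an illustration in Python, not what iSpecies actually runs: redirect_target and resolve are hypothetical names, and fetch_wikitext stands in for whatever function downloads a title from Special:Export and returns the contents of its <text> element. The max_hops limit is my own addition, to guard against redirect loops.

```python
import re
from typing import Callable, Optional, Tuple

# Matches "#Redirect [[Target]]" at the start of the wikitext, ignoring case.
REDIRECT_RE = re.compile(r"^\s*#redirect\s*:?\s*\[\[([^\]|#]+)", re.IGNORECASE)


def redirect_target(wikitext: str) -> Optional[str]:
    """Return the redirect target as a URL-style title, or None if not a redirect."""
    match = REDIRECT_RE.match(wikitext)
    if match is None:
        return None
    # Wikipedia URLs use underscores where page titles have spaces.
    return match.group(1).strip().replace(" ", "_")


def resolve(
    title: str,
    fetch_wikitext: Callable[[str], str],
    max_hops: int = 5,
) -> Tuple[str, str]:
    """Follow redirects from title; return (final title, its wikitext)."""
    for _ in range(max_hops):
        wikitext = fetch_wikitext(title)
        target = redirect_target(wikitext)
        if target is None:
            return title, wikitext
        title = target
    raise RuntimeError("too many redirects starting from the requested title")
```

With the example above, resolve("Apomys_datae", fetch_wikitext) would fetch the redirect page first and then the Luzon_Montane_Forest_Mouse article. Passing the fetch function in as a parameter also makes the redirect logic easy to try out on canned strings without hitting Wikipedia.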

There's still some polishing to do, but the Wikipedia snippets add something to the iSpecies results.