The Art of Life: Data Mining and Crowdsourcing the Identification and Description of Natural History Illustrations from the Biodiversity Heritage Library
March 22, 2012 by Chris Freeland
BHL has released a beta version of its OpenURL Resolver API for testing. A full description of the service is available at http://www.biodiversitylibrary.org/openurlhelp.aspx. Any repository containing citations to biodiversity literature can use this API to determine whether a given book, volume, article, and/or page is available online through BHL. The service supports both OpenURL 0.1 and OpenURL 1.0 query formats, and can return its response in JSON, XML, or HTML format, providing flexibility for data exchange.
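For anyone who wants to try this from a citation database, here is a minimal sketch of how a request might be assembled. It is not taken from the BHL documentation: the endpoint in `ASSUMED_RESOLVER`, the `format` parameter name, and the helper `build_openurl_query` are all assumptions made for illustration, and the citation is only sample data. The field names (`genre`, `title`, `volume`, `spage`) are common OpenURL 0.1 keys. Check the help page linked above for the actual endpoint and the parameters the service supports.

```python
from urllib.parse import urlencode

# Assumed endpoint, for illustration only; the help page has the real one.
ASSUMED_RESOLVER = "http://www.biodiversitylibrary.org/openurl"


def build_openurl_query(citation, response_format="json", base=ASSUMED_RESOLVER):
    """Build an OpenURL 0.1-style query URL from a citation dict (hypothetical helper)."""
    # Keep only the fields the citation actually has a value for.
    params = {key: value for key, value in citation.items() if value}
    # "format" is an assumed parameter name for choosing JSON, XML, or HTML.
    params["format"] = response_format
    return base + "?" + urlencode(params)


# Sample citation; the empty volume field is dropped from the query.
citation = {"genre": "book", "title": "Species plantarum", "volume": "", "spage": "1"}
print(build_openurl_query(citation, response_format="xml"))
```

The sketch only builds the URL and makes no network request, so it can be adapted to whatever HTTP client a repository already uses.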
BHL developers have incorporated the Internet Archive’s open source book viewing application into the BHL portal, providing a new interface for using BHL’s digital books.
Starting this past June, BHL worked with Qin Wei, a Ph.D. student in Library and Information Science at the University of Illinois Urbana-Champaign, to evaluate the taxonomic name finding software and algorithms used to identify scientific names throughout the BHL corpus. This work led to some interesting findings, which were reported this week via poster and oral presentation at the Biodiversity Information Standards (TDWG) 2008 conference in Fremantle, Australia.
A series of files is now available for download that will enable libraries and other data providers to identify digitized titles available within BHL.
This suite of files also includes metadata about each volume scanned, as well as information about the millions of scientific names that have been identified throughout the BHL corpus and the pages on which those names occur.
Many researchers are used to searching or browsing for materials by article. Article level access to BHL content is a goal that we’re striving for, and one that we haven’t yet reached!
BHL is a mass scanning operation. Our member libraries are moving as quickly as possible through a range of materials – books, serials, etc. – in order to scan as much as possible during our relatively brief window of funding. Our goal is to scan & cache now, then add in advanced technology solutions for secondary post-processing as they are developed.
Learn more about the BHL harvest process from Internet Archive.
The Biodiversity Heritage Library (BHL) is the world’s largest open access digital library for biodiversity literature and archives. Headquartered at the Smithsonian Libraries and Archives in Washington, D.C., BHL operates as a worldwide consortium of natural history, botanical, research, and national libraries working together to digitize the natural history literature held in their collections and make it freely available for open access as part of a global “biodiversity community.”