Export of titles & scientific names in BHL now available for download
A series of files is now available for download that will enable libraries and other data providers to identify digitized titles available within BHL.
This suite of files also includes metadata about each volume scanned, as well as information about the millions of scientific names that have been identified throughout the BHL corpus and the pages on which those names occur.
NOTE: These files represent a first cut at how we want to make data providers and libraries aware of the content within BHL. Yes, we will build services, including an OpenURL resolver, but for now our partners have asked for a low-barrier export that they can manipulate for their own specific uses. The files are automatically generated from the BHL database on a monthly basis. The datestamp on each file indicates when it was last generated.
If you are interested only in the titles we have digitized, and the items (“books” or “volumes”) for each title, you only need to download the (significantly smaller) files for the following tables:
- Download contents of Title table as a tab-delimited text file. (4MB+)
- Download contents of TitleIdentifier table as a tab-delimited text file. (400KB)
- Download contents of Item table as a tab-delimited text file. (3MB+)
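For anyone who wants to manipulate these tables programmatically, here is a minimal sketch of reading a tab-delimited export and grouping items under their titles. It assumes each file begins with a header row, and the column names used here (ItemID, TitleID, Volume) are placeholders for illustration only; check the header of the actual download before relying on them. A small in-memory sample stands in for the downloaded file.

```python
import csv
import io


def read_tab_delimited(stream):
    """Yield one dict per row, keyed by the header line of a tab-delimited file."""
    # QUOTE_NONE treats quote characters in catalogue data as ordinary text.
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row


def group_items_by_title(item_rows, key="TitleID"):
    """Collect item rows into lists keyed by the title they belong to.

    The default key "TitleID" is an assumed column name, not a documented one.
    """
    grouped = {}
    for row in item_rows:
        grouped.setdefault(row[key], []).append(row)
    return grouped


# Hypothetical sample data standing in for the Item table export.
sample = io.StringIO(
    "ItemID\tTitleID\tVolume\n"
    "1\t10\tv.1\n"
    "2\t10\tv.2\n"
    "3\t11\tv.1\n"
)

items_by_title = group_items_by_title(read_tab_delimited(sample))
for title_id, items in items_by_title.items():
    print(title_id, len(items))
```

To use it on a real download, replace the sample with an opened file, for example open(path, encoding="utf-8", newline=""), where the encoding is also an assumption to verify against the file.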
The full .zip download is not for the faint of heart! It’s a monster file because it includes the export of the 36 million occurrences of scientific names (updated 3/13/2009; previously 27 million) identified in the BHL corpus through indexing by TaxonFinder.
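Because of its size, the names export is easier to handle as a stream than to load all at once. The sketch below reads one tab-delimited member directly from a .zip archive, row by row, and tallies how often each name occurs. The member name (names.txt), the column name (NameString), and the UTF-8 encoding are all assumptions made for illustration; the real archive may differ on each point. A tiny archive built in memory stands in for the actual download.

```python
import csv
import io
import zipfile
from collections import Counter


def count_names(zip_source, member, name_column, encoding="utf-8"):
    """Tally values of one column in a tab-delimited file inside a zip archive.

    Rows are read one at a time, so the whole file is never held in memory;
    only the tally of distinct names is.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            # Wrap the binary member stream so csv can read it as text.
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            for row in reader:
                counts[row[name_column]] += 1
    return counts


# Build a hypothetical miniature archive in memory for demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "names.txt",
        "PageID\tNameString\n1\tPoa annua\n2\tPoa annua\n3\tQuercus alba\n",
    )
buffer.seek(0)

name_counts = count_names(buffer, "names.txt", "NameString")
print(name_counts.most_common(1))
```

With the real download, pass the path to the .zip file in place of the in-memory buffer and substitute the actual member and column names.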
Finally, we are considering this version a “warts and all” export. Merging the contents of multiple library catalogues and streamlining the digitization process to avoid duplication are the biggest challenges we face in building BHL, and, to be frank, our metadata is far from pristine in these early stages of our project. We are building functionality that allows librarians at BHL institutions to curate these digital books in ways that make sense to both scientists and librarians and that accommodate the variety of ways in which historic works have been catalogued over time. It’s a challenge we’ve just begun to tackle, and we look forward to any and all feedback you care to provide.