Friday, December 17, 2010

Yes, BHL has gone Global!

A summary from the 1st Global BHL Technical Meeting

by William Ulate, Global BHL Project Coordinator

The Global BHL Technical Meeting took place from September 22 to 24 in Woods Hole, Massachusetts. It was the very first time all signed and prospective BHL partners came together at such a meeting. There were representatives from all over the world, including Australia, Brazil, Egypt, Europe and the US; unfortunately, our colleagues from China were unable to make it. We had a very productive meeting, getting to know each other and presenting each other’s work in order to describe priorities and requirements for a Global BHL.

Throughout this exchange, participants achieved a high-level description of software and hardware components and were able to agree on milestones and deliverables for a global timeline, while at the same time sketching the definition of global governance and policies for collaboration in the project.

On the morning of Wednesday, September 22, after all participants had arrived and enjoyed a delicious breakfast (everyone agreed that the catering throughout the whole meeting was outstanding), Cathy Norton, MBL Director, gave a warm welcome on behalf of our hosts, followed by our BHL Director, Tom Garnett, and our BHL Executive Committee Chair, Graham Higley. A brief introduction from each participant then provided the perfect setting for a picturesque multimedia presentation, Taking Measure of the Biodiversity Heritage Library: 2003-2010, by Martin Kalfatovic, our BHL Deputy Director, and Chris Freeland, Global BHL Technical Director, talking about the BHL-US role in the Global BHL. The first section of the meeting was rounded off by Phil Cryer and Anthony Goddard, who presented the lessons they learned while setting up clustered and distributed storage with commodity hardware and open source software to mirror BHL information.

After a comfort break, each regional node was given the opportunity to share with the rest of the group the details of its specific project: why and how it connects to the whole BHL and to other projects, the work already done, the digitized content available or planned (including dates of major milestones and deliverables), the resources available, its funding status, and its regional requirements, among other things.

The first partner to present was BHL Europe. Henning Scholz, Project Coordinator for BHL-E, gave an overview of their principles, objectives and partners, their work plan with dates for deliverables, and how BHL Europe can integrate into different networks like this Global BHL initiative. Then Melita Birthälmer, also from the Museum für Naturkunde in Berlin, presented the project’s activities related to content management, starting with the available and planned numbers of volumes from different providers and the quality of that content, and explaining in greater detail the Global Reference Index to Biodiversity (GRIB), a bibliographic database with content management and deduplication functionalities being developed in collaboration with the EDIT project. Although the GRIB is still in a prototype phase, it has been suggested as an option for a worldwide bibliographic database for a Global Biodiversity Heritage Library. Finally, Adrian Smales, from the Natural History Museum, talked about the technical implementation, covering topics such as the different metadata views and formats used by BHL-E content providers, their current infrastructure status, some considerations regarding an Open Archival Information System (OAIS) and a Preservation Archive System (PAS), the GRIB, and a general work plan for the technical implementation deliverables.

The next partner to present was Australia. Elycia Wallis from Museum Victoria showed the comprehensive work that the Atlas of Living Australia (ALA) has been carrying out and the context in which the BHL-Australia (BHL-Au) project is being developed as one of its Rich Data Stores component projects (see presentation here). She then presented their work, which started in mid-2010 with the BHL-Au and BHL kickoff meetings at Museum Victoria in Melbourne and ALA HQ in Canberra. Next she explained their achievements in setting up the infrastructure, including development and testing environments, assessing scanning workflows and adapting them to Australian conditions, and developing a new user interface for BHL that should be ready by the end of 2010 with the existing functionality (a test site is available). Following the topics proposed, Ely talked about the human and other resources they have available and then explained the plan and timing ahead. The mirroring, ingestion and uploading processes should be ready by March 2011. She also mentioned that Australian copyright law allows scanning documents up to 1955, but they might concentrate particularly on scanning rare books by mid-2011, and they feel confident they will be able to apply for further funds to perform maintenance after that. Additionally, Ely mentioned other very interesting projects going on at BHL-Au, such as supporting annotations, scanning field notes and correcting OCR through volunteer work (crowdsourcing).

Abel Packer presented next on the BHL Brazil / BHL SciELO Network of national and thematic collections of quality journals, funded by the Federal Government and the government of the state of Sao Paulo, the research community and the libraries (see his presentation here). The formalization of the network's governance should be in place by the end of 2010 or the start of 2011. Their slow but fully sustainable technical work has focused on the procedures and criteria for content to be digitized, the open technology used in the portal development, the implementation of OAI metadata exchange services, and VHL-provided search engine functions. The SciELO Network held a Kick-off Workshop on Essential Rare Works Collection in Biodiversity in February 2010; in close collaboration with the BHL Advisory Committee, it plans to validate the selection criteria and choose the first 200 journal/bulletin titles and books to scan. It plans to be operational and launched with 100 initial books by December 2010, to expand to Latin American and Caribbean countries starting in 2011, and to have more than 2000 books scanned, digitized and exposed through BHL by 2013.

Finally, our colleague from Egypt, Dr. Noha Adly, presented the progress at Bibliotheca Alexandrina, a "center of excellence in the production and dissemination of knowledge" (see her presentation here) whose objectives fit perfectly with BHL’s. It has been involved in digital libraries and projects for quite some time now, developing technical infrastructure, running long-established mass digitization and OCR workflows, mirroring the Internet Archive and massive data sets, training specialists for its workflow, and more recently working on the Arabic version of the Encyclopedia of Life. Bibliotheca Alexandrina has come a long way since it started with 1 scanner in 2003; it now has 120 trained specialists using its 10 scanners, 7 days a week in two shifts, digitizing and performing OCR on 167,000 Arabic books, photos, negatives, slides and maps for inclusion in joint projects like Description de L’Egypte and the World Digital Library. It has also developed its own projects, like the Digital Assets Repository (composed of Digital Assets Factory, Digital Assets Metadata using Fedora to manage only metadata, Digital Assets Keeper and Digital Assets Publishers) and the Science Supercourse, a PowerPoint repository for health, agriculture, environment and computer engineering. Bibliotheca Alexandrina is interested in becoming a BHL partner, holding a mirror site and working on infrastructure. It has also offered to organize our next Global BHL Technical Meeting (which everyone happily took note of).

In the afternoon, after a comfort break, Bianca Crowley, BHL Collections Manager, presented the analysis of the BHL User Survey 2010. A total of 16 reusable questions were developed and, for this first round, an average of 1020 successful responses per question were analysed to understand how current user groups are using BHL services and what new developments those groups expect in the future.

Chris Freeland, BHL Technical Director, led us in two interesting discussions about the names-finding process in BHL and what can be done to improve it, given an existing 35% error rate on the species names when the OCR is performed. Here, it was noted that two subprocesses are being carried out: string finding and name reconciliation. While some of the existing services take on both processes (UBIO, for example), it was concluded that we have no mechanism in place to validate the 5.1 million names, so we should concentrate on OCR correction and let the specialists handle the name reconciliation. We will make random sample data available to potential partners so that nomenclators, for example, can provide feedback on the ratio of good names.
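
The sampling idea can be illustrated with a minimal Python sketch. It is not the process BHL uses: the function names, the example name strings and the reviewer rule are all hypothetical, chosen only to show how a reproducible random sample could be drawn and how a ratio of good names could be computed from the reviewers' verdicts.

```python
import random


def draw_sample(names, k, seed=2010):
    """Return a reproducible random sample of k names (all of them if fewer exist)."""
    rng = random.Random(seed)  # fixed seed so every partner can redraw the same sample
    return rng.sample(names, min(k, len(names)))


def good_name_ratio(verdicts):
    """Fraction of reviewed names judged good; verdicts maps name -> bool."""
    if not verdicts:
        raise ValueError("no reviewed names")
    return sum(verdicts.values()) / len(verdicts)


# Hypothetical strings, as they might come out of the string-finding step
found_names = [
    "Medicago orbicularis",
    "Medicago 0rbicularis",  # simulated OCR error: zero instead of the letter o
    "Quercus alba",
    "Qucrcus alba",          # simulated OCR error: c instead of e
    "Homo sapiens",
]

sample = draw_sample(found_names, 4)

# Stand-in for the specialists' review (an assumption for this demo only):
# a name counts as good when it contains nothing but letters and spaces.
verdicts = {name: name.replace(" ", "").isalpha() for name in sample}
ratio = good_name_ratio(verdicts)
print(f"{ratio:.0%} of the sampled names were judged good")
```

Note that the stand-in rule accepts "Qucrcus alba", which is exactly why the judgment on each sampled name is better left to the specialists.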

Finally, to round off the first day, the group reviewed the implications of global open access and BHL’s standpoint on it. It’s no secret that the global world of copyright is very complex. The group commented on the several copyright issues and distribution limitations we will encounter in sharing materials globally. BHL is not assuming any copyright responsibility on its own; moreover, BHL doesn't own any copyright. A small group of colleagues was designated to take all input on this topic and develop a suitable statement, taking into account that users might get confused and frustrated if we end up with different categories of access, and that the system would have to be rebuilt to support them.

On the second day, the group was divided into an Administration subgroup, in charge of the policies and procedures needed for a global collaboration, and a Technology subgroup, to work on the components needed for a Global BHL. The Administration subgroup was to deal with topics such as the organization of each BHL node, Global BHL collaboration and governance, and communication models for the project leaders of each BHL node. The Technology subgroup, on the other hand, was set to discuss the content ingest process in the existing BHL; content replication, with particular reference to preservation (LOCKSS) and mirror sites; localization, taking into consideration whether we would have to handle and share materials that couldn’t be openly distributed; and finally, global identifiers within the whole project.

Other more technical topics were covered during the rest of the meeting, from "Branding & Identity of the project itself" to funding opportunities, to data mining, OCR & Text correction experiences, and improvement of existing and new services required for integration at the APIs & User Interfaces levels. Even some birds-of-a-feather sessions on Content and Data Synch were included. Finally, a set of Action Items was sketched to follow up (see it here).

Wednesday, December 15, 2010

Book of the Week: Salad and...Snails?

Are those snails in your salad?

Apparently this is the question that our book of the week, The Field and Garden Vegetables of America, suggests dinner guests will ask their host or hostess should Medicago orbicularis be playfully added to the plate. However, rest assured, Medicago orbicularis, more commonly known as Button Clover or Button Medick, will not threaten to ooze snail slime all over the salad greens. It is a harmless plant species of the genus Medicago found throughout the Mediterranean basin, though it can also now be found in southern parts of the United States. This "hardy, annual plant has reclining stems, compound or winged leaves, and yellow flowers." However, the "pods, or seed vessels," according to our book of the week, "are smooth, and coiled in a singular and remarkably regular manner," suggesting in their appearance the shell of a snail.

Interestingly, this species forms a symbiotic relationship with the bacterium Sinorhizobium medicae, which is capable of nitrogen fixation. Nitrogen fixation is the process by which "nitrogen in the atmosphere is converted to ammonia," a process crucial for life because "fixed nitrogen is required to biosynthesize the basic building blocks of life." According to a study by the U.S. Department of Energy, "the amount of nitrogen fixed annually by the Sinorhizobium-Medicago symbioses is estimated to be worth $250 million."

So, as the holiday season approaches and you begin planning your extravagant dinner parties, consider throwing a little fun into the mix with Medicago orbicularis. As our book tells us, the pods can be "placed on dishes of salad for the purpose of exciting curiosity, or for pleasantly surprising the guests at table."

Book of the Week: The field and garden vegetables of America: containing full descriptions of nearly eleven hundred species and varieties; with directions for propagation, culture, and use (1863), by Fearing Burr.

Wednesday, December 8, 2010

BHL in London!

It is no secret that the Biodiversity Heritage Library project has grown on a global scale, with BHL projects springing up in Europe, China, Australia, Brazil, and Egypt. Many of our new partners rely on the experience of BHL-US, as the original BHL project has come to be known, for insight and suggestions. One such partner is BHL-Europe, and a recent BHL-EU meeting in London proved to be a valuable opportunity not only for our European partners to gather and discuss various technical and workflow issues, but also for representatives from BHL-US to provide input based on our experience. With this intent, Bianca Crowley, Grace Costantino, and Chris Freeland, BHL-US staff, traveled to London for the latest BHL-Europe meeting on December 1-3, 2010.

The main subjects comprising this recent meeting were workflow, technical issues and portal development, and the Global References Index to Biodiversity, also known as the GRIB. The intent of the GRIB is to serve as a single point of access to all biodiversity bibliographic records held within the library catalogues of the BHL-EU partners, which will in turn link to digitized versions of the content. The GRIB will also function as a selection and de-duplication tool for all BHL partners, allowing institutions to indicate which items they wish to scan while also providing digitization status information.
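
To make the de-duplication idea concrete, here is a minimal Python sketch. It does not describe how the GRIB actually works: the record fields (`id`, `title`, `year`, `status`), the sample records and the matching heuristic (normalized title plus year) are all assumptions made for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def match_key(record):
    """Build a crude matching key from title and year (an assumed heuristic)."""
    title = unicodedata.normalize("NFKD", record["title"])
    # Drop accents so that accented and unaccented spellings compare equal
    title = "".join(ch for ch in title if not unicodedata.combining(ch))
    # Lowercase and collapse punctuation and whitespace into single spaces
    title = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    return (title, record["year"])


def group_duplicates(records):
    """Group records sharing a key; only groups with two or more records are returned."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical catalogue records from two partner libraries
records = [
    {"id": "A-1", "title": "The Field and Garden Vegetables of America",
     "year": 1863, "status": "scanned"},
    {"id": "B-7", "title": "The field and garden vegetables of America.",
     "year": 1863, "status": "not scanned"},
    {"id": "B-8", "title": "Description de l'Égypte",
     "year": 1809, "status": "not scanned"},
]

duplicate_groups = group_duplicates(records)
for group in duplicate_groups:
    print([(record["id"], record["status"]) for record in group])
```

With records grouped this way, a partner could see that a title it holds has already been scanned elsewhere before committing it to its own scanning list. Real bibliographic matching is considerably harder than this (editions, volumes, transliteration), so the key above is only a starting point.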

A main purpose of BHL-US staff’s involvement in the meetings was to contribute to the development of the GRIB by providing insight into the workflow management tools currently in use on the project. As BHL-US staff explained how they currently use the many tools necessary for scanning workflow in the US, and what specific tasks they need a master workflow management tool to address, BHL-EU was able to understand the requirements of their partners across the Atlantic, and development of the GRIB was positively impacted by the conversations.

The meetings were a productive exercise in communication and collaboration, and served as an excellent opportunity to get to know colleagues who, until the meeting, were known only through emails and occasional Skype calls. As BHL continues to expand around the world, it is hoped that such cooperation will continue, and that we will all work together to share our experience, learn from our collective mistakes, and in general provide a better digital library to the users who depend so heavily on our resources for access to the biodiversity literature of the world.

Thursday, December 2, 2010

Introducing BHL SciELO!

BHL "Classic," as we often refer to ourselves, is pleased to introduce our latest global partner, BHL SciELO, from Brazil. Launched on December 1st, 2010 in a public event at the Museum of Zoology of the University of Sao Paulo, BHL Brazil is implemented by the Scientific Electronic Library (SciELO) Program. Sharing BHL's open access practices, they maintain an online multidisciplinary collection of quality journals which will total more than 700 titles by the end of this year.

The BHL-SciELO launch event included scientific programming that shared information and experiences regarding challenges and next steps for the BHL-SciELO Network Project. Abel L. Packer, Biodiversity Research Program Coordinator, spoke in depth about the advancements already achieved and laid the foundations for the next 3 years of work. Prominent Brazilian biodiversity-related libraries will digitize their collections in the coming years according to the principles, methodologies and procedures of the BHL and SciELO network.
Please join staff at BHL as we echo BHL Director Tom Garnett's sentiments in the video posted above in warm welcome to our newest partners! :)