Crowdsourcing and BHL: Current Projects that Allow Users to Help Us Improve Our Library!

Recent crowdsourcing initiatives are revolutionizing scientific research, allowing the public to help scientists and researchers document, identify, and better understand biodiversity.

For example, the Atlas of Living Australia’s FieldData program allows anyone to contribute sightings, photos and observational data to help researchers and natural resource management groups collect and manage biodiversity data. Birds Australia is using this data to help record sightings of Carnaby’s Black-Cockatoo to inform conservation initiatives for this endangered species.

As another example, in 2013 a new mammal species, the olinguito (Bassaricyon neblina), was discovered in South America, the first carnivore species to be discovered in the Americas in 35 years. Scientists at the Smithsonian’s National Museum of Natural History are using observational data and photos contributed by citizen scientists to learn more about the new species.

BHL has taken advantage of crowdsourcing’s potential, implementing several initiatives to improve access to BHL images, support OCR correction and transcription, and generate semantic metadata for the BHL portal.

Art of Life: Improving Access to Images

The Art of Life project, funded by NEH and based at the Missouri Botanical Garden, has been making steady progress toward its objective of improving access to the natural history illustrations within BHL. The image-finding algorithms developed by the Indianapolis Museum of Art Lab have been run across 18 million BHL pages, and you’ll now notice a significant increase in the number of pages tagged as having illustrations within the BHL portal. Volunteers are currently classifying pages with illustrations manually as belonging to one or more image types: drawing, table, photograph, map, and/or bookplate.

The next step for the project is to crowdsource descriptions of the images’ content (e.g., subjects, dates, illustrator) through platforms such as Flickr and Wikimedia Commons. Learn more about the Art of Life project, which runs through April of 2015.

Learn how you can help tag BHL’s illustrations in Flickr.

On a related note, we will also be crowdsourcing BHL image descriptions through another platform, Zooniverse, the premier host for citizen science projects. This opportunity came about through a partnership with Constructing Scientific Communities (aka ConSciCom). More details will be forthcoming in a future blog post, but expect to see BHL content available in Zooniverse in late spring or early summer of 2015.

Purposeful Gaming and BHL: Playing at OCR Correction and Transcriptions

Another BHL crowdsourcing project, Purposeful Gaming and BHL, funded by IMLS and based at the Missouri Botanical Garden, has been making significant strides toward its objective of improving access to BHL texts by gamifying the text correction process. Digital outputs of BHL text are created through both automated means (OCR of published text) and manual means (transcription of handwritten text from ornithologist William Brewster). Multiple outputs of the same page are then compared, and the differences are incorporated into an online game in which the public will help verify the accuracy of individual words. Those corrections will then be incorporated back into the BHL portal for viewing by users and to enable full-text searching. The project’s game designer, Tiltfactor, has recently completed and beta-tested two initial prototypes for both gaming and non-gaming audiences. The final games are expected to go live in May 2015. Learn more about the Purposeful Gaming project, which runs through November of 2015.
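For readers curious about what comparing multiple outputs of the same page might look like in practice, here is a minimal sketch in Python. It is not the project's actual pipeline. The function name find_disagreements and the sample sentences are made up for illustration, and the comparison uses the standard library's difflib module as an assumed stand-in for whatever alignment method the project uses. The idea is simply to line up two versions of a page word by word and collect the places where they disagree, since those are the words a player would be asked to verify.

```python
from difflib import SequenceMatcher


def find_disagreements(output_a: str, output_b: str) -> list[dict]:
    """Return the word-level spans where two text outputs of one page differ.

    Hypothetical helper for illustration only; not part of any BHL tool.
    """
    words_a = output_a.split()
    words_b = output_b.split()

    # Align the two word sequences; autojunk is disabled so that common
    # words are never discarded on long pages.
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)

    disagreements = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag == "equal":
            continue  # both outputs agree here, nothing to verify
        disagreements.append(
            {
                # word index in the first output where the difference begins
                "position": a_start,
                # the competing readings a player could choose between
                "candidates": [
                    " ".join(words_a[a_start:a_end]),
                    " ".join(words_b[b_start:b_end]),
                ],
            }
        )
    return disagreements


if __name__ == "__main__":
    # Invented sample text: the second version contains two typical OCR errors.
    first = "The olive-sided flycatcher was seen near the pond"
    second = "The olive-sided flycatcber was seen near tbe pond"
    for item in find_disagreements(first, second):
        print(item["position"], item["candidates"])
```

In this sketch, words on which both outputs agree are treated as correct and skipped, so volunteers' attention goes only to the words in dispute.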

Help us transcribe William Brewster’s journals and diaries in the ALA/Australian Museum Biodiversity Volunteer Portal (click on any of the William Brewster projects listed) and FromThePage! Find guidelines for transcribing the documents here.

Mining Biodiversity: Semantics and the Crowd

In the near future, the Mining Biodiversity Project, whose USA partners’ participation is also funded by IMLS, will be crowdsourcing the creation of a gold standard annotated set of pages to train the mining algorithms that will search for named entities (i.e., concepts like taxa, places, people, habitats, and traits). After that, a larger group of volunteers will help validate the pre-annotated relations through time (events) that are automatically discovered from our BHL corpus. Learn more about the Mining Biodiversity Project, which runs through December 2015.

The Field Book Project: Improving Access to Researchers’ Field Notes

The Smithsonian Field Book Project has been hard at work discovering field book materials and making them accessible through cataloging, digitization, and online publication. Thanks to dedicated staff and volunteers, the Project has made huge strides in that direction. To date, 90 field books digitized by The Field Book Project have been ingested into BHL.

However, no celebration of success would be complete without a mention of the passionate “volunpeers” of the Smithsonian Transcription Center. Field books are often difficult to search and read due to their age and generally handwritten entries: pages may be faded, smudged, or stained by exposure to the elements, handwriting may be cramped or scribbled, and the author may have used symbols and index marks that are foreign to modern readers. Unless researchers know exactly what they are looking for, they may be discouraged by the time and effort it takes to parse the archival text. Thankfully, the Smithsonian Transcription Center, which opened to the public on August 12th, has included the Field Book Project as one of its partners since the very beginning, allowing us to ask the crowd for assistance in reading and transcribing field book content in detail.

Each item that goes into the Transcription Center must first be transcribed and then reviewed by a volunpeer. Volunpeers must create an account, after which they can access the training documents on the site and turn to the rich community on the Center and on related social media platforms for questions and answers. Several of the volunpeers have become “super users,” transcribing and reviewing a large volume of material and serving as rich information sources for new transcribers on how to document tricky situations such as foreign characters, symbols, marginalia, and, in field books in particular, the scientific names for observed flora and fauna.

At last count there were 84 items in the Transcription Center, 69 of which had been fully transcribed, an incredible resource for researchers and reference archivists alike! More field books are being added continuously, and the eventual goal is to place the completed transcriptions into BHL alongside the original field books. The crowd of volunpeers has enabled, and continues to enable, the Field Book Project to offer more and better access to everyone from professional researchers to curious onlookers. Their work sometimes even leads researchers to information that might never have been discovered without the dedicated assistance of the volunpeers on the Smithsonian Transcription Center.

Become a volunpeer today and help us transcribe field notes to improve access to these valuable primary-source documents!

We Love Our Users!

BHL has over 44 million pages of biodiversity literature, several million images, and a desire to continuously improve access to and discovery of these materials, so leveraging the power of the crowd is a match made in heaven for us. With contributions from our users, we can ensure that our wealth of biodiversity information continues to inspire discovery of the natural world. Thank you for your contributions, and if you’d like to know more about how you can contribute, send us feedback.

Trish Rose-Sandler, BHL Data Analyst, Missouri Botanical Garden
Julia Blase, Project Manager, The Field Book Project
Grace Costantino, BHL Outreach and Communication Manager

The Art of Life project is funded by the National Endowment for the Humanities (Grant number PW-51041-12).

Mining Biodiversity is funded in part by the Institute of Museum and Library Services (Grant number LG-00-14-0032-14).

Purposeful Gaming is funded by the Institute of Museum and Library Services (Grant number LG-05-13-0352-13).

Grace Costantino served as the Outreach and Communication Manager for the Biodiversity Heritage Library from 2014 to 2021. In this capacity, she developed and managed BHL's communication strategy, oversaw social media initiatives, and engaged with the public to excite audiences about the wealth of biodiversity heritage available in BHL. Prior to her role as Outreach and Communication Manager, Grace served as the Digital Collections Librarian for Smithsonian Libraries and as the Program Manager for BHL.

Trish Rose-Sandler is a Data Project Coordinator at the Center for Biodiversity Informatics (CBI) of the Missouri Botanical Garden.