DPLA Reharvest of BHL Data

On April 11, 2017, the Digital Public Library of America (DPLA) reharvested all BHL data for ingest into its portal at https://dp.la/.

While BHL has served as a content hub for DPLA since its launch in 2013, our data had not been updated in their portal since that launch, primarily due to the absence of a workflow on DPLA’s end for automatically harvesting new data. Since 2013, the number of BHL records in our own portal has increased significantly, and changes and corrections to pre-2013 records were never reflected in the DPLA portal. This new harvest not only captures new data but also ingests updates to existing records.

Before the reharvest, BHL had 123,472 items in DPLA. After the reharvest, BHL has over 187,000 items in DPLA. This represents a roughly 52% increase in BHL records in DPLA. More importantly, the quality of those records has improved, and they are now in sync with BHL.

From the perspective of DPLA visitors, the most noticeable change is the addition of thumbnail images, which were lacking in DPLA prior to the reharvest. Going forward, DPLA will automatically reharvest BHL data on a bi-monthly schedule.

Why is it important for our data to be in DPLA? BHL wants its data represented in DPLA because it supports our mission to make biodiversity literature as openly available and accessible as possible. DPLA exposes BHL content to new audiences who otherwise may not be aware of our existence and emphasizes the richness of U.S. national collections, which helps underscore the value of libraries for both American and global citizens.

You can explore BHL’s collection, along with many others, in the DPLA portal at https://dp.la/.

Trish Rose-Sandler is a Data Project Coordinator at the Center for Biodiversity Informatics (CBI) of the Missouri Botanical Garden.

Bianca Crowley is the BHL Digital Collections Manager. In this capacity she leads digital collection management activities and coordinates communications across the project's international library consortium.