Thursday, May 18, 2017

DPLA Reharvest of BHL Data

On April 11, 2017, the Digital Public Library of America (DPLA) reharvested all BHL data for ingest into its portal at

Although BHL has served as a content hub for DPLA since DPLA’s launch in 2013, our data had not been updated in the DPLA portal since then, primarily because DPLA lacked a workflow for automatically harvesting new data. In the meantime, the number of records in BHL’s own portal increased significantly, and changes and corrections to pre-2013 records were not reflected in DPLA. The new harvest both captures the new records and ingests updates to existing ones.

View of BHL records from DPLA's first harvest, which lack thumbnails

Before the reharvest, BHL had 123,472 items in DPLA; it now has over 187,000. This represents a 52% increase in BHL records in DPLA. More importantly, the records themselves have improved in quality and are now in sync with BHL.
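For readers who want to check the growth figure, here is a minimal sketch of the arithmetic. The post gives only "over 187,000" for the new total, so the value of 187,000 below is an assumed lower bound, and `percent_increase` is an illustrative helper name rather than anything from BHL or DPLA tooling.

```python
def percent_increase(before: int, after: int) -> float:
    """Return the percentage growth from `before` to `after`."""
    return (after - before) / before * 100


items_before = 123_472  # BHL items in DPLA before the reharvest (from the post)
items_after = 187_000   # assumed lower bound; the post only says "over 187,000"

# Because items_after is a lower bound, the printed value is a minimum.
print(f"At least a {percent_increase(items_before, items_after):.1f}% increase")
```

With the lower bound this comes to roughly 51.5%, which is consistent with the 52% quoted above if the actual total is slightly higher than 187,000.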

From the perspective of DPLA visitors, the most noticeable change is the addition of thumbnail images, which were lacking in DPLA prior to the reharvest. Going forward, DPLA will automatically reharvest BHL data on a bi-monthly schedule.

View of BHL records from DPLA's recent harvest, which includes thumbnails

Why is it important for our data to be in DPLA?

BHL wants its data represented in DPLA because it supports our mission to make biodiversity literature as openly available and accessible as possible. DPLA exposes BHL content to new audiences who otherwise may not be aware of our existence and emphasizes the richness of U.S. national collections, which helps underscore the value of libraries for both American and global citizens.

You can explore BHL’s collection in DPLA and many others here.

Trish Rose-Sandler & Bianca Crowley
