In the face of climate change and environmental challenges, understanding and documenting Earth’s biodiversity is essential. The Global Biodiversity Information Facility (GBIF) serves as a global repository for biodiversity data and plays a pivotal role in the mission of safeguarding that biodiversity. Species occurrence data sourced from the Biodiversity Heritage Library (BHL) offers insights into species distributions, behaviors, and interactions that reach much further back in time, supplying the baseline data needed to address the climate crisis effectively. Without accurate and comprehensive data in GBIF, our collective ability to track environmental changes and make informed decisions is severely hampered.
As a GBIF participant node, BHL is committed to sharing biodiversity data openly, adhering to the FAIR (Findable, Accessible, Interoperable, Reusable) and CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) data principles, and collaborating with a global network of biodiversity organizations to build capacity and strengthen the biodiversity information infrastructure. To honor these commitments, BHL technical staff are working to establish a scalable pipeline for occurrence data currently trapped in archival field notes, journals, letters and other correspondence, and other primary source materials. The journey has been arduous because of the poor quality of OCR (optical character recognition) output for BHL’s sub-corpus of handwritten materials.