New Collection Development Policy
A Report on GBIF’s 29th Governing Board Meeting
and Associated BHL Meetings in Brussels, October 2022
The Journal of Research on the Lepidoptera
A Story of Pirate Publishers, ISSN Hijacking and Fraudulent DOI Assignment
The BHL Collections Committee has completed a comprehensive review and revision of BHL’s collection development policy. The new policy is written to provide recommendations and serve the collection management needs of BHL Partners. As a key objective of BHL’s Strategic Plan 2020-2025, the new policy is a significant revision to the original policy written in 2010. In learning together as a collaborative community of practice since then, our understanding of BHL collection management has been expanded and refined to:
Going forward, the collection development policy will be reviewed on a yearly basis for minor updates.
The Global Biodiversity Information Facility (GBIF) convened for the 29th Governing Board meeting (GB29) in a hybrid format in early October 2022. This was the first meeting with in-person attendance since GB26, held in conjunction with the Biodiversity_Next meeting in Leiden (20-25 October 2019).
BHL Program Director Martin R. Kalfatovic attended the meeting in person in his role as BHL Node Manager as well as Alternate Representative for the United States (and Second Vice Chair of the GBIF Budget Committee). BHL Immediate Past Chair, Connie Rinaldo, attended virtually as Acting Head of Delegation, standing in for BHL Chair David Iggulden. The time preceding the first day of official meetings was set aside for GBIF committees to meet. During this time, I attended the GBIF Budget Committee meeting. The close of the meeting saw a toast to the outgoing chair, Peter Schalk (Netherlands).
The main meetings were held in the historic Art Nouveau building, the former Waucquez Warehouse, designed by Victor Horta, and now housing the Comics Art Museum. GBIF delegates were joined by large models of Tintin, Smurfs, and others.
It is with unbelievable sadness that we pass along the news that Constance Rinaldo, in so many ways the heart of the Biodiversity Heritage Library, has died after a rapid and sudden illness. After some glimmers of hope that she might recover, she passed away on Thursday, October 27th. Connie’s family expects celebrations of her life in Boston, New Hampshire, and Maine.
BHL Executive Committee Chair, David Iggulden (Royal Botanic Gardens, Kew), noted:
“Connie was a wonderful colleague and friend and her passion for all things BHL was highly infectious. I know that many of us will have very fond memories of attending various events with Connie or jointly presenting with her on BHL activities and developments. I personally have learnt so much from her during my time on the Executive and was always inspired by her drive to track down new opportunities for collaboration, development, or promotion of BHL.”
After her family and friends, BHL was Connie’s greatest passion and we are all better for that passion. As BHL Program Director since 2012, I sometimes find it difficult to look beyond daily administrative challenges. Connie was invaluable in reminding us all of the higher goals, the higher purpose, that BHL was committed to: BHL’s service to our library partners, to our global audience of researchers, and to making progress, however small, on the great challenges facing our planet and the organisms we share it with.
In 2017, The Journal of Research on the Lepidoptera published its final issue. The journal’s website was turned off and, to ensure ongoing access to the biodiversity knowledge contained within its articles, all volumes (1-49) were made freely available on BHL. The final editorial, entitled “JRL R.I.P.”, was written by Rudolf H. T. Mattoni, then President of the Lepidoptera Research Foundation. You can find it on BHL here.
Jump forward five years. On 4 January 2022, Scott Miller (@PNGmoths) tweeted about the sad passing of his friend Rudolf H. T. Mattoni (1927-2022). Miller’s tweet prompted the realisation (by Roderic Page) that a “bad actor” had resurrected the Journal of Research on the Lepidoptera website and had been using the journal’s title and ISSN to publish new articles. From 2018 onward, the fraudulent party published 262 articles in six issues across two volumes. These articles were not about Lepidoptera (butterflies and moths); they covered a seemingly random array of topics, including economics, health, and business management.
Optical character recognition (OCR) plays a critical part in BHL’s contributions to the scientific community. OCR in and of itself is a remarkable achievement, converting images of typewritten text to computer-readable text with “pretty good” accuracy. OCR on handwritten text is an even greater challenge and is beyond the scope of the improvements discussed here. The scientific work that BHL supports demands the best accuracy that we can provide using available tools and, let’s be honest, available budgets.
Recently, our colleagues at the Internet Archive made the transition away from the ABBYY FineReader OCR software to the Tesseract Open Source OCR engine. Over the past year or more, the OCR team at the Internet Archive has adapted and fine-tuned Tesseract to their workflows. Our first impression is that Tesseract OCR is more than “pretty good” in its ability to identify text from the page images provided to it.
The downside is that the Internet Archive has rightfully chosen not to reprocess all existing text content through the Tesseract OCR engine. Doing so would be prohibitively expensive and time-consuming: they have 35 million text-based items, and reprocessing them would take several years and use up resources that could otherwise be used for gathering new content.
However, in the interests of supporting the efforts of the BHL community, the BHL Tech Team is working with our Internet Archive partner to reprocess some of BHL’s oldest content with the newest available version of Tesseract OCR. We are currently in a testing phase, and this blog post details some of our early results.
For the first time, the Biodiversity Heritage Library held its annual meeting in conjunction with a major biodiversity/natural sciences organization. In June 2022, BHL joined with the Society for the Preservation of Natural History Collections (SPNHC) and the Natural Sciences Collections Society (NatSCA) to host our meetings in Edinburgh, Scotland. Recognizing that the global pandemic had forced so many meetings to go virtual, we were fortunate to have this 2022 meeting as a hybrid virtual/in-person event.
The Biodiversity Heritage Library (BHL) is the world’s largest open access digital library for biodiversity literature and archives. Headquartered at the Smithsonian Libraries and Archives in Washington, D.C., BHL operates as a worldwide consortium of natural history, botanical, research, and national libraries working together to digitize the natural history literature held in their collections and make it freely available for open access as part of a global “biodiversity community.”