Making the Best of Difficult Times: Accelerating the Transcription of William Brewster’s Writings During the COVID-19 Pandemic

This post is part of a series from the Ernst Mayr Library exploring the digitization and transcription of ornithologist William Brewster’s archival materials and the insights and scholarship made possible thanks to this work.

Many of us have searched for silver linings during the COVID-19 pandemic of 2020-2021. For many in the library and museum profession, one positive outcome of the mandatory transition to remote work has been the resurrection of some long-postponed projects. These are activities put on hold during normal times in deference to ever-proliferating, higher-priority onsite tasks. One such project in the Ernst Mayr Library and Archives (EMLA) of the Museum of Comparative Zoology (MCZ) at Harvard University is transcription of the digitized journals and diaries of William Brewster.

As discussed in Part One of this series, EMLA began digitization of its Brewster collection in 2012. In 2014, Library staff began transcription of Brewster’s digitized journals and diaries to produce a corpus of text files for the Purposeful Gaming and BHL project. As part of this project, EMLA was charged with leading an effort to select an appropriate open source transcription tool. At this time (early 2014), manuscript materials such as field notes and correspondence were beginning to appear in the Biodiversity Heritage Library (BHL) thanks to other grant-funded efforts among BHL members. Tools under consideration for Purposeful Gaming were thus also viewed as potential long-term transcription solutions for libraries contributing such materials to BHL.

After four months of research and testing, the grant partners selected DigiVol, a crowdsourcing transcription platform developed by the Australian Museum in collaboration with the Atlas of Living Australia, and EMLA staff set up a project space for the Library. One significant advantage of DigiVol is that it was developed specifically for the transcription of natural history materials such as specimen labels and field notes. It therefore came with a pre-existing community of volunteers both interested in and adept at transcribing such items.

By the end of the Purposeful Gaming project (November 2015), volunteers and library staff had transcribed 12 volumes of Brewster’s journals and diaries, totaling 3,470 pages. That left 37,000 pages of Brewster’s writings in our collection awaiting transcription. Over the next four years, transcription at EMLA progressed relatively slowly, averaging 1,262 pages per year. During this time, EMLA had only a single, albeit enthusiastic, part-time library assistant, who was able to dedicate one-third of her time or less to the project because of other higher-priority tasks.

On March 16, 2020, Harvard University staff began working remotely. Expecting a large influx of volunteers due to the quarantine in Australia, and anticipating an increase in staff availability at participating institutions, DigiVol administrators sent out a call for more materials to be uploaded for transcription. Working remotely, the author had the time available to regularly upload new volumes of Brewster’s diaries for transcription, respond to questions posted by volunteer transcribers, and validate completed transcriptions. Another EMLA staff member was able to offer several hours per week of her time as well. This was fortuitous, as the volunteers were indeed ramping up their transcription activity, completing each diary volume in about two weeks.

As the pandemic wore on, greater numbers of staff across the Harvard University library system needed work that could be done remotely. In response, Harvard Library established a work share program whereby libraries seeking help could offer temporary assignments or projects to staff looking for additional work. By this time (September 2020), the volunteer transcribers were making so much progress that we were falling behind in the work of validating transcriptions. EMLA thus posted its project via the work share program, and we were fortunate to obtain the assistance of two library colleagues able to devote several hours each week to the work of transcription and validation.

With a total of 4 staff members and an average of 11 DigiVol volunteers (up from a previous average of 7) working on the project, more pages of Brewster’s writings were transcribed in 2020 than in any of the previous four years, as shown in the graphic below. There was a 242% increase in the number of pages transcribed over 2019, and a 165% increase over the average yearly number of pages transcribed from 2016 through 2019.

Number of pages transcribed per year, 2016-2020.

Transcriptions in the Biodiversity Heritage Library

The fruit of our transcribers’ labors began to materialize in 2018 when BHL developers introduced transcription import functionality to BHL. When member libraries began uploading transcribed pages of field notebooks and correspondence, the potential for access and discovery became readily apparent. Full-text indexing and searching of transcribed manuscript materials were now possible. For example, in researching the previous blog post about Robert Gilbert, the author was able to search for the name Gilbert in several transcribed volumes of Brewster’s journals. The result of such a search in Brewster’s 1898 journal is shown below.

Screenshot of the BHL book viewer with full text search results for "Gilbert" displayed.

Searching for the name Gilbert in the 1898 volume of Brewster’s Journals retrieves 42 references to Robert Gilbert in this volume. The search results are shown in the right-hand panel of the BHL book viewer.

Like most field notebooks, Brewster’s diaries and journals are replete with scientific names. For these to be recognized by BHL’s name-finding utility, Global Names Architecture (GNA), they must be carefully transcribed. Transcription tutorials for the Brewster project in DigiVol ask that scientific names be given extra care, expanding abbreviations and correcting misspellings where possible. Our team of volunteers and staff has done an exceptional job of accurately transcribing scientific names, sometimes even doing research when unsure of the correct spelling of a name. Some of the transcribers so enjoy delving into Brewster’s world that they frequently research people, localities, or unfamiliar terms used in his writings as well.

Screenshot of the BHL book viewer with scientific names found on the page.

A species observation list from Brewster’s 1900 Journal with transcription of species names and observations in the text panel on the right. Note the scientific names on the lower left, extracted from the transcription text by Global Names Architecture (GNA). Prior to addition of the transcription text, no names were found by GNA.

As a result, the myriad ornithological taxa mentioned in Brewster’s writings are being indexed in BHL. A search across the full BHL repository for a specific taxon now retrieves results from field notes and correspondence as well as the published literature. The resulting species bibliography leads researchers to field observations, specimen collecting accounts, habitat and weather descriptions, and other content that may not be included in secondary sources.

Screenshot of a species bibliography in BHL for Chaetura pelagica.

A search for Chaetura pelagica (Chimney Swift) across the full BHL repository retrieves instances of this taxon in Brewster’s transcribed Journals as well as the published literature.

Because of the hard work of our volunteer transcribers, full transcriptions of thousands of pages of Brewster’s diaries, journals, and correspondence are now being added to BHL, making this content available for text analysis, data mining, digital scholarship, and the creation of new information. Connections with similar collections in other institutions will likely be discovered as more primary source material is digitized and transcribed. Brewster’s meticulous record of species occurrences, habitat, landscape ecology, and weather can help in addressing the current climate and biodiversity crises. His record of historical events and his account of the work of Robert Gilbert might be used in social justice or other humanitarian research efforts. The staff of the EMLA are deeply indebted to the team of volunteers and our two Harvard Library colleagues who have given so much of their time and energy to help unlock the riches of the Brewster collection for the benefit of all.

If you would like to help transcribe Brewster’s writings, there is still plenty of work to do. We will soon complete his diaries and journals, after which we will begin transcribing his extensive collection of correspondence. If you are interested, please visit our project page.

Written by

Joseph deVeer is the Project Manager and Museum Liaison at the Ernst Mayr Library, Museum of Comparative Zoology, Harvard University. He currently serves as the Member representative for the Ernst Mayr Library to BHL.