28,000 Pages about the Sea: Challenges and Solutions for Digitizing the Fowler Collection
Workflow Meets Deluge
Henry Weed Fowler must have loved fish.
Ichthyology dominated his entire career. He started as a museum assistant at the Academy of Natural Sciences of Philadelphia in 1903. Other experts in his field soon recognized his prolific skill. In 1918, an assistant curator at the Smithsonian, Barton A. Bean, reached out to Fowler (then still an assistant) for help identifying fishes collected by the United States Exploring Expedition. Fowler dove into the work. He delivered a lengthy 750-page manuscript in two years, helping to discover 18 new species of fish in the process.
For reference, the average field book here at the Smithsonian Institution Archives is 110 pages.
In 1925, Bean again sought out Fowler (then an associate curator) to identify fish from a 1907-1910 expedition by the steamer Albatross in the Philippines. Over the next several years, the pair published three manuscripts together (one | two | three), after which Fowler published three more of his own (one | two | three). Fowler then completed six massive manuscripts, thousands of pages of independently written material, that were never published. He became Curator of Fishes in 1940 and passed away in 1965. His unpublished manuscripts were transferred to the Smithsonian Institution Archives in 1975.
As part of the BHL Field Notes Project, Smithsonian Institution Archives is digitizing one-of-a-kind field notes and manuscripts so that researchers who could not otherwise reach the material can use it online. Logistically, each field book is assigned a unique identifier in the Archives’ Field Book Registry, which we use to track the item throughout the digitization process. On average, one unique identifier means one 110-page field book. We can normally digitize and deliver at least 30 field books per week.
Here’s where it got tricky: Fowler’s six unpublished manuscripts were grouped under a single identifier, MODSI6368. That meant one 28,000-page item, 254 times larger than normal, with an estimated 2.5 TB of images. One field book to outweigh them all.
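To put that scale in context, here is a rough back-of-the-envelope sketch in Python using only the figures quoted above. The "weeks of typical output" number is our own illustrative derivation, not a project estimate, and it assumes page count is the only thing that drives imaging time.

```python
# Figures quoted in the post.
AVERAGE_PAGES = 110             # pages in a typical field book
BOOKS_PER_WEEK = 30             # typical weekly rate (the post says "at least 30")
FOWLER_PAGES_ESTIMATE = 28_000  # estimated pages under the single identifier

# How many average field books would fit inside the Fowler item?
size_ratio = FOWLER_PAGES_ESTIMATE / AVERAGE_PAGES

# Pages handled in a typical week at the normal rate.
weekly_pages = AVERAGE_PAGES * BOOKS_PER_WEEK

# Illustrative only: weeks of typical output that the Fowler item represents.
weeks_equivalent = FOWLER_PAGES_ESTIMATE / weekly_pages

print(f"{size_ratio:.1f}x an average field book")
print(f"{weekly_pages:,} pages in a typical week")
print(f"about {weeks_equivalent:.1f} weeks of typical output")
```

Because the post gives the weekly rate as a minimum, the weeks figure is best read as an upper bound.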
The eight-Hollinger-box, 153-folder field book arrived in Staging in late February. Our team held a meeting. We needed a new approach.
Logistics of a Tidal Wave
Proceeding as normal would irreversibly skew project deliverables, and loading a single 28,000-page manuscript into the BHL portal wouldn’t be the most useful approach for researchers. After detailed discussion, we settled on a plan: break out the manuscript into sub-elements at the folder level instead of the title level, photograph at ~650 ppi so we wouldn’t have to adjust the camera during the day, and bulk scan smaller items that required a higher resolution.
This new approach offered several advantages. Adjusting the camera several times a day adds up and slows down imaging; removing the need to adjust constantly would save us time and make imaging the manuscript more efficient. Treating each folder as a separate sub-element would mean creating smaller, more manageable digital objects closer to the average 110 pages we usually work with. This would shorten descriptive metadata per image in our Collections Management System (CMS) records; otherwise we would be embedding page-long collection notes into each image. It would also allow us to take advantage of how BHL manages multi-volume titles. This in particular is a method already successfully implemented by our fellow Field Notes Project partners at the Missouri Botanical Garden on their George Engelmann material, so we knew it could be a user-friendly way to present the extensive Fowler manuscript.
We had a plan. Now we had to see if it would work.
Results after the Storm
The first Fowler image came through the camera on March 14, 2018. Our team finished photography on April 17, 2018. Delivery to the Biodiversity Heritage Library is ongoing (stay tuned!). Here are a few statistics from the Fowler photography project:
- Final page count: 28,735
- 153 folders, average 187.8 pages/folder (1.7 times larger than average field books)
- 24 days of imaging and metadata processing
- Average 1,197.29 images/day
- Highest one-day count of photography: 2,415 pages
- 345 pages/hour (5.75 pages/minute) for 7 hours by one technician
- 2.59 TB storage footprint
- Widespread sense of accomplishment
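For readers who want to check the arithmetic, the derived statistics above can be reproduced from the raw counts with a short Python sketch. The per-image storage figure at the end is our own illustrative addition: it assumes one image per page and decimal terabytes (1 TB = 1,000,000 MB), neither of which is stated in the project records.

```python
# Raw counts from the list above.
FINAL_PAGES = 28_735
FOLDERS = 153
WORKING_DAYS = 24
AVERAGE_FIELD_BOOK_PAGES = 110
PEAK_DAY_PAGES = 2_415
PEAK_DAY_HOURS = 7
STORAGE_TB = 2.59

# Derived statistics.
pages_per_folder = FINAL_PAGES / FOLDERS
folder_vs_average = pages_per_folder / AVERAGE_FIELD_BOOK_PAGES
images_per_day = FINAL_PAGES / WORKING_DAYS
peak_pages_per_hour = PEAK_DAY_PAGES / PEAK_DAY_HOURS
peak_pages_per_minute = peak_pages_per_hour / 60

# Assumption: one image per page, decimal units (1 TB = 1,000,000 MB).
mb_per_image = STORAGE_TB * 1_000_000 / FINAL_PAGES

print(f"{pages_per_folder:.1f} pages/folder ({folder_vs_average:.1f}x average)")
print(f"{images_per_day:,.2f} images/day")
print(f"{peak_pages_per_hour:.0f} pages/hour, {peak_pages_per_minute:.2f} pages/minute")
print(f"roughly {mb_per_image:.0f} MB per image")
```

The hourly rate is the highest one-day count (2,415 pages) spread over 7 hours, which is why it sits well above the 24-day average.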
Spread the Word across the Seas!
Fowler has started to make his way online. As the team moves to another collection (Edmund Heller), check out some of the other worldwide journeys by Smithsonian researchers at the Smithsonian Transcription Center and the Smithsonian Field Books Collection. Thank you to our many volunteers, both online and here in the lab, for your passion and dedication in support of this project!
The BHL Field Notes Project is funded by the Council on Library and Information Resources (CLIR).
“Henry Weed Fowler Papers,” Record Unit 007180, Collections Management System, Smithsonian Institution Archives.