Tuesday, March 31, 2015

Citizen Science Uses Art to Unlock Scientific Knowledge

Citizen Science in Science Gossip

Since the release of Science Gossip a little less than a month ago, 3,600 volunteers have enthusiastically completed 160,000 classifications of natural history illustrations from the pages of 19th century science periodicals! As a result, the periodicals Recreative Science and Midland Naturalist are now fully classified and both the Magazine of Natural History and Journal of Zoology, Botany, Mineralogy, Geology and Meteorology and the Intellectual Observer are nearly complete (approximately 80%).

Volunteers have identified illustrations from a wide variety of topics, from Barnacles transforming into Geese to Egyptian Village Life to a plant called Vegetable Sheep, all of which demonstrate the diversity of domains covered in these 19th century science periodicals.

Barnacles transforming into Geese. Magazine of Natural History. v. 5 (1832).

Some of the illustrations volunteers have discovered relate to other Zooniverse projects such as these gems:

Egyptian Village Life. The Intellectual Observer. v. 7 (1865).

Furthermore, within the first week, one of the volunteers managed to uncover the background image we use for the Science Gossip website!

Talk has been very active with questions about the best way to classify. Based on regularly recurring questions from users we have begun an FAQ, and this list will grow over time. If you have a question you think should be added to the FAQ, please post here.

Vegetable Sheep! The Intellectual Observer. v. 11 (1867).

We are in the process of uploading new content and are looking to reduce the number of blank and text-only pages that volunteers have to weed through to get to pages with illustrations. Algorithms that can help automatically identify pages with text are being tested, although they are not 100% accurate. Stay tuned for progress. We look forward to seeing what other illustrative treasures our volunteers will unearth over the next month!

Citizen Science in Flickr

We've also been encouraging our user community to help us unlock knowledge in scientific illustrations through Flickr tagging. In January, we announced that, in addition to the nearly 100,000 images in our own Flickr collection, over a million BHL images are also being uploaded to Flickr Commons via the Internet Archive Flickr stream. As part of our Art of Life project, we asked you to help us enhance these images by tagging them with species names.

Since our request in January, over 1,700 images in the BHL Flickr collection have been tagged by our user community, translating to over 18,000 total images tagged. Those images with species name machine tags are automatically ingested into the Encyclopedia of Life and associated with the corresponding species page. To date, there are over 17,000 BHL images in EOL.

Aster cordifolius, tagged by @SiobhanLeachman. Addisonia. v. 2 (1917).  

Our community has also been adding much more information than just species names. @SiobhanLeachman and @VLeachman shared a great guide from the British Library on additional machine tag formats, including artist name, dates, and VIAF information. As a result, knowledge such as the artists who created these amazing illustrations is also being captured.

Thanks to our interaction with taggers on social media, we've also discovered some really amazing things about these illustrations. For example, through transcription activities with the Smithsonian Transcription Center, @Bailiuchan discovered a great mention by Vernon Bailey of Cassin's Kingbird (Tyrannus vociferans), which she shared on Twitter. We shared an illustration of the bird from BHL, and a suggestion of the possible artist from one of our image taggers led us to discover that this image, and the other plates from this work, were prepared by the same firm that did Audubon's famous Birds of America!

Cassin's Kingbird. Plate prepared by same firm that did Birds of America, a fact we discovered thanks to our Flickr image tagging conversations on social media! Report on the United States and Mexican Boundary Survey. v. 2, pt. 2 (1859).

We are so excited about the ways that citizen science is allowing us to learn new things about our collections and capture this knowledge in ways that allow others to discover it more easily. The ultimate goal for all of these activities is to ingest the tags and descriptions provided by users on Science Gossip and Flickr into BHL to enhance our own metadata and eventually support image search within the portal itself.

Thank you for helping us learn about our collection and improve access to it! We hope you will continue to explore and describe our illustrations on Science Gossip and Flickr. What will you discover?

Trish Rose-Sandler
Data Analyst, Biodiversity Heritage Library, Missouri Botanical Garden
Grace Costantino
Outreach and Communication Manager, Biodiversity Heritage Library

Friday, March 27, 2015

What's Up with Seed Catalogs in BHL?

Cole's Garden Annual. 1892. From the BHL Seed and Nursery Catalog Collection.
We've spent a fun-filled week exploring the history, art, and science of gardening with our Garden Stories event. Seed and nursery catalogs and lists played a starring role in our campaign, allowing us to explore the world of gardening through the instruments that informed, documented, shaped, and transformed the industry.

As our journey this week has demonstrated, seed and nursery catalogs and lists allow us to trace the development of the seed industry, agriculture, and the home garden, documenting the rise, decline, and development of new plant varieties and prices; changing agricultural and printing technologies; the individuals who shaped the industry; the evolution of garden fashion and landscape design; the introduction of chemical agents for insect and weed control; early methods of cleaning, preserving, and shipping seeds; and cultural and social dynamics such as the effects of and reactions to scientific advancement, global wars, and the shifting roles of women in society and business.

Because of their cultural, historic, and scientific importance, many BHL partners are engaged in a variety of projects to digitize and improve access to the seed and nursery catalogs and lists in their collections. As we wind things down in our Garden Stories event, we invite you to explore the exciting world of seed catalogs in the Biodiversity Heritage Library consortium.

Digitizing One of the Largest Seed Catalog Collections in America

John Gardiner & Co. Seed Annual for 1890. From the NAL Seed Trade Catalog Collection.
Started in 1904 by USDA’s first economic botanist, Percy Leroy Ricker, the National Agricultural Library’s (NAL) Henry G. Gilbert Nursery and Seed Trade Catalog Collection consists of over 200,000 American and foreign catalogs. The earliest catalogs date from the late 1700s, but the collection is strongest from the 1890s to the present.

As one of NAL’s most frequently used collections with an appeal to a wide-ranging audience, the Nursery and Seed Trade Catalog Collection was a natural candidate for digitization. In 2013, NAL began digitizing the collection with Internet Archive, which operates a scanning center at NAL. As of February 2015, NAL has cataloged all of its U.S. seed catalogs through 1923 and digitized over 13,000 seed catalogs, including all of its U.S. catalogs through 1906 as well as its entire collection of catalogs from long-established firms such as Peter Henderson & Co. and woman-owned firms such as Miss Ella V. Baines. The Nursery and Seed Trade Catalog Collection will remain a focus of NAL’s digitization work with Internet Archive for the foreseeable future.

Soon after NAL’s catalogs became available in Internet Archive, BHL added them to its own Seed and Nursery Catalogs Collection. In 2014, NAL formally became a BHL affiliate and the two institutions began working on a standardized process for ingest of NAL seed catalogs (and other relevant digitized materials) into BHL.

Digitizing to Improve Access and Discoverability

In 2013, the Biodiversity Heritage Library engaged in an ambitious project to explore the applicability of purposeful gaming to tackle a significant challenge for digital libraries today: poor output from Optical Character Recognition (OCR) software. OCR allows a computer to "read" the text on a digitized page and produce a searchable text file for each page image that allows users to more easily discover content relevant to their needs.

The Institute of Museum and Library Services (IMLS)-funded project, Purposeful Gaming, is led by the Missouri Botanical Garden's Center for Biodiversity Informatics (CBI) in partnership with Harvard University, Cornell University, and the New York Botanical Garden. It will demonstrate whether digital games are a successful tool for analyzing and improving outputs from OCR and transcription activities. The premise is that, when the task is presented as a game, large numbers of users can be harnessed quickly and efficiently to review and correct particularly problematic words.

As part of Purposeful Gaming, project participants are digitizing seed and nursery catalogs and lists because these documents are great examples of materials that are notoriously difficult subjects for OCR to parse. The picturesque fonts and elaborate page layouts so endearingly characteristic of seed catalogs cause the resulting OCR output to be error prone and less than optimal. By identifying unique catalogs and lists in their collections and integrating them into the BHL Seed and Nursery Catalog Collection, transcription sites, and purposeful games, participating institutions are helping us enhance our OCR and improve access to not only these catalogs and lists but the entire BHL collection as well.

Seed catalogs and Index Semina – What’s the difference and why do we care?

Nierembergia frutescens in Seed List.
As described, Purposeful Gaming involves digitization of historic seed catalogs and seed lists, or index semina. What is the difference? Beautifully illustrated seed catalogs were issued regularly by seed companies to list their current selection available for sale. The catalogs occasionally included plants that were not only new to the garden, but also new to science. Similarly, the far less colorful seed lists were issued and exchanged by botanical gardens to facilitate the free exchange of new seed acquisitions and also included plant species new to science.

The seed lists were published and circulated in limited numbers and were often considered ephemeral, so they were not generally deposited in libraries. Today no library in the world has a complete set. BHL partners have joined forces to digitize their collections to form a virtual set that is nearly complete and is far more accessible to botanists and everyone around the world.

Nierembergia frutescens is just one example of a plant that was first named and described in a seed exchange list. This beautiful flowering herb is a member of the Solanaceae or Nightshade family. It was first named in 1866 by the French botanist Michel Charles Durieu de Maisonneuve. He described the plant in great detail while advertising the availability of seeds of this new species to his colleagues in Catalogue des graines récoltées en 1866, issued by Jardin-des-plantes de la ville de Bordeaux.

Identifying the Unique

1904 Vick's Catalog. From LH Bailey Hortorium’s Horticultural Catalog Collection at Cornell University.
With many Purposeful Gaming-affiliated institutions involved in digitizing seed and nursery catalogs, alongside the significant digitization underway at the National Agricultural Library (NAL), it can be difficult to ensure that the same catalog is not digitized by multiple libraries. Recognizing this challenge, Cornell University’s Mann Library developed a process to identify and digitize the unique seed catalogs in their collection.

The first step in this process was to collect metadata about the seed catalogs held by BHL institutions currently digitizing these works. This was done using Excel spreadsheets with matching columns. The merging of the metadata was complicated by differing cataloging processes among various institutions. For instance, NAL catalogs each seed catalog as a monograph whereas Cornell catalogs them as serials based on the firm name. To further complicate the situation, Cornell also cataloged the firms for which they have only a handful of catalogs in alphabetic ranges by firm name.

After the metadata for the seed catalogs in applicable institutions was merged into one large spreadsheet, it was necessary to try to match up the varying firm names. For example, one institution might have a firm cataloged as John W. Adams, whereas another institution may have JW Adams, Adams JW, or even John W Adams & Sons. Additionally, many firms changed names over time.

Mann decided to use Google Refine in an attempt to standardize firm names. Using Google Refine’s Cluster option, they were able to match up the various names used for the same firm. If requested, Google Refine will change the firm names in the spreadsheet to use a common firm name for each cluster. This allowed the resulting spreadsheet to be sorted by firm so that all metadata for one firm appeared together. Mann Library then reviewed the spreadsheet manually to see which firms or seed catalog publication years were held uniquely by Mann, avoiding the scanning of material already digitized by other BHL institutions.
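For readers curious about what this kind of clustering involves, here is a minimal Python sketch of the "fingerprint" key-collision idea, followed by a simple check for uniquely held catalogs. It is an illustration only, not Mann Library's workflow or Google Refine's actual code: the function names (`fingerprint`, `cluster_firm_names`, `uniquely_held`) and the sample years are hypothetical, and the real tool applies further normalization that is omitted here.

```python
import string
from collections import defaultdict


def fingerprint(name):
    """Build a comparison key: lowercase, drop punctuation, sort unique tokens."""
    cleaned = name.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster_firm_names(names):
    """Group name variants that collapse to the same fingerprint."""
    clusters = defaultdict(list)
    for name in names:
        clusters[fingerprint(name)].append(name)
    return dict(clusters)


def uniquely_held(own, others):
    """Return (firm key, year) pairs found in `own` but not in `others`.

    Both arguments are iterables of (firm name, publication year) pairs.
    """
    own_keys = {(fingerprint(firm), year) for firm, year in own}
    other_keys = {(fingerprint(firm), year) for firm, year in others}
    return own_keys - other_keys


# Firm name variants taken from the example in the text.
variants = ["John W. Adams", "JW Adams", "Adams JW", "John W Adams & Sons"]
clusters = cluster_firm_names(variants)

# Made-up holdings, purely for illustration.
our_holdings = [("JW Adams", 1890), ("JW Adams", 1891)]
their_holdings = [("Adams JW", 1890)]
only_ours = uniquely_held(our_holdings, their_holdings)
```

In this sketch, "JW Adams" and "Adams JW" fall into one cluster because word order and punctuation are ignored, but "John W. Adams" and "John W Adams & Sons" do not. Variants like those, along with firms that changed names over time, still call for fuzzier matching or the kind of manual review described above.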

Gaming to Enhance Collections

So we've covered digitizing the catalog and list collections. How then will Purposeful Gaming use video games to decipher difficult-to-read texts--such as seed and nursery catalogs--that cannot easily be read by OCR software?

Here's how it works: an original catalog is scanned, and the image uploaded to BHL. The image is then uploaded to a transcription portal, where volunteers type out the text that would be too difficult for a computer to read (thanks to all who have helped us transcribe seed catalogs this week!). Multiple transcriptions of the same text are then incorporated into the video game, which identifies discrepancies between them. The task of the player is to correctly transcribe the text in question through a creative video game interface. Eventually, games like this could help create searchable versions of seed and nursery catalogs, increasing their value to historians and horticulturalists alike.
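To give a rough sense of what "identifying discrepancies" between transcriptions could look like, here is a small Python sketch. It is not the game's actual logic, which this post does not describe. It assumes a naive word-by-word comparison by position, and the function name `find_discrepancies` and the sample catalog line are made up for illustration.

```python
from collections import Counter
from itertools import zip_longest


def find_discrepancies(transcriptions):
    """Return (word position, Counter of candidate words) wherever transcribers disagree.

    Assumption: words are compared by position after splitting on whitespace,
    so an inserted or dropped word shifts everything after it. A real system
    would likely align the texts first.
    """
    tokenized = [text.split() for text in transcriptions]
    disagreements = []
    for position, words in enumerate(zip_longest(*tokenized, fillvalue="")):
        counts = Counter(words)
        if len(counts) > 1:  # more than one distinct reading at this position
            disagreements.append((position, counts))
    return disagreements


# Three hypothetical volunteer transcriptions of the same catalog line.
samples = [
    "Early Scarlet Turnip Radish",
    "Early Scarlet Turnip Radish",
    "Early Scarlet Tumip Radish",
]
flagged = find_discrepancies(samples)
```

Here only the third word is flagged, with "Turnip" seen twice and "Tumip" once. Words like this, where volunteers disagree, are the ones a game could present to players for a decision.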

Beta versions of the games are being tested right now, and we hope to release them this summer. Stay tuned to our blog and social media for more updates. In the meantime, you can help inform the game development by transcribing seed and nursery catalogs today! Learn more.

We're so glad you joined us this week for Garden Stories. You can explore all our great posts by following #BHLinbloom on Twitter and Facebook and diving into our Garden Stories blog series. Be sure to check out the over 14,000 seed and nursery catalogs in BHL and enjoy over 2,500 seed catalog images in Flickr, with a selection also available in Pinterest. Find great online gardening resources, facts, and tips on the BHL Gardening Resources Page.

Happy gardening!

Contributions From:

Chris Cole
Manager, Business Development, National Agricultural Library
Doug Holland
Director, Missouri Botanical Garden Library
Holly Mistlebauer
Information Technology Project Manager, Cornell University's Mann Library
Kay Derr
Contracts Librarian, National Agricultural Library
Patrick Randall
Marketing Intern, Ernst Mayr Library, Museum of Comparative Zoology, Harvard University
Judith A. Warnement
Librarian of Harvard University Botany Libraries

“'Tis A Gift To Be Simple” But to Have a Splendid Garden Buy Shaker Seeds

Sabbathday Lake Shaker Community Meetinghouse (photo by Gerda Peterich for the Historic American Buildings Survey, Library of Congress Prints and Photographs Division; HABS ME,3-SAB,1-1)

The United Society of Believers in Christ’s Second Appearing, a religious sect commonly referred to as the Shakers, was founded in 18th-century England from a branch of the Quakers. Along with other newly formed devotional groups, they soon immigrated to colonial America. There they established as their economic foundation a variety of cottage industries that thrived throughout the 19th and into the early 20th centuries. Now known mostly for wonderfully simple architecture, austere but beautifully designed furniture and such functional objects as nesting oval boxes and baskets, members of the Shaker communities also once had booming garden and seed businesses. Their labor-intensive methods of working inevitably came to be overwhelmed by competition from industrial manufacturing as well as the problem of ever-shrinking membership, celibacy being one of their central tenets.  

Shaker Carrier (by Sharon Dugan; Smithsonian American Art Museum, Gift of Martha G. Ware and Steven R. Cole; no. 2011.47.11)

The first fully organized Shaker community was New Lebanon in New York; by 1789, two years after its formation, the members cultivated seeds as a cash crop. This was the second earliest “company” to sell seeds in America. David Landreth & Son of Philadelphia was the first, in 1784, and that business traded back and forth with the Shakers. Another early dealer was Grant Thorburn in Philadelphia who wrote about the difficulty of obtaining seeds of good quality.

Other Shaker settlements, or “families,” soon followed in the seed-production business, including Watervliet, New York; Hancock, Massachusetts; Enfield, Connecticut; Canterbury, New Hampshire; Alfred, Maine; and South Union, Kentucky, among several others. All were known for the excellence of their products and integrity of their business dealings. Devotion to God was expressed through their work ethic, hand labor and craftsmanship. The selling of goods was meant to raise funds only for needs the Shakers could not supply for themselves.

It is uncertain whether the Shakers were the first to put their seed products in small paper envelopes or packets, but they were the first to popularize their use. The Shaker Brothers printed millions of them on their presses, while the Sisters performed the labor of cutting, folding and pasting the papers. Although the Shakers believed in partnership in work and equality of the sexes, occupations tended to fall along traditional gender lines. The Brothers tilled the fields while the Sisters picked, sorted and packaged the seeds and herbs.

Seed Room, Shakertown, Kentucky (LC Prints & Photographs Division; HABS KY,84-SHAKT,2-33)

The Shakers were also among the first to incorporate gardening manuals in their catalogs. Charles Crossman wrote manuals in 1835 and 1836 for Mount Lebanon, Pittsfield and Watervliet, directing the wholesaler or the individual purchaser how best to plant the offered seeds. Instructions for storing and cooking bounties from the gardens were also sometimes given. Vegetables dominated the products, but herbs and flowers as well as grass also appear in the lists. Crossman later left the Shakers and established his own company in Rochester, New York, which was just beginning to become a thriving horticultural center of the country.

Wholesale price list of seeds by the Shaker Seed Company. New Lebanon, New York, 1888 (Henry G. Gilbert Nursery and Seed Trade Catalog Collection, National Agricultural Library; image from the Biodiversity Heritage Library)

As the Shaker seed business in the United States grew from 1840, printing firms outside of the communities were employed to keep up with demand. Correspondingly, advertisements and the design of the packets, at first plain with simple chunky block type, were transformed to include colored papers and borders and eventually vivid chromolithography.

Following the Civil War, increased competition resulted in the Shakers’ using the word “genuine” in their advertising, as products from the communities were considered to be of the highest quality, their best selling point. They also used various and effective marketing techniques, including seed boxes meant to be displayed on the counters of country stores. The containers had compartments for assorted Shaker vegetable and flower seed packages, with a hinged lid to be propped open to display a broadside detailing the contents within, and a chromolithographic image of a bountiful harvest, promising the purchaser a similar reward.
Seed Room with display containers, Shaker Centre Family Dwelling House, Shakertown, Mercer County, Kentucky (Library of Congress Prints and Photographs Division; HABS KY,84-SHAKT,2-34)

Other surviving ephemera record the once-thriving industry of seed, herbal, food, and medicinal products from the Shakers. Along with seed catalogs and the packets, mailing tags and envelopes, broadsides, receipts, invoices, billheads, and labels give testament to their industry. Product labels include currant, grape, wild cherry, and blackberry wines. Far from being abstinent (except in the matter of sex), the Shakers made at least fourteen varieties of wine as well as distilled spirits. The religious group in their various communities was also at the forefront of selling medicinal herbs in the United States.

Shaker Digestive Cordial (A.J. White, New York, New York; National Museum of American History, no. 246707)

The changing world and market forces caused the Shaker seed industry to gradually peter out at the beginning of the 20th century. Increasing industrialization, growth of urban centers with ease of shipping and mailing to formerly isolated villages, and overwhelming production from big companies, particularly in Philadelphia and Rochester, eliminated the communities’ wagon and sleigh seed routes for trading. With their firm spiritual beliefs and practices, the Shakers were not very competitive against aggressive business tactics.

Postcard of the Shaker Village of Sabbathday Lake, Maine (ca. 1920; Wikimedia Commons)

At their peak in the mid-19th century there were about 6,000 “Shaking Quakers,” so named because of their ecstatic behavior during worship services. Today, Sabbathday Lake Shaker Community, founded in 1783 in Maine, is the sole active survivor of the religion. A tree farm, an apple orchard and vegetable gardens are maintained, and hand-set printing is still practiced. The library there holds ephemera such as account records, recipes, catalogs and 1,380 labels printed by Shakers for their own products. They are a living testament to the Shakers’ influence on the American seed and nursery industry and their permanent place in horticultural history: the early development of the new business, innovative techniques in marketing and education, and the propagation and introduction of new varieties. The Sabbathday Shakers still sell herbs, and their catalog is online.

The Library of Sabbathday Village in New Gloucester, Maine (Wikimedia Commons, August 2010)
Wholesale price list of seeds by the Shaker Seed Company, Mount Lebanon, New York, 1888 (Henry G. Gilbert Nursery and Seed Trade Catalog Collection, National Agricultural Library)

More Garden Stories Fun

Julia Blakely
Smithsonian Libraries, Resource Description Special Collections Cataloger