
Thursday, August 21, 2014

Avibase, The World Bird Database



As part of our BHL & Our Users series, we recently interviewed Denis Lepage, Senior Scientist at the National Data Center, Bird Studies Canada and creator of Avibase, an impressive online resource on the birds of the world.  With over 12 million records, the database covers information on about 10,000 species and 22,000 subspecies of birds, including distribution, taxonomy, and synonyms in several languages.

Over the last 20 years, Lepage has devoted his energy and inspiration to building and managing this extensive resource. Denis recently contacted us to share the role BHL has played in making his work possible and we're thrilled that he's agreed to share it here with you as well.  Enjoy!

BHL & Our Users: Denis Lepage

What is your area of interest?

My interest in taxonomy for Avibase is primarily a personal endeavor.  One of my goals for Avibase is to organize all bird taxonomy, whether current or historical, so we can track how our understanding of taxonomic concepts and scientific names has evolved over time. Since it began, the database has grown to about 12 million records, and I have developed various approaches to address some of taxonomy's unique challenges. These have recently been detailed in a paper published in the open access journal ZooKeys.

I am particularly interested in how the same scientific names are constantly being used to describe concepts that actually represent different populations. Because of the rules of nomenclature, when a population is split into two or more species, for instance, the original scientific name must remain with the one represented by the oldest specimen. Because of this, a scientific name may actually mean some very different things depending on who uses it or when it was used. The name Gallinula chloropus could mean a bird found in Europe, Asia, Africa, and Oceania, but the birds found in the Americas could be alternatively called G. chloropus or G. galeata. If you naively looked at G. chloropus records on a global map, you would see a sudden and marked drop in sightings starting a few years ago in the Americas. You would probably eventually figure out that those have simply shifted to a different name, but the point is that a change in what a name means can look like a change in the birds themselves.
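To make the distinction between a name and a concept concrete, here is a minimal Python sketch of the Gallinula example. It is an illustration only and does not reflect Avibase's actual data model: the `Concept` class, the concept identifiers, and the "lumped" and "split" taxonomy labels are all hypothetical, and the regions are simply those named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A taxonomic concept: a name plus what it is used to represent."""
    concept_id: str        # hypothetical identifier, not a real Avibase ID
    name: str
    regions: frozenset     # simplified stand-in for the circumscription


OLD_WORLD = frozenset({"Europe", "Asia", "Africa", "Oceania"})
AMERICAS = frozenset({"Americas"})

# The same name maps to different concepts depending on the taxonomy followed.
# "lumped" and "split" are illustrative labels for two such taxonomies.
CONCEPTS = {
    ("lumped", "Gallinula chloropus"):
        Concept("c1", "Gallinula chloropus", OLD_WORLD | AMERICAS),
    ("split", "Gallinula chloropus"):
        Concept("c2", "Gallinula chloropus", OLD_WORLD),
    ("split", "Gallinula galeata"):
        Concept("c3", "Gallinula galeata", AMERICAS),
}


def resolve(taxonomy: str, name: str) -> Concept:
    """Look up the concept a name stands for under a given taxonomy."""
    return CONCEPTS[(taxonomy, name)]


def same_population(a: Concept, b: Concept, region: str) -> bool:
    """True if both concepts include the birds of the given region."""
    return region in a.regions and region in b.regions


if __name__ == "__main__":
    old = resolve("lumped", "Gallinula chloropus")
    new = resolve("split", "Gallinula chloropus")
    american = resolve("split", "Gallinula galeata")
    print(old == new)                                  # same name, same concept?
    print(same_population(old, american, "Americas"))  # same birds, new name?
```

Keying records on a concept rather than on the bare name is what lets older American records filed under G. chloropus be matched with newer ones filed under G. galeata.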


Screenshot of some of the reporting tools available in My Avibase


The problem is a lot more widespread than most people seem to realize, particularly as we start gathering vast amounts of data into large biodiversity inventories containing hundreds of millions of records. By my own estimate, less than half the world's species of birds have been represented by a stable and consistent name over the last hundred years or so. In some cases, such as with Puffinus lherminieri, a single scientific name has been used to describe up to 18 different concepts, which really only share being represented by a common type specimen. In the broader sense, the name Puffinus lherminieri can be used to represent a large complex of about 16 species according to today's understanding of taxonomy. Several of those are threatened species, so not being able to be precise about what a name means has real direct implications for things such as conservation. These difficulties are also compounded by several other challenges, such as name synonymy. While the rules of nomenclature were well designed to address the problems of changes in names and synonymy, they have completely ignored the issue of changes to the circumscription (what each name is used to represent).  The ZooKeys paper explains how I have addressed this problem in Avibase, and how this may be useful for other taxonomic groups.

In addition to that, Avibase also offers lots of resources for birdwatchers, including over 10,000 different checklists from all countries, states, and provinces, and many smaller islands around the world. Those are available in several taxonomies and provide common name synonyms in over 200 languages, from Afrikaans to Zulu. There is an Avibase Flickr group to which nearly a thousand people contribute, and their photos are made available in the species pages, as well as in the form of illustrated checklists for any region of the world. I primarily designed Avibase with my own personal interests in mind as a birdwatcher, but I am very excited that so many people also find it useful. A relatively recent addition is the section called My Avibase, where people can maintain their life lists and generate cool reports that tell them where they can go next in their world adventures, and how many new species they can expect to see. By combining this with data from thousands of observers who contribute to eBird, I can also provide better estimates, for instance on which species they are most likely to find at different times of year, and more.

How long have you been in your field of study?

I started building the Avibase database just over 20 years ago, but it was launched as a web site around 2003. My original goal was mostly to create a personal database that contained all of the bird species and that I could use for tracking my own personal sightings. It then gradually evolved into something much bigger.

When did you first discover BHL?

A few years ago, through research for historical documents on archive.org and Google Books.

What is your opinion of BHL and how has it impacted your research?

BHL provides access to historical documents on bird taxonomy that are often difficult for me to access otherwise, particularly outside of an academic environment. Being able to access digital copies of these documents at my leisure is extremely convenient.

How often do you use BHL?

Regularly, but this varies.  Over the last year, it has been several times per month.

How do you usually use BHL?

I generally download the whole PDFs for my own local use.

What are your favorite features / services on BHL?

Access to digitized copies of historical documents. Since I am not located in a university or a place where I can access a comprehensive library, this is invaluable. Even if I had such access, having these documents available online, with their content indexed and searchable from my desk, is incredibly convenient. For instance, I often come across old names that are no longer in use and whose meaning I am trying to resolve.  Many of those names are only found in old historical documents (e.g., Hellmayr's Catalog of the Birds of the Americas), and having them simply available a Google search away is incredibly powerful.

If you could change one thing about BHL, what would it be, or what developmental aspect would you like the BHL team to focus on next?

Part of what I have been trying to do is convert the content of some of these documents into structured database pieces. This is very challenging in many respects. It assumes that the optical character recognition (OCR) is efficient and accurate, and that the information can be relatively easily parsed into its individual components. In several instances, I have found that the individual books used for scanning had many handwritten markings, which often disrupted the OCR process. I think that working on copies that are as clean as possible would be a desirable and reasonable objective. Improving the OCR process itself would also be incredibly valuable, but these books are probably more challenging than your average publication. They often rely heavily on abbreviations, symbols, and highly stylized fonts, and of course contain words such as scientific names that are not found in standard dictionaries.

If you had to choose one title/item in BHL that has most impacted your research, or one item that you prefer above any other in BHL, what would it be and why?

So far, undoubtedly the volumes of Peters' Check-list of Birds of the World. These represent a very important compendium of every bird in the world known at the time of publication and are used as the principal reference underlying most subsequent global taxonomic efforts. After a few years of manual labor by myself and a colleague, I am glad to say that we were able to convert major portions of the 16-volume series into a database that can be accessed openly here: http://avibase.bsc-eoc.org/peterschecklist.jsp. Efforts are underway to expand the database to include the synonymy information that was also included in Peters' checklist.

Thank you, Denis, for sharing your work on Avibase and how you use the Biodiversity Heritage Library!
