In the spring of 2022, the BHL Cataloging and Metadata Committee investigated harvesting persistent identifiers (PIDs) from Wikidata as part of the group’s longstanding project to disambiguate and deduplicate author records in the BHL database. The aim of this one-time experimental data harvest was to see whether BHL could:
- Enhance BHL author records with additional PID data points;
- Improve the committee’s ability to disambiguate author names in the BHL database; and
- Respond to an outstanding user request from two of Wikimedia’s superstar editors, Siobhan Leachman and Andy Mabbett, to expose BHL’s author data on BHL and include hyperlinks to other authoritative knowledge bases on the web.
In particular, Wikimedians wanted to see the Wikidata Q identifier exposed as a link to the corresponding creator item in Wikidata.
There are multiple motivations for undertaking this work. By adding the BHL Creator ID to the corresponding Wikidata item, Wikidata editors help link BHL to the richer biographical data about that person held in Wikidata. The Wikidata item for a person may contain links to their Wikipedia page or to images of the person held in Wikimedia Commons, the Wikimedia image repository. Wikidata items also act as identifier hubs, containing links to other databases and identifiers.
By adding the BHL Creator ID to this list of identifiers, the Wikidata editor is linking the content held in BHL to the content held in multiple other datasets and repositories.
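To make the "identifier hub" idea concrete, here is a minimal sketch of how a set of identifiers on one Wikidata item could be turned into outbound links. It is an illustration only, not the committee's harvest code. The function name `hub_links`, the placeholder item `Q0`, and all identifier values are hypothetical. The property numbers (P4081 for BHL creator ID, P214 for VIAF ID, P496 for ORCID iD) and the URL patterns are given from memory and should be checked against Wikidata before use.

```python
# Assumed mapping from Wikidata property IDs to URL patterns.
# Verify both the property numbers and the patterns against Wikidata.
FORMATTER_URLS = {
    "P4081": "https://www.biodiversitylibrary.org/creator/{}",  # BHL creator ID
    "P214": "https://viaf.org/viaf/{}",                         # VIAF ID
    "P496": "https://orcid.org/{}",                             # ORCID iD
}


def hub_links(qid, identifiers):
    """Return outbound links for one Wikidata item (hypothetical helper).

    qid         -- the item's Q identifier, e.g. "Q0"
    identifiers -- dict mapping property ID to the identifier value
    """
    links = {"Wikidata": "https://www.wikidata.org/wiki/" + qid}
    for prop, value in identifiers.items():
        template = FORMATTER_URLS.get(prop)
        if template is not None:  # skip properties this sketch does not know
            links[prop] = template.format(value)
    return links


# Made-up values for illustration; they do not describe a real person.
sample = hub_links("Q0", {"P4081": "12345", "P214": "0000000", "P9999999": "x"})
for name, url in sample.items():
    print(name, url)
```

Once the BHL Creator ID sits alongside the other identifiers on the item, the same lookup that reaches BHL also reaches every other linked dataset, which is what makes the item useful as a hub.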
These extra author data points give Wikimedians and BHL catalogers crucial clues that aid in name disambiguation. Hyperlinks to other knowledge bases are especially valuable because they open new knowledge pathways that help confirm a person’s identity in a complex game of “Who’s Who?”