Using Data to Improve BHL Social Media

How can data analysis help us improve our social media activities?

This was the fundamental question BHL’s Outreach and Communication Manager, Grace Costantino, sought to address during her two-day meeting with Ryerson University’s Social Media Lab, Sept. 29-30, 2014. As co-participants in the Mining Biodiversity Digging Into Data project, BHL has been collaborating virtually with Dr. Anatoliy Gruzd, the Lab’s Director, since early 2014. The Social Media and Society 2014 conference, organized by the Lab, offered an ideal opportunity for Ms. Costantino and Dr. Gruzd to meet in person and explore BHL’s social network using tools developed by the Lab.

Analyzing BHL’s Twitter Network via Netlytic

The Social Media Lab studies how social media and web 2.0 technologies affect society, communication, and information dissemination. The research team has developed several tools to analyze and visualize social networks. One such tool, Netlytic, is a free cloud-based app that uses public APIs to analyze social media conversations, helping community managers discover trending topics, growth, conversation reach, and influential participants. Currently supported data sources include Twitter, Facebook groups and pages, Instagram posts, and YouTube comments.

During the meeting, Ms. Costantino and Dr. Gruzd used Netlytic to analyze BHL’s Twitter network, capturing conversations originated by or occurring about BHL. Because the analysis had only just begun, Twitter API limitations restricted data capture to conversations occurring on or after 9/22/2014.

Netlytic reports uncover popular words within conversations, which can be grouped to display popular topics. They also present the entire network as a visualized cluster that displays the connections and conversations among participants.

Keyword analysis revealed that, not surprisingly, “biodiversity” and “species” were two of the most popular words occurring in recent BHL-related conversations. Also not surprising were the terms “open access” and “smsociety14” (the hashtag for the Social Media and Society conference). Other popular words, however, include “beer,” “birds,” and “images.” Netlytic allows you to delve deeper into each word occurrence, listing the related tweets. An investigation of “beer” revealed that Oktoberfest is a popular topic among people following or discussing BHL content. “Images” confirmed what our observational analysis already told us – people love our illustrations (evidenced by the overwhelming success of Flickr).
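The keyword step described above can be approximated with a simple word count. The following is a minimal sketch, not Netlytic's actual method: the sample tweets, the stop-word list, and the function names `tokenize`, `top_words`, and `tweets_containing` are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical sample tweets; real data would come from an export of the dataset.
tweets = [
    "New species described in the biodiversity literature #BHLib",
    "Raise a beer to Oktoberfest with @BioDivLibrary",
    "Beautiful images of birds from the Biodiversity Heritage Library",
    "Open access biodiversity images for everyone",
]

# A tiny illustrative stop-word list (an assumption, not Netlytic's list).
STOP_WORDS = {"a", "the", "of", "to", "in", "for", "with", "from", "new"}


def tokenize(text):
    """Lowercase a tweet and return its words, skipping stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def top_words(tweets, n=5):
    """Return the n most frequent words across all tweets."""
    counts = Counter()
    for text in tweets:
        counts.update(tokenize(text))
    return counts.most_common(n)


def tweets_containing(tweets, word):
    """List the tweets in which a given word occurs."""
    return [text for text in tweets if word.lower() in tokenize(text)]


print(top_words(tweets, 2))
print(tweets_containing(tweets, "beer"))
```

Listing the tweets behind a word, as `tweets_containing` does here, mirrors the way we drilled into “beer” to find the Oktoberfest connection.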

The appearance of “birds” might suggest that we have more bird-lovers among our followers. However, review of the related tweets showed that we published a good deal of material about birds over the past week and a half, including our Birds of Paradise blog post, some tweets about Florence Bailey’s bird books, and a retweet of PenAndFeather’s post on BHL’s Album de Aves Amazonicas.

An analysis of the participants in our network revealed that 373 Twitter accounts (represented as nodes in the visualization) have posted content about us since 9/22/2014, with 858 “ties” (or links between participants, meaning that they engaged in some sort of interaction with each other about BHL, through either @mentions, @replies, or retweets).
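To make the idea of nodes and ties concrete, here is a small sketch that derives both from tweet text. It assumes that a tie is a unique directed pair (author, mentioned account) and that retweets and replies show up as @mentions in the text. How Netlytic itself counts ties may differ, and the sample data and the name `build_ties` are hypothetical.

```python
import re

# Hypothetical (author, text) pairs standing in for captured tweets.
tweets = [
    ("alice", "Loving the illustrations from @BioDivLibrary"),
    ("bob", "RT @BioDivLibrary: Birds of Paradise blog post"),
    ("BioDivLibrary", "@alice thank you for sharing!"),
    ("carol", "@bob did you see the @BioDivLibrary post?"),
]

MENTION = re.compile(r"@(\w+)")


def build_ties(tweets):
    """Return (nodes, ties): accounts seen, and unique directed author->mention pairs."""
    nodes = set()
    ties = set()
    for author, text in tweets:
        source = author.lower()
        nodes.add(source)
        for handle in MENTION.findall(text):
            target = handle.lower()
            if target != source:  # ignore self-mentions
                nodes.add(target)
                ties.add((source, target))
    return nodes, ties


nodes, ties = build_ties(tweets)
print(len(nodes), "nodes,", len(ties), "ties")
```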

A deeper analysis of the network uncovers who is talking to whom and what they’re talking about. The network visualization is based on clusters, and nodes within the cluster are linked to other nodes via lines that indicate interactions. A single line from one node to another represents a one-sided conversation, i.e., one account tweeted a message that includes a mention of the connected account. When nodes are looped together (a line connects one node to a second, and the second is connected back to the first with another line), this indicates dialogue between the nodes: one account mentions another, and that account in turn mentions the first. Each line counts toward either the indegree or the outdegree centrality of the selected node. An outdegree tie means that the selected node (i.e., the Twitter account) mentioned the connected node. An indegree tie means that the selected node was mentioned by the connected account.

For instance, BHL’s node has a strong indegree centrality, meaning that we are mentioned in other people’s tweets more often than we mention other people in a tweet. During this period, we were mentioned by 343 Twitter users. This is largely accounted for by retweets. Our node operates largely on a broadcasting model, where we publish information that is disseminated by others via retweets. This is obvious visually as the “BioDivLibrary” dot is larger than the surrounding dots in the image below, and in general all nodes tie back to ours. However, we mentioned 82 different Twitter accounts as well during this time period, indicating that we are engaging in dialogue with our users regularly.
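The indegree, outdegree, and looped-tie ideas above can be sketched in a few lines. This is an illustration under stated assumptions, not Netlytic's implementation: the tie list is invented, each unique directed pair counts once, and the function names `degree_counts` and `reciprocal_pairs` are hypothetical.

```python
from collections import Counter

# Hypothetical directed ties: (account that tweeted, account that was mentioned).
ties = [
    ("alice", "biodivlibrary"),
    ("bob", "biodivlibrary"),
    ("carol", "biodivlibrary"),
    ("biodivlibrary", "alice"),
    ("carol", "bob"),
]


def degree_counts(ties):
    """Count, per account, how many distinct accounts mention it and it mentions."""
    unique = set(ties)
    indegree = Counter(target for _, target in unique)
    outdegree = Counter(source for source, _ in unique)
    return indegree, outdegree


def reciprocal_pairs(ties):
    """Return pairs of accounts that mention each other (looped ties)."""
    unique = set(ties)
    return {tuple(sorted((a, b))) for a, b in unique if (b, a) in unique}


indegree, outdegree = degree_counts(ties)
print("mentioned by:", indegree["biodivlibrary"])
print("mentions:", outdegree["biodivlibrary"])
print("dialogue:", reciprocal_pairs(ties))
```

In this toy network the library account has a higher indegree than outdegree, which is the broadcasting pattern described above, while the one looped pair marks an actual dialogue.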

The closer a node is to the more concentrated part of the cluster, the more connections it has within the network, or the more people within the network that the user is interacting with. Nodes stretching far outside the hub of the network often only have a single tie within the cluster, meaning they are not highly engaged in our community.
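A rough way to find these loosely connected accounts is to count how many distinct accounts each node is tied to. The sketch below treats ties as undirected and flags nodes with exactly one neighbor. Position in the visualization is determined by a layout algorithm, so this is only a proxy, and the sample ties and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical directed ties between accounts.
ties = [
    ("alice", "biodivlibrary"),
    ("biodivlibrary", "alice"),
    ("bob", "biodivlibrary"),
    ("carol", "biodivlibrary"),
    ("carol", "bob"),
    ("dave", "biodivlibrary"),
]


def neighbors(ties):
    """Map each account to the set of accounts it is connected to, in either direction."""
    adjacency = defaultdict(set)
    for source, target in ties:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def peripheral_nodes(ties):
    """Return accounts connected to exactly one other account."""
    return sorted(node for node, linked in neighbors(ties).items() if len(linked) == 1)


print(peripheral_nodes(ties))
```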

Netlytic is also useful for discovering conversations occurring about BHL that we might be unaware of. All of the nodes within the main group of clusters include either @BioDivLibrary, “Biodiversity Heritage Library,” or #BHLib in the tweet. These are conversations that we can easily track and follow via Twitter. However, the @BiodiversityNew network includes interactions among users that don’t include #BHLib, @BioDivLibrary, or “Biodiversity Heritage Library” in their tweets. Instead, they include links that resolve back to our blog posts. There is no easy way to discover this conversation in Twitter, but thanks to Netlytic, we can not only find it but also engage with the participants in that node and encourage them to follow us.
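The filtering logic behind this kind of discovery can be sketched as follows. This is not how Netlytic works internally. It assumes each tweet record already carries its links in resolved (unshortened) form under a hypothetical `expanded_urls` field, and the blog host name used here is an assumption for illustration.

```python
from urllib.parse import urlparse

# Identifiers we already track on Twitter (compared in lowercase).
BHL_MARKERS = ("@biodivlibrary", "#bhlib", "biodiversity heritage library")

# Assumed host name for the blog; substitute the real one.
BLOG_HOST = "blog.biodiversitylibrary.org"

# Hypothetical tweet records with already-resolved links.
tweets = [
    {"user": "alice", "text": "Great post from @BioDivLibrary",
     "expanded_urls": ["https://blog.biodiversitylibrary.org/post-a"]},
    {"user": "bob", "text": "Fascinating read on birds of paradise",
     "expanded_urls": ["https://blog.biodiversitylibrary.org/post-b"]},
    {"user": "carol", "text": "Lunch was great today",
     "expanded_urls": ["https://example.com/lunch"]},
]


def mentions_bhl(text):
    """True if the tweet text contains one of the tracked identifiers."""
    lowered = text.lower()
    return any(marker in lowered for marker in BHL_MARKERS)


def links_to_blog(urls):
    """True if any link points at the blog host."""
    return any(urlparse(url).hostname == BLOG_HOST for url in urls)


def hidden_conversations(tweets):
    """Tweets that link to the blog but carry none of the tracked identifiers."""
    return [t for t in tweets
            if links_to_blog(t["expanded_urls"]) and not mentions_bhl(t["text"])]


print([t["user"] for t in hidden_conversations(tweets)])
```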


So What?

A community manager can spend hours (days!) analyzing a network, discovering users and related conversations. But how can this help us improve our social media efforts?


  1. Trending vs. Long-Term Analysis: Since we’ve only recently started tracking BHL’s Twitter via Netlytic, we have a limited range of data available for analysis. However, the longer we track our network, the more data we will have to work with. This will allow us to get a more accurate picture of our network and answer questions that require longitudinal study and long-range data.
    1. Who are all the players interacting with us? Some users who might regularly engage with us may not have been active during the period analyzed, and thus are not captured as part of our network.
    2. What topics are consistently popular among our users? Terms like “beer” were popular now because of Oktoberfest, but likely will not remain so.
    3. What is the true size of our network, and is it showing growth? Furthermore, when we see rapid growth, we can investigate the content we published and the most active users in our network during that period to discover what and who is influential with regard to our growth.
    4. However, it is still useful to capture trends so that we can publish event-specific, time-relevant content. Netlytic allows you to limit your analysis to a particular time period, allowing us to see what topics we should be publishing about now (based on trends) vs. regularly (based on longitudinal study).
  2. Audience-Informed Content: In general, an analysis of popular keywords allows us to see what resonates most with our users. This will allow us to better craft our posts to satisfy the appetite of our network.
  3. Is our Community Stable and Loyal? Analyzing the users that engage with us over time can help us see not only whether we see new users participating (showing network growth), but whether existing users continue to interact with us. We want to grow a stable, dedicated community that consistently helps to promote our content. If we do not see strong follower loyalty, we will need to compare our content and tone to those popular among our users and tweak our strategies accordingly.
  4. Discover New Conversations and Communities: As explained earlier, Netlytic can help us discover conversations occurring about us that we are otherwise unaware of. We can also identify new communities with an interest in our material but as yet unengaged with our brand. Engaging in these conversations and networks will help us grow our audience.
  5. Discover Influencers and Collaborators: Network analysis can also help us see which users are most connected with other users, particularly those with strong inbound (indegree) centrality. These users are reaching many other users and successfully inspiring engagement. These are important relationships for us to cultivate as these individuals may represent our most influential advocates. BHL also actively collaborates with other nodes. Network analysis will help us see whether these collaborations are reaching new and expanding audiences. We should increase collaborations with those demonstrating the greatest growth and connections. For those lacking such indicators, it may be appropriate to shift effort elsewhere.
  6. Convert Outliers to Champions: As mentioned, nodes furthest removed from a network’s concentrated center are the least engaged in the community but have demonstrated interest in our content. These users may also represent people with access to communities we do not yet engage with. We should strategically engage with select users in these categories to encourage further investment in our community, which may lead to advocates in new communities.
  7. Less Broadcasting, More Conversation: The Social Media and Society Conference revealed that science-related Twitter accounts often operate on a broadcasting model. They are disseminators of information, but not strong facilitators of dialogue. Within the BHL network, we see a bias towards broadcasting, but many looped ties were present, indicating that we are actively engaging with our users. One of our outreach goals is to foster and promote a dialogue among ourselves, researchers, librarians, citizen scientists, and bioinformaticians. This requires us to engage with more users, using the strategies outlined above. Continual analysis of our network will reveal whether we are succeeding and indicate when new strategies are necessary.
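The questions about network growth and community loyalty raised in the list above come down to comparing who was active in one period with who was active in the next. Here is a minimal sketch of that comparison. The account names and the function `compare_periods` are hypothetical, and the retention figure is simply the share of earlier participants who appear again, which is one of several reasonable definitions.

```python
def compare_periods(previous_users, current_users):
    """Compare the accounts active in two periods of network data."""
    previous = set(previous_users)
    current = set(current_users)
    returning = previous & current  # loyal: active in both periods
    new = current - previous        # growth: first seen in the current period
    lapsed = previous - current     # not seen again in the current period
    retention = len(returning) / len(previous) if previous else 0.0
    return {
        "returning": returning,
        "new": new,
        "lapsed": lapsed,
        "retention": retention,
    }


# Hypothetical accounts that interacted with us in two consecutive periods.
september = ["alice", "bob", "carol", "dave"]
october = ["alice", "bob", "erin"]

report = compare_periods(september, october)
print(report["retention"])
print(sorted(report["new"]))
```

Tracking these numbers period over period would show both whether new users are joining and whether existing users keep coming back.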

It’s a brave new world of social media analysis. Emerging tools allow us to discover our networks, engage with our users, and refine our strategies like never before. As BHL continues to collaborate with the Social Media Lab and analyze our social networks (Twitter is just the beginning), we will improve our ability to tell people why biodiversity is unbelievably awesome!


The BHL Outreach and Communication Manager position, and this meeting attendance, are made possible in part through funding from the Institute of Museum and Library Services (Grant number LG-00-14-0032-14).

Written by Grace Costantino

Grace Costantino served as the Outreach and Communication Manager for the Biodiversity Heritage Library from 2014 to 2021. In this capacity, she developed and managed BHL's communication strategy, oversaw social media initiatives, and engaged with the public to excite audiences about the wealth of biodiversity heritage available in BHL. Prior to her role as Outreach and Communication Manager, Grace served as the Digital Collections Librarian for Smithsonian Libraries and as the Program Manager for BHL.