Analyzing University of Virginia Health publications using open data, Python, and Streamlit
Keywords: bibliometrics, open access, open data
As part of a larger project to understand the publishing choices of UVA Health authors and to support open access publishing, a team from the Claude Moore Health Sciences Library analyzed an open data set from Europe PMC, which includes metadata from PubMed records. We used the Europe PMC REST API to search for articles published in 2017–2020 with “University of Virginia” in the author affiliation field. We then parsed the JSON metadata in Python and used Streamlit to build a data visualization from our public GitHub repository. At present, the visualization shows the relative proportions of open access and subscription-only articles published by UVA Health authors. Although subscription services such as Web of Science, Scopus, and Dimensions allow users to run similar analyses, we believe this is a novel approach to this type of bibliometric research using open data and open source tools.
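The search-and-parse workflow described above could be sketched roughly as follows. This is a minimal illustration, not the code from our repository: the function names `fetch_pages` and `open_access_share`, the exact query string, and the page size are assumptions made for the example. It assumes the Europe PMC search endpoint returns JSON pages with a `resultList.result` array whose records carry `pubYear` and `isOpenAccess` ("Y" or "N") fields, and that paging follows the `cursorMark` / `nextCursorMark` convention.

```python
import json
import urllib.parse
import urllib.request
from collections import defaultdict

# Europe PMC REST search endpoint.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

# Assumed query: affiliation match plus a publication-year range.
QUERY = 'AFF:"University of Virginia" AND PUB_YEAR:[2017 TO 2020]'


def fetch_pages(query=QUERY, page_size=1000):
    """Yield parsed JSON result pages, following cursor-based pagination."""
    cursor = "*"
    while True:
        params = urllib.parse.urlencode({
            "query": query,
            "format": "json",
            "resultType": "lite",
            "pageSize": page_size,
            "cursorMark": cursor,
        })
        with urllib.request.urlopen(f"{SEARCH_URL}?{params}") as response:
            page = json.load(response)
        records = page.get("resultList", {}).get("result", [])
        if not records:
            break
        yield page
        next_cursor = page.get("nextCursorMark")
        # Stop when the service gives no new cursor to continue from.
        if not next_cursor or next_cursor == cursor:
            break
        cursor = next_cursor


def open_access_share(pages):
    """Count open access vs. all articles per publication year.

    Returns {year: {"open_access": int, "total": int, "share": float}}.
    Records without a publication year are grouped under "unknown".
    """
    counts = defaultdict(lambda: {"open_access": 0, "total": 0})
    for page in pages:
        for record in page.get("resultList", {}).get("result", []):
            year = record.get("pubYear", "unknown")
            counts[year]["total"] += 1
            if record.get("isOpenAccess") == "Y":
                counts[year]["open_access"] += 1
    return {
        year: {**c, "share": c["open_access"] / c["total"]}
        for year, c in sorted(counts.items())
    }
```

Calling `open_access_share(fetch_pages())` would query the live service and return per-year proportions; a table built from that result could then be passed to a Streamlit chart function such as `st.bar_chart`. Keeping the network call separate from the counting step makes the counting logic easy to check against a small hand-made sample.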