Blogs as data: using XQuery for content evaluation

Authors

  • Eli Wachter, Michigan State University
  • Elizabeth A. Mullen, National Library of Medicine

DOI:

https://doi.org/10.5195/jmla.2026.2338

Keywords:

History of Medicine, Data Visualization, Data Analysis, National Library of Medicine

Abstract

Circulating Now, the history of medicine blog of the National Library of Medicine (NLM), highlights posts written by community contributors. To evaluate the community represented in the blog, the project team explored how XQuery, a language for querying XML data, could be used to build a dataset of the institutions represented there. The team used ChatGPT to develop the XQuery script and ran the queries in BaseX. The resulting data was transferred to Excel, where additional data elements, such as geographic location and institutional type, were added manually. From this dataset, the team created visualizations in Tableau showing the more than 400 unique institutions worldwide that are represented in the blog. These visualizations supplemented an internal report for the Circulating Now Editorial Board, illustrating the blog's current engagement reach and areas for possible future collaboration.
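
The extraction step described in the abstract can be illustrated with a minimal sketch. It is not the team's XQuery script, which is not reproduced here; it uses Python's standard library in place of XQuery and BaseX to show the same idea of pulling institution names out of XML and tallying them for a spreadsheet. The element names (posts, post, author, institution), the sample records, and the function names are all hypothetical assumptions, since the structure of the blog export is not given.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data: the real blog export's element names and
# contents are not given in the source, so everything below is assumed.
SAMPLE_XML = """\
<posts>
  <post>
    <title>Post A</title>
    <author>
      <name>Author One</name>
      <institution>Example University</institution>
    </author>
  </post>
  <post>
    <title>Post B</title>
    <author>
      <name>Author Two</name>
      <institution>  Example   University </institution>
    </author>
  </post>
  <post>
    <title>Post C</title>
    <author>
      <name>Author Three</name>
      <institution>Example Historical Society</institution>
    </author>
  </post>
  <post>
    <title>Post D</title>
    <author>
      <name>Author Four</name>
      <institution></institution>
    </author>
  </post>
</posts>
"""


def count_institutions(xml_text):
    """Tally author affiliations found in the (assumed) XML structure."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for node in root.iterfind(".//author/institution"):
        # Collapse stray whitespace so variant spacings count as one name.
        name = " ".join((node.text or "").split())
        if name:  # skip authors with no affiliation recorded
            counts[name] += 1
    return counts


def to_csv(counts):
    """Render the tally as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["institution", "mentions"])
    for name, n in sorted(counts.items()):
        writer.writerow([name, n])
    return buffer.getvalue()


counts = count_institutions(SAMPLE_XML)
csv_text = to_csv(counts)
print(csv_text)
```

In a workflow like the one described, the CSV output would be the point at which manually added columns such as geographic location and institutional type are appended before visualization.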

Author Biography

Elizabeth A. Mullen, National Library of Medicine

Managing Editor for Circulating Now

Published

2026-04-13 (updated 2026-04-13)

Section

Virtual Project