Dynamically generating T32 training documents using structured data


  • Paul James Albert, Samuel J. Wood Library, Weill Cornell Medicine, New York, NY, http://orcid.org/0000-0001-8220-272X
  • Ayesha Joshi, Weill Cornell Graduate School, Weill Cornell Medicine, New York, NY




Keywords: T32 Training Grants, National Institutes of Health, Author Disambiguation, Administrative Burden


Background: The US National Institutes of Health (NIH) funds academic institutions for training doctoral (PhD) students and postdoctoral fellows. These training grants, known as T32 grants, require schools to create, in a particular format, seven or eight Word documents describing the program and its participants. Weill Cornell Medicine aimed to use structured name and citation data to dynamically generate tables, thus saving administrators time.

Case Presentation: The authors’ team collected identity and publication metadata from existing systems of record, including our student information system and previous T32 submissions. These data were fed into our ReCiter author disambiguation engine. Well-structured bibliographic metadata, including the rank of the target author, were output and stored in a MySQL database. We then ran a database query that output a Word Extensible Markup Language (XML) document according to NIH’s specifications. The query that generates the T32 training document links faculty listed on a grant submission to the publications that they and their mentees authored, bolding author names as required. Because our source data are well-structured and well-defined, the only parameter the query needs is a single identifier for the grant itself. The open source code for producing this document is at http://dx.doi.org/10.5281/zenodo.2593545.
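The bolding step described above can be illustrated with a minimal Python sketch. This is not the published T32 Table Generator, which is implemented as a database query; it only shows, under stated assumptions, how a stored author rank could drive bold formatting in Word XML. The function names, the sample citation, and the reading of "rank" as a 1-based position in the author list are all hypothetical. The output is a WordprocessingML fragment, with the namespace declarations that a complete document would need left out.

```python
from xml.sax.saxutils import escape


def run(text, bold=False):
    """Return one WordprocessingML run (<w:r>), optionally bold."""
    props = "<w:rPr><w:b/></w:rPr>" if bold else ""
    return f'<w:r>{props}<w:t xml:space="preserve">{escape(text)}</w:t></w:r>'


def author_runs(authors, bold_ranks):
    """Render an author list, bolding authors at the given 1-based ranks.

    Assumption: "rank" means the author's position in the byline.
    """
    runs = []
    for rank, name in enumerate(authors, start=1):
        if rank > 1:
            runs.append(run(", "))  # separator is never bold
        runs.append(run(name, bold=rank in bold_ranks))
    return "".join(runs)


def citation_paragraph(authors, bold_ranks, rest):
    """Wrap the author runs and the rest of the citation in a paragraph."""
    return "<w:p>" + author_runs(authors, bold_ranks) + run(". " + rest) + "</w:p>"


# Hypothetical citation: the second author is the person to be bolded.
example = citation_paragraph(
    ["Doe J", "Smith A", "Lee K"],
    {2},
    "Hypothetical article title. Hypothetical Journal. 2018.",
)
print(example)
```

Storing the rank alongside each citation means the formatting step never has to match names as strings, so ambiguous or variant spellings do not affect which author is bolded.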

Conclusions: Manually writing a table for T32 grant submissions is a substantial administrative burden; some documents generated in this manner exceed 150 pages. Provided they have a source for structured identity and publication data, administrators can use the T32 Table Generator to readily output a table.


