Defining data librarianship: a survey of competencies, skills, and training




Keywords: Data Management, Data Science, Data Librarianship


Objectives: Many librarians are taking on new roles in research data services. However, the emerging field of data librarianship, including its specific roles and competencies, has not been clearly defined. This study aims to better define data librarianship by exploring the skills and knowledge that data librarians use and the training they need to succeed.

Methods: Librarians who do data-related work were surveyed about their work and educational backgrounds and asked to rate the relevance of a set of data-related skills and knowledge to their work.

Results: Respondents considered a broad range of skills and knowledge important to their work, especially “soft skills” and personal characteristics, like communication skills and the ability to develop relationships with researchers. Traditional library skills like cataloging and collection development were considered less important. A cluster analysis of the responses revealed two types of data librarians: data generalists, who tend to provide data services across a variety of fields, and subject specialists, who tend to provide more specialized services to a distinct discipline.

Discussion: The findings of this study suggest that data librarians provide a broad range of services to their users and, therefore, need a variety of skills and expertise. Libraries hiring a data librarian may wish to consider whether their communities will be best served by a data generalist or a subject specialist and write their job postings accordingly. These findings also have implications for library schools, which could consider adjusting their curricula to better prepare their students for data librarian roles.

This article has been approved for the Medical Library Association’s Independent Reading Program.


Original Investigation