Performance of gender detection tools: a comparative study of name-to-gender inference services
Keywords: accuracy, gender detection, misclassification, name, name-to-gender, performance
Objective: To evaluate the performance of gender detection tools that allow the uploading of files (e.g., Excel or CSV files) containing first names, are usable by researchers without advanced computer skills, and are at least partially free of charge.
Methods: The study was conducted using four physician datasets (total number of physicians: 6,131; 50.3% female) from Switzerland, a multilingual country. Four gender detection tools met the inclusion criteria: three partially free (Gender API, NamSor, and genderize.io) and one completely free (Wiki-Gendersort). For each tool, we recorded the number of correct classifications (i.e., correct gender assigned to a name), misclassifications (i.e., wrong gender assigned to a name), and nonclassifications (i.e., no gender assigned). We computed three metrics: the proportion of misclassifications excluding nonclassifications (errorCodedWithoutNA), the proportion of nonclassifications (naCoded), and the proportion of misclassifications and nonclassifications (errorCoded).
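The three metrics can be sketched in Python as follows. This is a minimal illustration rather than the study's own code: the names `ToolCounts` and `performance_metrics` and the example counts are hypothetical, and the denominators reflect one reading of the definitions above (classified names only for errorCodedWithoutNA, all names for naCoded and errorCoded).

```python
from dataclasses import dataclass


@dataclass
class ToolCounts:
    """Counts recorded for one tool (hypothetical container, not from the study)."""
    correct: int        # correct gender assigned to a name
    misclassified: int  # wrong gender assigned to a name
    nonclassified: int  # no gender assigned


def performance_metrics(counts: ToolCounts) -> dict:
    """Return the three proportions as fractions between 0 and 1."""
    total = counts.correct + counts.misclassified + counts.nonclassified
    if total == 0:
        raise ValueError("at least one name is required")
    classified = counts.correct + counts.misclassified
    return {
        # misclassifications among names that received a gender
        "errorCodedWithoutNA": (
            counts.misclassified / classified if classified else float("nan")
        ),
        # names that received no gender, among all names
        "naCoded": counts.nonclassified / total,
        # both kinds of error, penalized equally, among all names
        "errorCoded": (counts.misclassified + counts.nonclassified) / total,
    }


# Made-up counts for illustration only (not data from the study).
example = ToolCounts(correct=90, misclassified=2, nonclassified=8)
metrics = performance_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under this reading, errorCoded can never be lower than naCoded, so a tool that leaves many names unclassified scores poorly on errorCoded even when its classified names are almost always correct.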
Results: The proportion of misclassifications was low for all four gender detection tools (errorCodedWithoutNA between 1.5 and 2.2%). By contrast, the proportion of unrecognized names (naCoded) varied: 0% for NamSor, 0.3% for Gender API, 4.5% for Wiki-Gendersort, and 16.4% for genderize.io. Using errorCoded, which penalizes both types of error equally, we obtained the following results: Gender API 1.8%, NamSor 2.0%, Wiki-Gendersort 6.6%, and genderize.io 17.7%.
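If errorCodedWithoutNA is taken to use only classified names as its denominator (an assumption about the definition, not something stated explicitly above), the three metrics are linked by errorCoded = naCoded + (1 - naCoded) × errorCodedWithoutNA. The sketch below uses this identity to back out the implied errorCodedWithoutNA for each tool from the reported naCoded and errorCoded values. The function name is hypothetical, and because the inputs are rounded percentages the implied values are only approximate.

```python
# Reported percentages from the Results: (naCoded, errorCoded) per tool.
reported = {
    "Gender API": (0.3, 1.8),
    "NamSor": (0.0, 2.0),
    "Wiki-Gendersort": (4.5, 6.6),
    "genderize.io": (16.4, 17.7),
}


def implied_error_without_na(na_coded_pct: float, error_coded_pct: float) -> float:
    """Solve errorCoded = naCoded + (1 - naCoded) * errorCodedWithoutNA.

    Inputs and output are percentages. Hypothetical helper for illustration.
    """
    na = na_coded_pct / 100
    err = error_coded_pct / 100
    return 100 * (err - na) / (1 - na)


implied = {
    tool: round(implied_error_without_na(na, err), 1)
    for tool, (na, err) in reported.items()
}
for tool, value in implied.items():
    print(f"{tool}: implied errorCodedWithoutNA is about {value}%")
```

The implied values fall within the reported range of 1.5 to 2.2%, which suggests that the differences in errorCoded between tools are driven mainly by nonclassifications rather than by misclassifications.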
Conclusions: Gender API and NamSor were the most accurate tools. Genderize.io led to a high number of nonclassifications. Wiki-Gendersort may be a good compromise for researchers wishing to use a completely free tool. Further studies are needed to evaluate the performance of these tools in other populations (e.g., Asian populations).
Copyright (c) 2021 Paul Sebo
This work is licensed under a Creative Commons Attribution 4.0 International License.