The inconsistency of h-index: A mathematical analysis
Publication date
2020
Publisher
Elsevier
Citation
Brito, R. and Rodriguez Alonso, A. The inconsistency of h-index: A mathematical analysis. Journal of Informetrics, 15, 101106 (2021)
Abstract
Citation distributions are lognormal. We use 30 lognormally distributed synthetic series of numbers that simulate real series of citations to investigate the consistency of the h index. Using the lognormal cumulative distribution function, the equation that defines the h index can be formulated; this equation shows that h has a complex dependence on the number of papers (N). We also investigate the correlation between h and the number of papers exceeding various citation thresholds, from 5 to 500 citations. The best correlation is obtained for the threshold of 100 citations, but numerous data points deviate from the general trend. The size-independent indicator h/N shows no correlation with the probability of publishing a paper exceeding any of the citation thresholds. In contrast with the h index, the total number of citations shows a high correlation with the number of papers exceeding the thresholds of 10 and 50 citations, and the mean number of citations correlates with the probability of publishing a paper that exceeds any of the citation levels. Thus, in synthetic series, the total number of citations and the mean number of citations are much better indicators of research performance than h and h/N. We also discuss further difficulties that arise in real citation distributions.
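The dependence of h on N described in the abstract can be illustrated with a minimal sketch. It assumes the defining condition takes the form N(1 - F(h)) = h, where F is the lognormal cumulative distribution function; this is one plausible reading and not necessarily the exact equation used in the paper. The parameter names `mu` and `sigma`, the function names, and the truncation of sampled values to integer citation counts are all illustrative assumptions.

```python
import math
import random


def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def lognormal_cdf(x, mu, sigma):
    """CDF of a lognormal distribution with log-mean mu and log-std sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def expected_h(n_papers, mu, sigma):
    """Solve n_papers * (1 - F(h)) = h for h by bisection.

    Assumed form of the defining equation (not taken from the paper).
    The left side minus h is positive at h = 0 and negative at h = n_papers,
    and it decreases monotonically, so bisection converges.
    """
    lo, hi = 0.0, float(n_papers)
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if n_papers * (1.0 - lognormal_cdf(mid, mu, sigma)) > mid:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def synthetic_series(n_papers, mu, sigma, seed=0):
    """Hypothetical synthetic citation series: lognormal draws truncated to integers."""
    rng = random.Random(seed)
    return [int(rng.lognormvariate(mu, sigma)) for _ in range(n_papers)]
```

Calling `h_index(synthetic_series(500, 2.0, 1.1))` and `expected_h(500, 2.0, 1.1)` for several values of the first argument gives a rough picture of how h changes with N when the underlying distribution is held fixed; the values 2.0 and 1.1 are arbitrary illustrative choices.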