Methodology

Purpose

Highly Cited Researchers from Clarivate Analytics is an annual list recognizing leading researchers in the sciences and social sciences from around the world. The final new list contains about 3,400 Highly Cited Researchers in 21 fields of the sciences and social sciences. The 2017 list focuses on contemporary research achievement: only Highly Cited Papers in science and social sciences journals indexed in the Web of Science Core Collection during the 11-year period 2005-2015 were surveyed. Highly Cited Papers are defined as those that rank in the top 1% by citations for field and publication year in the Web of Science. These data derive from Essential Science Indicators (ESI). The fields are also those employed in ESI – 21 broad fields defined by sets of journals and, exceptionally, in the case of multidisciplinary journals such as Nature and Science, by a paper-by-paper assignment to a field. This percentile-based selection method removes the citation advantage of older papers relative to recently published ones, since papers are weighed against others in the same annual cohort.
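The top 1% rule can be illustrated with a minimal sketch. It assumes each paper is a record with hypothetical `field`, `year`, and `citations` keys, and that the cut-off is rounded up within each field-and-year cohort. How ESI actually rounds or handles ties is not stated here, so that part is an assumption.

```python
from collections import defaultdict

def highly_cited_papers(papers, top_percent=1):
    """Return papers in the top `top_percent` by citations within each (field, year) cohort."""
    cohorts = defaultdict(list)
    for paper in papers:
        cohorts[(paper["field"], paper["year"])].append(paper)

    selected = []
    for cohort in cohorts.values():
        ranked = sorted(cohort, key=lambda p: p["citations"], reverse=True)
        # Assumption: keep ceil(top_percent % of the cohort), computed with integers.
        keep = (len(ranked) * top_percent + 99) // 100
        selected.extend(ranked[:keep])
    return selected
```

Because each paper is compared only with others of the same field and publication year, a 2015 paper does not compete against a 2005 paper that has had ten more years to accumulate citations.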

Those researchers who, within an ESI-defined field, published Highly Cited Papers were judged to be influential, so the production of multiple top 1% papers was interpreted as a mark of exceptional impact. Relatively younger researchers are more likely to emerge in such an analysis than in one dependent on total citations over many years. Recognizing early- and mid-career as well as senior researchers was one of the goals in generating the new list. The determination of how many researchers to include in the list for each field was based on the population of each field, as represented by the number of author names appearing on all Highly Cited Papers in that field, 2005-2015. The ESI fields vary greatly in size, with Clinical Medicine being the largest and Mathematics being the smallest. The square root of the number of author names indicated how many individuals should be selected.

One of two criteria for selection was that the researcher needed enough citations to his or her Highly Cited Papers to rank in the top 1% by total citations in the ESI field in which that person was considered. Authors of Highly Cited Papers who met this criterion in a field were ranked by number of such papers, and the threshold for inclusion was determined using the square root of the field's population, as represented by the number of author names in that field. All who published Highly Cited Papers at the threshold level were admitted to the list, even if the final list then exceeded the number given by the square root calculation. In addition, and as a concession to the somewhat arbitrary cut-off, any researcher with one fewer Highly Cited Paper than the threshold number was also admitted to the list if total citations to his or her Highly Cited Papers were sufficient to rank that individual in the top 50% by total citations of those at the threshold level or higher. The justification for this adjustment at the margin is that it seemed to work well in identifying influential researchers, in the judgment of Clarivate Analytics’ citation analysts.

Of course, there are many highly accomplished and influential researchers who are not recognized by the method described above and whose names do not appear in the new list. This outcome would hold no matter what specific method was chosen for selection. Each measure or set of indicators, whether total citations, h-index, relative citation impact, mean percentile score, etc., accentuates different types of performance and achievement. Here we arrive at what many expect from such lists but what is really unobtainable: that there is some optimal or ultimate method of measuring performance. The only reasonable approach to interpreting a list of top researchers such as ours is to fully understand the method behind the data and results, and why the method was used. With that knowledge, in the end, the results may be judged by users as relevant or irrelevant to their needs or interests.

 

Methodology

The data used in the analysis and selection of the new Highly Cited Researchers came from Essential Science Indicators (ESI), 2005-2015, which then included 134,832 Highly Cited Papers. Each of these papers ranked in the top 1% by total citations according to its ESI field assignment and year of publication. For more information on the identification of Highly Cited Papers in Essential Science Indicators, see the ESI help file at Essential Science Indicators.

 

Essential Science Indicators

Essential Science Indicators surveys the Science Citation Index Expanded and Social Sciences Citation Index components of the Web of Science, meaning journal articles in the sciences and social sciences. The analysis is further limited to items indexed as articles or reviews only, and does not include letters to the editor, correction notices, and other marginalia.

 

Classification

In Essential Science Indicators, all papers, including Highly Cited Papers, are assigned to one of 22 broad fields (the 22nd is Multidisciplinary, on which see below). Each journal in Essential Science Indicators is assigned to only one field and papers appearing in that title are similarly assigned. In the case of multidisciplinary journals such as Science, Nature, Proceedings of the National Academy of Sciences of the USA, and others, however, a special analysis is undertaken. Each article in such publications is individually reviewed, including an examination of the journals cited in its references. The paper is then reclassified to the most frequently occurring category represented by the article’s cited references. For more information about this reclassification process, see our article at Classification of Papers in Multidisciplinary Journal.
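As a rough illustration of the reclassification step, the sketch below assigns a paper to the field that occurs most often among the journals it cites. The function name, the `journal_field` lookup table, and the handling of ties and unmapped journals are assumptions for illustration, not a description of the production process.

```python
from collections import Counter

def reclassify(cited_journals, journal_field):
    """Return the most frequent ESI field among a paper's cited journals, or None."""
    # Cited journals with no single-field assignment (e.g. other
    # multidisciplinary titles) are skipped; this is an assumption.
    counts = Counter(journal_field[j] for j in cited_journals if j in journal_field)
    if not counts:
        return None
    # Ties resolve to the field encountered first; the real tie rule is not stated.
    return counts.most_common(1)[0][0]
```

For example, a Nature paper whose references point mostly to physics journals would be counted in Physics rather than Multidisciplinary.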

 

Author Disambiguation

A ranking of author names in each ESI category by number of Highly Cited Papers produced during 2005-2015 determined the identification and selection of our new list of Highly Cited Researchers. We used algorithmic analysis to help distinguish between individuals with the same name or name form (surname and initials). In instances where any ambiguity remained, manual inspection was needed. This entailed searching for papers by author surname and one or multiple initials, ordering them chronologically, visually inspecting each (noting journal of publication, research topic or theme, institutional addresses, co-authorships, and other attributes), and deciding which ones could be attributed to a specific individual. As noted in the FAQ section, we examined original papers, if necessary, as well as the websites of researchers themselves and their curricula vitae. This was often required if a researcher changed institutional affiliations several times during the period surveyed.

 

Getting to the final result

Once the data on Highly Cited Papers within an ESI field were verified and assigned to specific individuals, the authors in the field were ranked by number of Highly Cited Papers. To determine how many researchers to select for inclusion in the new list, we considered the size of each ESI field in terms of the number of authors (as a proxy for population) represented on the Highly Cited Papers for the field. The ESI fields are of very different sizes, a result of how each field is defined, including the number of journals assigned to it. Clinical Medicine, for example, makes up some 18.6% of the content of ESI while Economics and Business, Immunology, Microbiology, and Space Science (Astronomy and Astrophysics) account for 1.8%, 1.8%, 1.4%, and 1.0%, respectively. For each ESI field, author names (before use of the disambiguation algorithm and therefore not disambiguated) were counted, and then the square root of that number was calculated. That number was used to decide approximately how many researchers to include in each ESI field. From the list of authors in a field ranked by number of Highly Cited Papers, the number of papers at the rank represented by the square root score determined the threshold number of Highly Cited Papers required for inclusion. Authors who had one fewer Highly Cited Paper than this threshold, but whose citations to their Highly Cited Papers were sufficient to rank them in the top 50% by citations among those with Highly Cited Papers at or above the threshold, were also selected. In addition, citations to an individual’s Highly Cited Papers had to meet or exceed the threshold for total citations used in the 2005-2015 version of ESI for including a researcher in the top 1% (highly cited list) for an ESI field.
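The selection rule for a single field can be sketched as follows. This is one reading of the procedure under stated assumptions: the square root is rounded down, the citation criterion is applied before ranking, and "top 50% by citations" is taken to mean at or above the median of those at or above the threshold. The record keys (`name`, `hcp`, `citations`) and the function name are hypothetical.

```python
import math
from statistics import median

def select_researchers(authors, n_author_names, min_total_citations):
    """Return (threshold, selected authors) for one ESI field.

    authors: dicts with 'name', 'hcp' (Highly Cited Paper count), 'citations'.
    n_author_names: raw author-name count on the field's Highly Cited Papers.
    min_total_citations: ESI top-1% citation threshold for the field.
    """
    target = math.isqrt(n_author_names)  # assumption: square root rounded down
    eligible = [a for a in authors if a["citations"] >= min_total_citations]
    ranked = sorted(eligible, key=lambda a: a["hcp"], reverse=True)
    if not ranked or target == 0:
        return 0, []

    # Paper count held by the author sitting at rank `target` (1-based).
    threshold = ranked[min(target, len(ranked)) - 1]["hcp"]
    core = [a for a in ranked if a["hcp"] >= threshold]

    # Marginal rule: one paper short of the threshold, but citations in the
    # top half of the core group (assumption: at or above its median).
    cutoff = median(a["citations"] for a in core)
    marginal = [a for a in ranked
                if a["hcp"] == threshold - 1 and a["citations"] >= cutoff]
    return threshold, core + marginal
```

Everyone tied at the threshold is admitted, so the selected list can be longer than the square root number, as described above.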

In a few fields, such as chemistry, engineering, and materials science, there are many Chinese names that appear on Highly Cited Papers. These name forms (especially surname and initials) often represent multiple researchers. Manual inspection often results in removal of a name since none of the individuals represented by the name form qualify for selection. This adjustment occurs so frequently in certain fields that there are significantly fewer researchers than the square root number who have published the threshold number of Highly Cited Papers determined from analysis of the raw data. In these cases, the threshold number of Highly Cited Papers in a field is reduced until the square root number of disambiguated researchers is obtained. For example, before this procedure the required number of Highly Cited Papers in Chemistry, 2005-2015, was 21 but after disambiguation and removal of false positives the threshold number was 15.
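The threshold reduction can be pictured as a simple loop, sketched here under the assumption that the threshold is lowered one paper at a time. The function and parameter names are hypothetical.

```python
def adjusted_threshold(hcp_counts, target, raw_threshold):
    """Lower the paper threshold until at least `target` researchers qualify.

    hcp_counts: Highly Cited Paper counts of disambiguated, verified researchers.
    target: the square root number for the field.
    raw_threshold: threshold derived from the raw (non-disambiguated) names.
    """
    threshold = raw_threshold
    while threshold > 1 and sum(c >= threshold for c in hcp_counts) < target:
        threshold -= 1
    return threshold

# Toy data only: these counts are invented, chosen so that a raw threshold of 21
# falls to 15, the same direction of change as the Chemistry example in the text.
print(adjusted_threshold([25, 22, 18, 17, 16, 15, 15, 12], target=6, raw_threshold=21))
```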

 

Exceptions

The methodology described above was applied to all ESI fields with the exception of Physics and Space Science (Astronomy and Astrophysics). A relatively large number of Highly Cited Papers in these fields, dealing with high-energy experiments and large-team space missions, respectively, carried hundreds of author names. Using the whole counting method produced a list of high-energy physicists only or those participating in large-team space missions only and excluded those working in other subfields. It was therefore decided to eliminate from consideration any paper with more than 30 institutional addresses in the Physics and Space Science categories. This removed the problem of overweighting to high-energy physics or large-team space missions.
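The address cap amounts to a filter applied before authors are counted. A minimal sketch, assuming each paper record carries hypothetical `field` and `n_addresses` keys:

```python
CAPPED_FIELDS = {"Physics", "Space Science"}
MAX_ADDRESSES = 30  # papers with more than 30 institutional addresses are dropped

def eligible_papers(papers):
    """Drop large-collaboration papers in the capped fields; keep everything else."""
    return [
        p for p in papers
        if not (p["field"] in CAPPED_FIELDS and p["n_addresses"] > MAX_ADDRESSES)
    ]
```

A paper with exactly 30 addresses is kept, and papers in other fields are unaffected regardless of how many addresses they list.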

 

Exclusions

Finally, we excluded retracted articles from our final analysis of Highly Cited Papers. Researchers found to have committed scientific misconduct in formal proceedings conducted by their institution, a government agency, a funder, or a publisher were excluded from our list of Highly Cited Researchers.

 

 
