Getting the Full Picture: Institutional unification in the Web of Science

Originally published October 20, 2016. Updated March 8, 2024

In the sprawling and complex world of research, attribution matters. Individual researchers, teams and institutions each deserve credit for their contributions, and this credit underpins career advancement, research funding and trust in the scholarly record.

However, with over 2.5 million research papers produced each year, keeping track of what happens where and who does it is no easy feat. The research community has developed a variety of identification systems to manage this. For example, popular tools like ORCID and Web of Science™ Researcher Profiles provide individuals with mechanisms for claiming their work over time as they change institutions or adopt new names. By disambiguating author name metadata on publications, such tools improve discoverability across systems and support responsible research evaluation.

For stakeholders who need to evaluate research at scale, working from a dataset where the names of research-producing organizations have been standardized is equally important.

The problem: various variants

It is common practice for authors to list their affiliated institution and mailing address on research papers. However, individuals working at the same institution may do this quite differently. Large universities, corporations and government institutions are typically home to various named laboratories, research facilities and outposts—some of which are spread around the globe. When individuals working within these smaller organizations report their findings in published papers, they often list the lab or center name as their affiliation—not the parent institution. Together with commonly used acronyms and language variations, organization name metadata begins to look like a vast tangle when considered at scale.

To illustrate the wide range of names that may be subsumed under a parent body, consider one government organization: Fisheries and Oceans Canada. For this organization, author affiliations listed in the original, published papers include “Canada Dept. of Fisheries & Oceans”, “Department of Fisheries and Oceans”, “Department of Fisheries and Oceans Canada (DFO)” and many more, including the “Bedford Institute,” which at face value indicates no relationship to the parent institution.

The solution: institutional unification

In partnership with customers and the research community, we mitigated this problem in the Web of Science Core Collection™ and related tools such as InCites Benchmarking & Analytics™ by implementing a thorough and consistent process for identifying, disambiguating and unifying organization name variants present in the literature. Since 2012, our database specialists have meticulously compiled organization name and address variants to create rules in our systems that accurately attribute publications to institutions. These rules guide the process by which address variants in Web of Science records are automatically treated in the database. Examples include:

  • “Wharton School” unified under the “University of Pennsylvania”
  • “Royal Free Hospital” unified under “UCL Medical School,” which in turn is unified under “University College London”
  • “Cerner Corporation Australia” unified under “Cerner Corporation,” which in turn is unified under “Oracle”

Embodied most visibly in the Web of Science Core Collection Affiliation search feature, this unification enables users to efficiently retrieve an accurate and complete record of an institution’s research output with one action, or to conduct a more focused search on research being done in specific centers within the parent organization. It also ensures that the publication- and citation-based metrics in our products reflect the contributions of all affiliated or component facilities.
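To make the idea concrete, here is a minimal Python sketch of how variant rules and a parent hierarchy could work together. It is an illustration only, not the actual Web of Science implementation: the table names, the function names, the sample records and the treatment of the "Bedford Institute" as a child of Fisheries and Oceans Canada are all assumptions made for this example, built from the names mentioned in this article.

```python
from typing import Dict, List


def normalize(raw: str) -> str:
    """Lower-case and collapse whitespace so lookups ignore trivial differences."""
    return " ".join(raw.casefold().split())


# Hypothetical variant rules: affiliation string as published -> preferred name.
_VARIANTS = {
    "Canada Dept. of Fisheries & Oceans": "Fisheries and Oceans Canada",
    "Department of Fisheries and Oceans": "Fisheries and Oceans Canada",
    "Department of Fisheries and Oceans Canada (DFO)": "Fisheries and Oceans Canada",
}
VARIANT_TO_PREFERRED: Dict[str, str] = {normalize(k): v for k, v in _VARIANTS.items()}

# Hypothetical hierarchy rules: organization -> its direct parent.
PARENT_OF: Dict[str, str] = {
    "Wharton School": "University of Pennsylvania",
    "Royal Free Hospital": "UCL Medical School",
    "UCL Medical School": "University College London",
    "Cerner Corporation Australia": "Cerner Corporation",
    "Cerner Corporation": "Oracle",
    "Bedford Institute": "Fisheries and Oceans Canada",  # assumed for illustration
}


def preferred_name(raw: str) -> str:
    """Map a published affiliation string to its preferred name, if a rule exists."""
    return VARIANT_TO_PREFERRED.get(normalize(raw), " ".join(raw.split()))


def lineage(org: str) -> List[str]:
    """Return the organization followed by each ancestor, up to the top-level parent."""
    chain = [org]
    while chain[-1] in PARENT_OF:
        parent = PARENT_OF[chain[-1]]
        if parent in chain:  # guard against a circular rule
            raise ValueError(f"circular hierarchy at {parent!r}")
        chain.append(parent)
    return chain


def search_affiliation(records: List[Dict[str, str]], org: str) -> List[str]:
    """Return titles of records whose affiliation is `org` or any unit beneath it."""
    return [
        r["title"]
        for r in records
        if org in lineage(preferred_name(r["affiliation"]))
    ]


# Made-up sample records, used only to exercise the sketch.
RECORDS = [
    {"title": "Paper A", "affiliation": "Royal Free Hospital"},
    {"title": "Paper B", "affiliation": "University College London"},
    {"title": "Paper C", "affiliation": "Canada Dept.  of Fisheries & Oceans"},
    {"title": "Paper D", "affiliation": "Bedford Institute"},
]
```

In this sketch, searching for "University College London" returns both Paper A and Paper B, while searching for "UCL Medical School" returns only Paper A. This mirrors the choice between retrieving a parent institution's full output and focusing on a specific center.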

Figure 1 – Examples of some suborganizations and their name variants for Fisheries and Oceans Canada

An ongoing process

As of March 2024, the extensive store of variant organization names and addresses recorded in our systems covers over 18,500 institutions worldwide and is reflected in over 76 million Web of Science records. These numbers continue to grow: unification is a continuous, year-round effort, and we actively explore new methods to improve it. In 2023, we expanded organization unification depth in the Web of Science Core Collection to include departments by algorithmically mapping department data indexed in the author affiliations to standardized and unified forms of the department name. This means that the dataset now includes existing organization hierarchies beyond what can be found in the published literature.

Organization unification is a challenge across academic publishing, with many solutions available to support different use cases. Just as ORCID provides open identifiers for researchers, the Research Organization Registry (ROR) provides open identifiers for organizations, and it is gaining traction. As a long-standing supporter of both ORCID and ROR, Clarivate recognizes the value and benefit of open identifier initiatives for the community. Based on customer feedback, we plan to add ROR identifiers in the Web of Science later in 2024 to further enrich our organization unification methods.

Sharing feedback

From the beginning, community feedback has been essential to our unification process. Our database specialists often work in partnership with organizations to identify new variants and validate rules. Any organization, regardless of whether it holds a subscription to our products, can request to view or change the profile details used in our unification, which include its preferred name, name variants, organization hierarchy and address variants.

To submit a request for information or changes for your organization, use our Data Corrections form and choose “Organizations-enhanced” as the Change type.