Introducing Citation Topics in InCites

Citation Topics are a new document-level classification schema for InCites Benchmarking & Analytics™ and were developed with the expertise of the Centre for Science and Technology Studies in Leiden and the Institute for Scientific Information (ISI)™. Read the post below to learn more about this update.

Research is a dynamic ecosystem. As authors advance their fields by publishing new papers that cite existing ones, research topics diversify, new topics debut and older topics fall into decline. Although citation networks such as the Web of Science Core Collection™ capture this activity, it can be difficult to assess at scale which concepts are emerging, growing or declining. Existing classification schemas, such as the established Web of Science™ subject categories, provide a stable, reliable and useful way to compare the output of nations and institutions, and to model change across decades. However, because they classify entire journals, books and conference proceedings, they mask the dynamism that occurs across and within categories at the document level.

To help you better understand the ever-evolving landscape of ideas and assess performance within it, we’re excited to introduce Citation Topics into InCites Benchmarking & Analytics.


What are Citation Topics?

Citation Topics represent groups of papers related to one another via citation. Constructing an article-level topic schema across almost 70 million documents poses a significant challenge. Although algorithms can meaningfully cluster documents based on their relationships to one another at scale, representing this data in a way that humans can interpret and use is another matter entirely.

Under the stewardship of the ISI, the InCites product team worked with the Centre for Science and Technology Studies (CWTS) in Leiden to develop and deploy a citation-based classification algorithm. As published papers cite one another, the strength of these citation relationships pulls related documents together into discrete clusters. These clusters form the core of our Citation Topics, which are independent of a document’s subject and contents, and instead represent domains where authors are actively citing each other’s papers.
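To make the idea of citation-based clustering concrete, here is a minimal sketch in Python. It is not the algorithm used for Citation Topics, which this post does not describe in detail. As a much simpler stand-in, it groups documents into connected components of a toy citation graph, so any two documents joined by a chain of citations land in the same cluster. The function name and the sample data are hypothetical.

```python
from collections import defaultdict


def cluster_by_citation(citations):
    """Group documents into connected components of the citation graph.

    `citations` is an iterable of (citing_doc, cited_doc) pairs.
    This is a simplified stand-in, not the production clustering algorithm.
    """
    parent = {}

    def find(doc):
        # Return the representative document of the cluster containing `doc`.
        parent.setdefault(doc, doc)
        while parent[doc] != doc:
            parent[doc] = parent[parent[doc]]  # path halving keeps lookups short
            doc = parent[doc]
        return doc

    for citing, cited in citations:
        root_a, root_b = find(citing), find(cited)
        if root_a != root_b:
            parent[root_b] = root_a  # a citation link merges the two clusters

    clusters = defaultdict(set)
    for doc in list(parent):
        clusters[find(doc)].add(doc)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy data: two separate groups of papers citing one another.
citations = [("A", "B"), ("B", "C"), ("D", "E"), ("F", "E")]
clusters = cluster_by_citation(citations)
print(clusters)
```

A real classification at the scale of tens of millions of documents would also weigh the strength of each citation relationship and split large components into smaller clusters, which this sketch does not attempt.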

Citation Topics are ‘live’ research: all newly published documents are added to existing topics, and a yearly update ensures that topics continue to accurately reflect changes in the underlying literature.


The benefits of a topic hierarchy

Part of our development process involved working out how to display the data so that it is most useful in your assessments. Small, tightly clustered topics offer granularity in any analysis, but the wider picture can be missed. Larger, broader topics present helpful summaries, but omit detail. Our approach with Citation Topics was to construct a three-level hierarchy of macro-, meso- and micro-topics, which enables you to choose the right level of granularity for the question you are asking.

With the help of CWTS and ISI, we generated a hierarchy featuring 10 broad macro-topics, 326 meso-topics and 2,444 micro-topics. InCites customers can drill down through the detail from the broad macro-topics to the narrow micro-topics, analyzing at each level.
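As a rough illustration of drilling down through such a hierarchy, the Python sketch below counts the same set of documents at each of the three levels. It assumes every document carries exactly one micro-topic and that each micro-topic has one meso-topic and one macro-topic as parents. All topic labels, document IDs and function names are invented for the example and do not come from InCites.

```python
from collections import Counter

# Hypothetical lookup: micro-topic -> (meso-topic, macro-topic). Labels are invented.
TOPIC_PARENTS = {
    "micro-a": ("meso-1", "macro-x"),
    "micro-b": ("meso-1", "macro-x"),
    "micro-c": ("meso-2", "macro-x"),
    "micro-d": ("meso-3", "macro-y"),
}


def count_by_level(doc_micro_topics, level):
    """Count documents per topic at the 'micro', 'meso' or 'macro' level."""
    if level not in ("micro", "meso", "macro"):
        raise ValueError(f"unknown level: {level!r}")
    counts = Counter()
    for micro in doc_micro_topics.values():
        meso, macro = TOPIC_PARENTS[micro]
        topic = {"micro": micro, "meso": meso, "macro": macro}[level]
        counts[topic] += 1
    return dict(counts)


# Hypothetical documents, each assigned to one micro-topic.
docs = {
    "doc1": "micro-a",
    "doc2": "micro-b",
    "doc3": "micro-c",
    "doc4": "micro-d",
    "doc5": "micro-a",
}

for level in ("macro", "meso", "micro"):
    print(level, count_by_level(docs, level))
```

The same five documents appear as two groups at the macro level, three at the meso level and four at the micro level, which is the trade-off between summary and detail described above.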



What’s in a name?

Citation Topics need meaningful, descriptive names. Citation clustering assesses only the relationships between documents and says nothing about their subject matter. With the help of the ISI, we labelled each of the macro- and meso-topics based on their content. The large number of micro-topics led us to a different approach: using an algorithmic tool to label each one based on its most significant keyword. This gave us representative naming across nearly 2,500 micro-topics.
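The post does not say how keyword significance is measured, so the following Python sketch is only one plausible reading. It assumes a TF-IDF-style score: a keyword ranks highly for a topic when it is frequent in that topic's documents and rare in other topics. The function name, the scoring formula and the sample keywords are all assumptions made for illustration.

```python
import math
from collections import Counter


def label_topics(topic_keywords):
    """Pick one label per topic using a TF-IDF-style keyword score.

    `topic_keywords` maps a topic ID to a list of per-document keyword lists.
    The scoring formula is an assumption, not the tool used for Citation Topics.
    """
    n_topics = len(topic_keywords)

    # Keyword frequency within each topic.
    term_freq = {
        topic: Counter(kw for doc in docs for kw in doc)
        for topic, docs in topic_keywords.items()
    }

    # Number of topics in which each keyword appears at least once.
    topic_freq = Counter(kw for counts in term_freq.values() for kw in counts)

    labels = {}
    for topic, counts in term_freq.items():
        def score(kw):
            # Smoothed inverse topic frequency, so shared keywords score lower.
            idf = math.log((1 + n_topics) / (1 + topic_freq[kw])) + 1
            return counts[kw] * idf

        # Break ties alphabetically so the result is deterministic.
        labels[topic] = max(sorted(counts), key=score)
    return labels


# Hypothetical author keywords for the documents in two micro-topics.
topic_keywords = {
    "t1": [["graphene", "nanotubes"], ["graphene", "batteries"], ["graphene"]],
    "t2": [["batteries", "lithium"], ["lithium", "electrolyte"], ["lithium", "batteries"]],
}

labels = label_topics(topic_keywords)
print(labels)
```

In this toy data, "batteries" appears in both topics and is therefore a weak label for either, while "graphene" and "lithium" are each concentrated in a single topic.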


Using Citation Topics

With this new three-level, document-based Citation Topic schema, InCites users can now perform more granular analyses of researcher, organizational, country/regional and funding agency output. Publication sources that were once categorized into a small number of subject categories can now be profiled at increased resolution.


Citation Topics have their own baselines and a full set of normalized indicators to help you accurately assess impact. To help you use them more easily, we’ve also introduced a new interactive visualization. We look forward to hearing your feedback, and working with the InCites user community to explore new ways of understanding the research landscape.

Try it for yourself. Go to InCites.