A glance at Brian Uzzi’s background – degrees in economics, psychology, and sociology – might not immediately suggest an affinity for big data. But the Northwestern University professor, particularly in his capacity as co-director of the Northwestern Institute on Complex Systems (NICO), routinely plumbs vast stores of numbers to answer questions about how scientists – and others – create, innovate, and collaborate.
For Uzzi, the revelation about the power of big data came during a summer internship at the Santa Fe Institute, a nonprofit New Mexico center where visiting scientists and scholars explore the dynamics and effects of complex systems. He was soon a convert.
“Big data allows us to answer new questions about universal patterns in social behavior and reevaluate outstanding questions in social science,” he says, while also acknowledging the challenges. “The approach required new research skills involving database management, machine learning, and leading scientific teams.”
From science to…Broadway?
Uzzi’s embrace of big data soon led to striking insights into the nature of collaboration and teamwork. In particular, through work with large data sets from Clarivate Analytics Web of Science, Uzzi and colleagues have enlarged the understanding of how certain team compositions – often blending conventional experience with an outside or novice perspective – are frequently more successful at producing high-impact work.
For example, in one study published in 2005 (R. Guimera, et al., Science, 308: 697-702, 2005), Uzzi and his coauthors examined how teams typically assemble in two seemingly disparate spheres of activity: scientific research, as embodied in four fields represented by scholarly publications tracked in the Web of Science component Journal Citation Reports; and the world of Broadway musicals. Whatever their apparent differences, both enterprises necessitate the combination of individuals possessing a range of specialized skills. Uzzi and colleagues noted a similarity of dynamics in these worlds of art and science: Teams tend to self-assemble to an ideal size; success often hinges on a level of diversity that includes both experienced “incumbents” and newcomers; and incumbents form networks whose members tend to collaborate on future projects.
In a 2007 report, Uzzi and colleagues examined another aspect of collaboration: The increasing prevalence of teamwork in the production of knowledge, as opposed to the labors of solitary researchers (S. Wuchty, B.F. Jones, B. Uzzi, Science, 316: 1036-9, 2007). The team examined nearly 20 million research papers indexed in the Web of Science (and published as far back as the mid-1950s – or, in the case of the arts and humanities, the mid-1970s), along with more than 2 million patent records.
Uzzi and his colleagues noted a “substantial shift toward collective research.” Although this development was explainable in large part by the increasing scale, complexity, and expense of big science, the trend also held true in the social sciences. Patenting, too, witnessed a rise in team production.
Consulting citation figures in the Web of Science, the researchers made another striking observation: In a reversal of the trend of a half-century ago, when a single author was more likely to write a highly cited report, by 2007 a team-authored paper was 6.3 times more likely than a single-author paper to garner more than 1,000 citations.
Subsequently, Uzzi and his coauthors returned to the Web of Science to examine 4.2 million papers indexed between 1975 and 2005 (B.F. Jones, S. Wuchty, B. Uzzi, Science, 322: 1259-62, 2008). Their investigation demonstrated another trend in research teamwork: a rise in collaboration between authors located at different universities. The increase, noted the authors, was largely consistent across time and appeared to pre-date the rise of the internet and other improvements in communication technology. A further observation was that elite universities were central to this shift, and that collaborations that included a top-tier university were more likely to produce high-impact work. “Thus,” wrote the authors, “although geographic distance is of decreasing importance, social distance is of increasing importance in research collaborations. Elite universities are more intensely interdependent, playing a higher-impact and increasingly visible role in [science and engineering and the social sciences].”
Mixing convention and novelty
For a 2013 report, sharpening their observations on the nature of scientific creativity, Uzzi and colleagues examined a sample of nearly 18 million Web of Science-indexed papers (B. Uzzi et al., Science, 342: 468-72, 2013). Specifically, the team scrutinized the prior work cited by these papers, using sophisticated analysis to distinguish which citations could be counted as “conventional” – that is, within an expected grouping of subject-related journals – and which citations represented novel pairings incorporating different subject matter.
In tabulating citations to the original sample of papers, Uzzi and his coauthors noted that “the highest-impact science draws on primarily highly conventional combinations of prior work, with an intrusion of combinations unlikely to have been joined together before.” As the authors conclude, such papers “have nearly twice the propensity to be unusually highly cited.”
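To make the distinction between “conventional” and “novel” pairings concrete, here is a minimal sketch in Python. The article does not describe the team’s actual analysis, so this swaps in a much simpler stand-in: for each pair of journals cited together in the same paper, it compares how often the pair actually co-occurs with how often it would be expected to co-occur if journals were cited independently. The function name `pair_scores`, the cutoff of 1.0, and the tiny sample corpus are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_scores(papers):
    """Score each co-cited journal pair as observed / expected co-citations.

    `papers` is a list of reference lists, one per paper, where each
    reference is given by its journal name (hypothetical input format).
    A score above 1 means the pair appears together more often than
    independence would predict ("conventional" in this sketch); a score
    below 1 means it appears together less often ("novel").
    """
    n_papers = len(papers)
    journal_counts = Counter()
    pair_counts = Counter()
    for cited in papers:
        journals = sorted(set(cited))  # count each journal once per paper
        journal_counts.update(journals)
        pair_counts.update(combinations(journals, 2))

    scores = {}
    for (a, b), observed in pair_counts.items():
        # Expected co-citations if a and b were cited independently.
        expected = journal_counts[a] * journal_counts[b] / n_papers
        scores[(a, b)] = observed / expected
    return scores


# Illustrative corpus only; not real citation data.
sample = [
    ["Cell", "Nature"],
    ["Cell", "Nature"],
    ["Cell", "Nature", "Poetics"],
    ["Poetics", "Social Forces"],
]
for pair, score in sorted(pair_scores(sample).items()):
    label = "conventional" if score > 1.0 else "novel"
    print(pair, round(score, 2), label)
```

In this toy corpus the frequently paired journals score above 1, while the one-off pairing of a biology journal with a sociology journal scores below it. With so few papers the scores are noisy, so a real analysis would need a far larger sample and a more careful baseline than simple independence.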
Finding the “hotspot”
In a recent paper, Uzzi and colleagues examined two particularly large datasets: some 28 million papers indexed in the Web of Science between 1945 and 2013, and more than 5 million US patent records filed between 1950 and 2010. Analysis centered on the prior documents cited by these works – specifically, the distribution of differences between a work’s publication year and the publication years of its cited references (S. Mukherjee, et al., Science Advances, 13: e1601315, 19 April 2017).
Ultimately, the authors identified an age distribution in prior literature that was unmistakably associated with higher impact. They dubbed this the “hotspot” – a convergence of cited references having a low mean age and a high age variance relative to the work’s publication year. Works within this hotspot, as the authors conclude, more than double their probability of being in the top 5% most-cited in their respective areas. As they note: “Works outside of the hotspot – work centered on recent papers, old papers, or a broad sample of new and old works – do no better than expected by chance.”
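The two quantities behind the hotspot, the mean age and the age variance of a work’s cited references, are simple to compute. The Python sketch below is illustrative only: the function names and the threshold parameters `max_mean_age` and `min_age_variance` are hypothetical, since the article does not report the cutoffs the authors used, and the reference years are made up.

```python
from statistics import mean, pvariance


def reference_age_profile(pub_year, cited_years):
    """Return (mean age, age variance) of a work's cited references.

    Age is the work's publication year minus the cited reference's
    publication year.
    """
    if not cited_years:
        raise ValueError("a work needs at least one cited reference")
    ages = [pub_year - year for year in cited_years]
    return mean(ages), pvariance(ages)


def in_hotspot(pub_year, cited_years, max_mean_age, min_age_variance):
    """Check for a low mean age combined with a high age variance.

    Both thresholds are assumptions supplied by the caller; they are
    not values taken from the study.
    """
    mean_age, age_variance = reference_age_profile(pub_year, cited_years)
    return mean_age <= max_mean_age and age_variance >= min_age_variance


# Mostly recent references plus one much older source.
mixed = reference_age_profile(2010, [2009, 2009, 2008, 1990])
# Recent references only.
recent = reference_age_profile(2010, [2009, 2008, 2007])
print(mixed, recent)
```

The first reference list mixes recent sources with one older one, which raises the age variance sharply while keeping the mean age fairly low. The second list, drawing on recent work alone, has a low mean age but almost no variance, so by the description above it would fall outside the hotspot.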
Equally striking is the authors’ observation that, based on their ongoing work, this hotspot – the phenomenon of drawing upon an ideal mix of prior knowledge in the production of high-impact work – seems to apply to fields beyond science and technology. The team’s analysis of the most influential legal decisions in the United States and India, for example, suggests that a similar mechanism is at work in the area of law.
Moreover, the ability to hit the hotspot is another reflection of collaboration and its advantages, as multi-authored papers are far more likely to incorporate the ideal mix of prior sources than are those written by single authors.
The matter of why teams are associated with higher-impact work, as the researchers note, is “still an open question and may be related to several explanations that still need to be tested.” These possible factors include a division of labor, collective intelligence, the benefits of specialization, and positive competition among team members.
As Uzzi and his coauthors write: “Amidst these new questions and directions for future work, our findings reveal that the age of information is a remarkably powerful and heretofore unknown predictor of high-impact work in science and technology.”
Meanwhile, other questions have presented themselves. One of Uzzi’s current interests is in the relationship between scientific mentors and their students with regard to scientific performance.
With its store of indexed literature extending back more than a century, and its citation data illuminating the interrelationships between collaboration and influence, the Web of Science will continue to be a resource for exploring these questions.