Seeing double: Eliminating duplicate references in drug safety literature screening

Drug safety literature screening for patient safety issues is not an easy process. As we saw in our post on keeping literature monitoring searches up to date, it can be hugely time-consuming and any errors can be punished severely by regulators.

Add the challenge of duplicate references into the mix and the task becomes harder still. For pharmaceutical organizations that specialize in specific therapeutic areas, search strategies for different drugs are highly likely to produce duplicate references. Every time a duplicate is reviewed, the cost of reviewing that reference doubles: for many organizations, dealing with duplicates can add 30% to the cost of literature review. At Dialog Solutions, we’ve worked with large pharmas whose references are one-third duplicates.


How duplicates happen

This example illustrates how easy it is for duplicate references to be retrieved: in this case, the way in which the author and journal names are referenced is all that’s different. A human reviewer might have picked this up, but some database search systems would miss it.


Adding time and cost to the medical review process is just the first of an escalating series of issues caused by duplicate references. If a duplicate slips through the net and is sent on to case processing so that a potential patient safety issue can be investigated, that adds a further, higher and entirely unnecessary cost. Finally, if a duplicate reference is submitted to the regulator, it could prompt an inspection finding, making a CAPA (corrective and preventive action) necessary.

Another common source of duplicate references is clinical trial reporting, where the same findings are published in both conference proceedings and journal articles. Inevitably, there will be multiple references (we have seen up to 18 in one instance) discussing precisely the same patient safety issue. This causes a particular problem at the case processing stage.

Other sources of duplicates include articles published in multiple databases, publication status changes, and duplicates resulting from data migration.


How Drug Safety Triager deals with duplicates in drug safety literature screening

In an ideal process, pharmacovigilance teams would simply note duplicate references while making sure that only one relevant article is reviewed and processed.

Dialog’s Drug Safety Triager has a simple and elegant way of dealing with duplicates when the same article is retrieved for multiple drugs. Rather than importing duplicate references for different drugs, it imports a reference once and tags it with the drugs to which it is relevant. This way, no duplicate reference is created, and the article can be handled as a single reference covering multiple products. It’s worth bearing in mind that 95% of references that are imported or reviewed by drug safety departments are irrelevant for safety: with this approach, a reference can be flagged (or not) for a safety issue with a single click for multiple drugs.
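To make the "import once, tag many" idea concrete, here is a minimal Python sketch. It is not Drug Safety Triager's actual data model: the Reference and ReferenceStore classes, the ref-001 identifier, the article title and the drug names are all hypothetical, chosen only to illustrate the principle.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Reference:
    """One literature reference, stored once however many drugs it matches."""
    ref_id: str
    title: str
    drugs: Set[str] = field(default_factory=set)
    safety_relevant: Optional[bool] = None  # None means "not yet reviewed"


class ReferenceStore:
    """Hypothetical store that tags an existing reference instead of re-importing it."""

    def __init__(self) -> None:
        self._by_id: Dict[str, Reference] = {}

    def import_hit(self, ref_id: str, title: str, drug: str) -> Reference:
        # Create the reference only the first time it is seen.
        ref = self._by_id.get(ref_id)
        if ref is None:
            ref = Reference(ref_id=ref_id, title=title)
            self._by_id[ref_id] = ref
        # Every later hit just adds another drug tag.
        ref.drugs.add(drug)
        return ref

    def review(self, ref_id: str, safety_relevant: bool) -> None:
        # A single decision applies to every drug tagged on the reference.
        self._by_id[ref_id].safety_relevant = safety_relevant

    def get(self, ref_id: str) -> Reference:
        return self._by_id[ref_id]

    def __len__(self) -> int:
        return len(self._by_id)


# Two search strategies (hypothetical drugs) retrieve the same article.
store = ReferenceStore()
store.import_hit("ref-001", "Hepatotoxicity in combination therapy", "drug A")
store.import_hit("ref-001", "Hepatotoxicity in combination therapy", "drug B")
store.review("ref-001", safety_relevant=False)
```

After both imports the store holds one reference carrying two drug tags, so the reviewer makes one decision rather than two.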

Robust algorithms can do the heavy lifting of automatic deduplication. Dialog Solutions provides auto-deduplication and Drug Safety Triager provides an additional manual deduplication option. Depending on the content provider, data can be “dirty” and not normalized, so an additional deduplication algorithm can be applied within the literature monitoring software. This way, potential duplicates are brought to the attention of the reviewer, who then decides whether or not the reference is a duplicate and needs to be assessed.
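As a rough illustration of how "dirty" records might be normalized and near-duplicates surfaced for a reviewer, the sketch below compares normalized titles using Python's standard difflib module. This is an assumed approach, not the algorithm Dialog uses: the record layout, the sample titles and the 0.9 similarity threshold are all hypothetical.

```python
import re
import unicodedata
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Strip accents, lowercase, drop punctuation and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", ascii_only.lower())
    return " ".join(cleaned.split())


def candidate_duplicates(records, threshold=0.9):
    """Return (id_a, id_b, score) for pairs whose normalized titles nearly match.

    The pairs are only candidates: a human reviewer makes the final call.
    """
    normalized = [(rec["id"], normalize(rec["title"])) for rec in records]
    pairs = []
    for (id_a, title_a), (id_b, title_b) in combinations(normalized, 2):
        score = SequenceMatcher(None, title_a, title_b).ratio()
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 2)))
    return pairs


# Hypothetical records: r1 and r2 differ only in punctuation.
records = [
    {"id": "r1", "title": "Drug-induced liver injury: a case report."},
    {"id": "r2", "title": "Drug induced liver injury - a case report"},
    {"id": "r3", "title": "Renal outcomes in a phase III trial"},
]
flagged = candidate_duplicates(records)
```

Comparing every pair is quadratic in the number of records, which is fine for a sketch but would need blocking or indexing at production volumes.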

Other deduplication mechanisms match on the digital object identifier (DOI) and other publication elements, deduplicate across multiple databases, or take advantage of the alert memory.
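DOI matching is the easiest of these to sketch. The Python example below assumes each record is a dictionary with hypothetical "id" and "doi" keys, and the DOIs shown are made up. It relies on the fact that DOIs are case-insensitive and are often stored with a "doi:" or "https://doi.org/" prefix. It is an illustration only, not a description of how any Dialog product works.

```python
import re

# Common ways a DOI is prefixed in exported records.
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi(raw):
    """Return a bare, lowercase DOI, or None if the field is empty."""
    if not raw:
        return None
    doi = _DOI_PREFIX.sub("", raw.strip()).lower()
    return doi or None


def dedupe_by_doi(records):
    """Split records into unique ones and (duplicate_id, original_id) pairs.

    Records without a DOI are kept, since they cannot be matched this way
    and need a different check.
    """
    seen = {}
    unique, duplicates = [], []
    for rec in records:
        doi = normalize_doi(rec.get("doi"))
        if doi is None:
            unique.append(rec)
        elif doi in seen:
            duplicates.append((rec["id"], seen[doi]))
        else:
            seen[doi] = rec["id"]
            unique.append(rec)
    return unique, duplicates


# Hypothetical hits from two databases; the first two share a DOI.
records = [
    {"id": "db1-17", "doi": "10.1234/EXAMPLE.001"},
    {"id": "db2-42", "doi": "https://doi.org/10.1234/example.001"},
    {"id": "db2-43", "doi": None},
]
unique, duplicates = dedupe_by_doi(records)
```

Here the second record is recognized as a duplicate of the first despite the different casing and URL prefix, while the record with no DOI passes through for other checks.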