Finding patented biological sequences for patents without a Sequence Listing

In recent years it has become more difficult to get a comprehensive view of the biosequence patent landscape. According to Clarivate research, roughly 18% of patent filings in this field do not include a Sequence Listing. To help mitigate the consequences of this information gap, we have developed a solution.

Patent filings with biological sequences, i.e. nucleotide or amino acid sequences, often have a Sequence Listing File prepared by the patentees for national patent offices. This Sequence Listing File presents the biological sequence data in a standardized format.

However, the Sequence Listing is not required, and many patent filings for biological sequences do not include this additional documentation. As a result, a significant portion of biosequence patents becomes difficult to retrieve using conventional patent search tools.

To estimate how many biosequence patents are filed without a Sequence Listing, Clarivate™ analysts reviewed more than 34,500 patent applications filed with the United States Patent and Trademark Office (USPTO), the China National Intellectual Property Administration (CNIPA) and the World Intellectual Property Organization (WIPO). Based on this review, an estimated 18% of biosequence patent applications submitted to these offices in 2021 did not include a Sequence Listing.

Figure 1: Estimated percentage of 2021 patent filings by jurisdiction where the Sequence Listing was not provided by the patentee

Source: GENESEQ™

While the percentage of patent applications filed without a Sequence Listing has been decreasing over time for the three jurisdictions studied, the proportion is still significant, particularly for patents filed with the CNIPA. The biggest decrease in filings without a Sequence Listing was observed for World Intellectual Property Organization (WIPO) applications, down from 29% in 2017 and 28% in 2019.

To help researchers bridge this information gap, Clarivate has developed the GENESEQ database. Using a proprietary process that incorporates machine learning and human review, our team identifies and indexes all sequences found in biosequence patent filings, regardless of whether a Sequence Listing is available.

By indexing both the sequences presented in the Sequence Listing and the sequences disclosed elsewhere in patent filings, GENESEQ provides a more comprehensive view of the biological sequence patent landscape. This allows researchers to assess the novelty of a new sequence more accurately and to capture a complete picture of the patent landscape surrounding specific sequences.

To find out more about the GENESEQ database and related IP solutions that can help you make data-driven decisions with speed and confidence, contact us.