Artificial intelligence: Hype vs. reality and the impact on the patent industry


The generally accepted definition of artificial intelligence (AI) is the demonstration of intelligence by machines. In computer science, AI research is defined as the study of “intelligent agents”: any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals. More commonly, the term is applied when a machine mimics cognitive human functions such as learning and problem-solving.

In 1950, the English mathematician Alan Turing proposed a test of a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human being. It is called, naturally enough, the Turing test.

He proposed that a human evaluator would judge natural language conversations between a human and a machine. The evaluator would be aware that one of the two partners in conversation is a machine, and all participants would be separated from one another. If the evaluator is unable to distinguish between the responses from the machine and the human, the machine is said to exhibit AI.

Research in mathematics, cognitive science and related fields has continued for decades, but AI really took off in the late 1990s, when advances in computing technology allowed it to make great strides and produce tangible applications in various areas.

AI has found application in a number of general areas.

  • Speech recognition, as in the Siri application used in smartphones.
  • Content creation: an interesting example is a company called Wibbitz, which uses artificial intelligence to create video content completely automatically from news stories.
  • Military simulations, where training can be performed in lifelike situations without exposing troops to unnecessary risk.
  • Autonomous vehicles, which use machine intelligence to sense their environment and make intelligent decisions based on that information.
  • Game playing: AlphaGo is a machine developed to play a game called Go, which has simple rules but requires hugely complex strategy in order to win. AlphaGo developed to the stage where it beat the current world champion. Its successor, AlphaGo Zero, then crushed the earlier AlphaGo machine 100 games to zero, and the more general AlphaZero has since reached superhuman play in Go, chess and shogi, given nothing but the rules of each game.

According to Elon Musk:


“AI is capable of vastly more than almost anyone knows, and the rate of improvement is exponential.”


While this may be true, there is a lot of hype surrounding the exact capabilities and value of AI.

But what of AI in patents? How is AI actually affecting the patent industry?


IP Issues: Today and Future

The issues facing the patent industry are essentially the same today as they will be in the future, at least for the next five years or so. There are three basic challenges that we face: the sheer volume of information; the language of that information, much of which is in languages other than English and a substantial amount of which is in non-Latin character sets; and the increasing complexity and obscurity of that information.

The volume of information is vast and growing. Last month saw the publication of US patent number 10,000,000.  That patent joins over 100 million other patent documents, 70 million plus journal articles and over 4 billion indexed web pages.

Secondly, for some time we have operated in a world where the majority of new inventions don’t have any detail in English. Of the 5.6 million patent documents published globally in 2017, over 62% were in Chinese, Japanese or Korean, often with no English language equivalent.

Thirdly, innovation is increasingly taking place at the intersection of different technologies and understanding that innovation requires more and more highly complex and specialist knowledge.  The information in patents is complicated. Its value is hidden behind a barrier of required or expected knowledge.

These issues taken together challenge our use of patents.  There is simply way too much data for a human to read, analyze and understand.  The skills necessary to find, read and digest all information of relevance probably do not reside in a single person anymore. And that’s where AI has the potential to help.


How AI can help

AI can help with search and retrieval of information.  Natural language processing and semantic search have progressed to the point where they are now useful tools. AI can also help us in managing the challenges from the beginning of the process with the loading and checking of raw data; it can help in classifying the data and with the analysis and interpretation of data and presenting it in a way that helps us make intelligent decisions.

At a recent webinar hosted by Clarivate Analytics, Tom Fleischman, Master Inventor at IBM, explained how the IBM T&IP (Technology and Intellectual Property) Organization is using the Watson engine technology to ingest, digest, understand and analyze patent data to provide insights in the work that they perform.

Watson was originally developed as a computer system, trained on a large amount of data, designed to play against human contestants on the TV show “Jeopardy!” It made the decisions about what categories to pick, what dollar values to pick, whether or not to buzz in and so on. It was fed the question, interpreted it using natural language processing, and then used its artificial intelligence to determine the right answer and buzz in. It actually won over the course of a two-day period.

The current version of Watson is a collection of APIs and algorithms that grew out of the Watson engine technology and enables users to rapidly develop AI solutions. The IBM T&IP team has developed an AI system known as IP Advisor with Watson. It consumes patent and technology data on behalf of its users using natural language processing and understanding, then identifies and provides insights and connections using all the available data and relevant algorithms. The team is applying it to multiple use cases common in IP monetization:

  • Evidence of Use
  • Prior Art
  • Landscaping/Portfolio Analysis
  • Maintenance decisions
  • Product coverage

Tom Fleischman comments:

“I believe we must begin to train AI machines to ingest, digest, understand and analyze the tremendous amount of data and to provide insights. This is not necessarily to give us the answer, but to provide insights that help reach towards an answer.  The insights provided should be used as a guide – in this sense, at IBM we call AI ‘Augmented Intelligence’ rather than Artificial Intelligence.  It shouldn’t be used to replace human thinking – it’s meant to augment human thinking.  Think of it as a co-worker.”


At Clarivate also, we are using AI to help manage the enormous challenges of producing and analyzing patent information sources.  In fact, we’ve been doing this for some time now. The production of the Derwent World Patents Index database necessitated development of algorithmic approaches some 30 years ago.

Examples of this include algorithmic identification of duplicate patents, what we call the “basic-equivalent search”, which scours a set of data points to look for matches. Where it finds a one-to-one match, it updates the existing DWPI family with details of the equivalent patent publication. Where it finds less than a one-to-one match, it passes the information to our editorial teams for a review of the claims for similarity. This algorithm is actually at the heart of the Derwent method: the creation of a global ideas and invention database rather than a “patent” database.
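
The routing logic described above can be sketched roughly as follows. This is a minimal illustration, not the Derwent implementation: the compared fields in MATCH_FIELDS, the Family structure, the exact-equality comparison and the "new_family" outcome for publications that match nothing are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical data points; the article does not say which fields are compared.
MATCH_FIELDS = ("priority_number", "priority_date", "applicant")


@dataclass
class Family:
    family_id: str
    data_points: dict
    members: list = field(default_factory=list)


def match_score(publication: dict, family: Family) -> float:
    """Fraction of the compared data points that are present and agree exactly."""
    hits = 0
    for name in MATCH_FIELDS:
        value = publication.get(name)
        if value is not None and value == family.data_points.get(name):
            hits += 1
    return hits / len(MATCH_FIELDS)


def triage(publication: dict, families: list) -> tuple:
    """Return (action, family_id) for a newly loaded publication."""
    best = max(families, key=lambda fam: match_score(publication, fam), default=None)
    score = match_score(publication, best) if best is not None else 0.0
    if best is not None and score == 1.0:
        # One-to-one match: record the equivalent in the existing family.
        best.members.append(publication["publication_number"])
        return ("added_to_family", best.family_id)
    if best is not None and score > 0.0:
        # Less than a one-to-one match: hand over to human editors.
        return ("editorial_review", best.family_id)
    # Assumption: nothing matched at all, so treat it as a new invention.
    return ("new_family", None)
```

The point of the sketch is the division of labor: the algorithm handles the unambiguous cases automatically and reserves human attention for the partial matches.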

The Derwent titles and abstracts are intellectually written by a team of scientists and engineers, but that production environment rests on a well-honed system of automated routing and language-specific automated patent translation systems, meaning we can apply our technical experts’ knowledge as efficiently as possible.

But all of that is just on the production side – what about for the user or analyst?

We continue to invest and develop AI solutions to help in the retrieval and analysis of information. There are three specific examples of this:


Smart Search

Semantic searching is by no means new. However, most semantic engines are built on generic text or generic technology language. One of the benefits of having spent 50 years making a database by hand is that we have a gold standard resource for training algorithms. And this is what Smart Search rests on: the fully English language nature of Derwent and its manual classification methodology, to which semantic techniques are then applied. The effect of this is to allow the researcher to benefit from the expertise of the DWPI editorial team to find highly relevant answers very quickly.

Optimized Assignee and Ultimate Parent

Establishing true ownership of a patent can be a significant challenge.  Name variation, misspellings, missing assignee names, re-assignment and corporate structure are all factors that must be considered and can cause problems. Using automated processes and AI machine learning, we have developed two data fields within the Derwent Innovation patent records that help address the challenges: Optimized Assignee and Ultimate Parent.

Optimized Assignee provides a single preferred entity name which is not only a normalized company name, but also predicts missing assignees, and takes corporate structure into account.  It includes the probable Assignee (where no organization is listed on the application) and considers the latest reassignment, company hierarchy, and name clean-up/normalization.

Ultimate Parent provides the entity which has ultimate and current responsibility for the patent and who has the ability to exploit it. The Ultimate Parent is calculated as the top company in the hierarchy above the Optimized Assignee after accounting for reassignment and so on.
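
As a rough sketch of how the two fields relate, the example below normalizes a name and then walks up a corporate hierarchy. It is illustrative only and not the Derwent Innovation method: the suffix list, the PARENT_OF table and every function name are hypothetical, and the machine-learning prediction of a missing assignee is stood in for by a plain `predicted` argument.

```python
import re

# Hypothetical suffix list and hierarchy table, for illustration only.
LEGAL_SUFFIXES = re.compile(r"\b(inc|corp|corporation|co|ltd|llc|gmbh)\b\.?", re.IGNORECASE)
PARENT_OF = {"acme robotics": "acme holdings", "acme holdings": "acme group"}


def normalize_name(raw: str) -> str:
    """Strip legal suffixes and punctuation, lowercase, collapse whitespace."""
    name = LEGAL_SUFFIXES.sub("", raw)
    name = re.sub(r"[^\w\s]", " ", name)
    return " ".join(name.lower().split())


def optimized_assignee(recorded, latest_reassignee=None, predicted=None):
    """Prefer the latest reassignment, then the recorded name, then a prediction."""
    chosen = latest_reassignee or recorded or predicted
    return normalize_name(chosen) if chosen else None


def ultimate_parent(assignee: str, parent_of: dict = PARENT_OF) -> str:
    """Follow parent links from the assignee to the top of the hierarchy."""
    seen = {assignee}
    current = assignee
    while current in parent_of:
        current = parent_of[current]
        if current in seen:
            raise ValueError("cycle in corporate hierarchy")
        seen.add(current)
    return current
```

For example, `ultimate_parent(optimized_assignee("Acme Robotics, Inc."))` resolves to `"acme group"` under the hypothetical table above, while an entity with no recorded parent resolves to itself.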


ThemeScape

ThemeScape is a text mining algorithm with a graphical front end that visualizes the commonality of language as physical proximity on a map. It sits on honed, fundamental text mining algorithms, but its real power comes when hand-curated content is layered on top of automated analysis. We can apply the algorithm to the “Use” field of a set of Derwent records. The algorithm does not know the difference between industrial uses, technical terms and so on, but our editorial teams do. They split out the language and give it context. The algorithm then looks at that context and provides the productivity augmentation.

So much for today.  But what of the near future?

There is opportunity for AI to assist much more in the analysis and interpretation of information to provide insight in reaching better decisions.  That can be achieved by the inclusion of more contextual data to continue the distillation process, the very essence of good analysis, the creation of brevity from complexity. Using AI to help with that is not just obvious but required.

Going even further, beyond purely analytical techniques, AI is also likely to augment many of the processes in which patent information is used day-to-day. How can we augment the patentability search, the validity search or evidence of use analysis?

Patent offices themselves are looking to embrace AI as part of the future for their patent, trademark and design application workflow solutions. The Japan Patent Office has announced publicly that it is investing in the use of artificial intelligence technology to automate processes such as screening patent, trademark and design applications[1].

And just last month, at a meeting in New Orleans of the IP5 patent offices (China, Europe, Japan, Korea, USA), the impact of AI on the patent system was identified as “one of the main IP5 strategic priorities to be the subject of common reflection”[2].

Whatever the future holds for the patent system, it’s a sure bet that AI will be a significant part of it.

This article is based on a recent webinar “Artificial Intelligence: Hype vs. Reality and the Impact on the Patent Industry” given by panelists Tom Fleischman, Ed White and Bob Stembridge. A recording is available here.