Putting more into our data, so you get more out of it

By Robert Reading,
Director, Professional Services & Strategy for CompuMark

Ask any celebrated chef the key to a great dish and the response is likely to be this: start with the best ingredients. So it is with trademark research.

Without reliable source data, it is impossible to deliver research results that trademark professionals can count on to make critical decisions. That’s why CompuMark invests so much time and so many resources on building a content library that is accurate, complete and reliable.

Expanding data universe

CompuMark’s content library is vast, rich and constantly expanding. It spans the globe, covering more than 200 jurisdictions and more than 90 years of records, and it contains information on brands of all stripes, from household names to small, local businesses. Maintaining this library is the task of CompuMark’s dedicated quality team, supported by a worldwide network of sources with expertise in their local languages, cultures and trademark nuances.

The challenge of managing global trademark data has grown dramatically in both scope and complexity in recent years. The rise of online commerce, mobile communications and social media has created an explosion of digital content. This digital universe is projected to reach 44 zettabytes by 2020*.

We’ve seen this growth reflected in our own proprietary trademark library. In 2018, we added nearly 11 million new trademark records to our database and updated or enhanced more than 40 million. In the years ahead, our database will expand from millions to billions. But we’re ready for the challenge.

Investing in quality content

CompuMark recognized years ago the implications of this digital content explosion on brands. We have steadily invested in people, tools and processes to stay ahead of the curve.

Prime among these challenges is helping to ensure trademark record accuracy and consistency across global jurisdictions. Trademark data varies in both quality and availability from one country to another. In some jurisdictions, trademark records are not available in electronic form. There are significant differences in the way data is presented, how much detail is provided with each record, and the time frame for the trademark examination and registration process. These issues are layered on top of the obvious differences in languages, alphabets and character sets.

In addition, errors in trademark records are not uncommon. In fact, CompuMark has found that, on average, 7% of U.S. Patent and Trademark Office (USPTO) records and 4% of European Union Intellectual Property Office (EUIPO) records contain errors that require correction.

Catching and correcting errors

To make trademark content useful for our customers, CompuMark performs a range of data quality checks and enhancements to help ensure all records are accurate, complete and consistent.

CompuMark quality experts review official records prior to entering them in our proprietary database. They look for errors and inconsistencies, including misspellings, missing or partial field entries, date errors, and classification discrepancies. When these are identified, our quality team works with our data partners and directly with trademark offices to correct these issues.

Consider the following examples of trademark records corrected by CompuMark’s quality team:

Record received: “LOC FROST, Dessin Et Slogan”

Corrected record: “LOCFROST for an undisturbed sleep”

Record received: “Watermark Design”

Corrected record: “Watermark Building Solutions LTD.”

Record received: “M”

Corrected record: “The mark consists of the letter ‘M’ adjacent to two perpendicular lines.”

Errors like these can have serious implications for a trademark search or watch. For example, if a relevant mark is misspelled, includes a typographical error, or is misclassified, it could be overlooked when a query is run. By identifying and correcting errors before the records enter our database, we help ensure relevant results are not missed.
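To make the idea concrete, here is a minimal sketch of how automated checks for missing fields, date errors and classification discrepancies might look. It is an illustration only, not CompuMark’s actual process: the record layout, the field names and the function `find_record_issues` are all hypothetical. The one fixed fact it relies on is that the Nice Classification has 45 classes.

```python
from datetime import date

# Hypothetical record layout: these field names are illustrative assumptions,
# not an actual trademark office or CompuMark schema.
REQUIRED_FIELDS = ("mark_text", "owner", "filing_date", "nice_classes")


def find_record_issues(record, today=None):
    """Return a list of human-readable issues found in one trademark record."""
    today = today or date.today()
    issues = []

    # Missing or empty field entries.
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            issues.append(f"missing field: {field}")

    # Date errors: a filing date in the future, or registration before filing.
    filing = record.get("filing_date")
    registration = record.get("registration_date")
    if filing and filing > today:
        issues.append("filing date is in the future")
    if filing and registration and registration < filing:
        issues.append("registration date precedes filing date")

    # Classification discrepancies: Nice classes run from 1 to 45.
    for cls in record.get("nice_classes") or []:
        if not 1 <= cls <= 45:
            issues.append(f"invalid Nice class: {cls}")

    return issues
```

Checks like these can only flag suspicious records. Deciding what the correct entry should be, as in the examples above, still takes an expert reviewer.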

Enhancements that improve results

To improve the precision and completeness of search and watch results, our data quality team also adds enhancements to records in our proprietary database. Leveraging our deep trademark knowledge and experience, our quality experts add cross-referencing and indexing to further reduce the risk of missing relevant marks. Examples include:

Spelling: COLOR / COLOUR, REALIZE / REALISE
Variants: QUICK / KWIK, TRUE / TRU, WONDERFUL / ONDERFUL, LOVELY / LUVLI
Number or word substitutions: SK8 / SK8TER, FOR / 4, TO / 2
Soundalikes or abbreviations: BARBECUE / BARBEQUE / BBQ
Texting abbreviations: YOU / U, YOU ARE / UR, LOL, LMK
Spacing variations: CALLEDIT / CALLED IT / CALL EDIT

We add these enhancements to approximately 30% of records we onboard. That means research providers who do not perform this cross-referencing could be missing relevant results. Can you afford to miss those?
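One simple way to picture this kind of cross-referencing is an index key that maps every variant of a word to a single canonical form, so that a search for one spelling also finds the others. The sketch below is a toy illustration under that assumption, not a description of CompuMark’s indexing. The `EQUIVALENTS` table and the `index_key` function are hypothetical, and the table holds only a handful of the examples listed above.

```python
import re

# Illustrative equivalence groups taken from the examples above.
# A real cross-reference index would be far larger and curated by experts.
EQUIVALENTS = [
    {"COLOR", "COLOUR"},
    {"QUICK", "KWIK"},
    {"TRUE", "TRU"},
    {"FOR", "4"},
    {"TO", "2"},
    {"BARBECUE", "BARBEQUE", "BBQ"},
    {"YOU", "U"},
]

# Map every variant to one canonical token (the first member in sort order).
CANONICAL = {variant: min(group) for group in EQUIVALENTS for variant in group}


def index_key(mark):
    """Build a comparison key: canonical tokens joined with no spaces."""
    tokens = re.findall(r"[A-Z0-9]+", mark.upper())
    # Joining without spaces makes spacing variations collapse to one key.
    return "".join(CANONICAL.get(token, token) for token in tokens)


# Marks that differ only by a known variant or by spacing share a key.
print(index_key("Kwik Color") == index_key("QUICK COLOUR"))
print(index_key("CALLED IT") == index_key("CALL EDIT"))
```

This toy version works only on whole words, so it would not link a variant embedded inside a longer string. That gap is one reason expert review matters.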

Meeting the challenge of constant change

Like the world itself, the trademark landscape is constantly changing. CompuMark addresses the challenge head-on, continually updating our database to keep pace and ensure our customers don’t miss critical results. Here are just a few recent examples of data challenges CompuMark is tackling:

Brexit: When the UK leaves the European Union, nearly 1.2 million EU trademark records and a similar number of International Registrations for designs and Registered Community Designs will be added instantly to the UK registers. This is an enormous content updating project involving millions of records, all with new official numbers. Adding to the complexity is the fact that it is not yet possible to predict precisely when (or even if) this work will need to be performed.
Canada: When Canada joined the Madrid System of International Registration, there were significant format changes, including Nice Classification and a new term of validity (10 years), that needed to be addressed to ensure seamless content delivery.
EUIPO format changes: Following changes to trademark law, the EUIPO made content format changes in late 2018; all 26 national EU offices will follow suit in 2019/20.
China: Chinese records were switched to single-class format, with a separate record for each class. The Chinese trademark register is the world’s largest by a factor of 10, so the volume of content to be updated is massive.
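As a rough illustration of what a switch to single-class format involves, the sketch below splits one multi-class record into a separate record per class. The record layout, the field names and the `split_by_class` function are hypothetical, chosen only for this example. How official numbers are assigned to the resulting records is not covered here.

```python
def split_by_class(record):
    """Return one single-class record per Nice class in a multi-class record.

    Assumes a hypothetical layout in which `goods_by_class` maps each
    Nice class number to its goods and services description.
    """
    # Fields shared by every resulting record (everything except the class map).
    shared = {key: value for key, value in record.items() if key != "goods_by_class"}
    return [
        {**shared, "nice_class": cls, "goods": goods}
        for cls, goods in sorted(record["goods_by_class"].items())
    ]


multi_class = {
    "application_number": "EXAMPLE-1",  # placeholder, not a real number
    "mark_text": "SAMPLE",
    "goods_by_class": {25: "clothing", 9: "software"},
}
for single in split_by_class(multi_class):
    print(single["nice_class"], single["goods"])
```

Each multi-class record becomes several records under this scheme, which is why a change of this kind multiplies the volume of content to be processed.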

The trademark landscape will continue to expand and evolve. But CompuMark’s 90+ years of experience and trademark expertise, strengthened by significant investments in data quality, content management and leading-edge automation technology, help ensure that we are well equipped to handle whatever challenges emerge in the years ahead.

For CompuMark customers, that boils down to one thing: confidence in a complex world.

*Source: https://www.weforum.org/agenda/2019/04/how-much-data-is-generated-each-day-cf4bddf29f/