Discussion of Emerging Trends in Information Retrieval

This chapter analyses emerging trends in searching and accessing IP information, together with likely future directions in this field. It begins with how some general information retrieval (IR) and ranking algorithms are applied to patent data. While many of the search engines that implement these algorithms are proprietary, CAMBIA is in a unique position to provide detailed information on Dekko, the full-text patent search engine developed by its staff in Canberra, and on aspects of its design that affect performance with respect to the major patent data sets.

We then turn to needs in searching and accessing IP information that are of special relevance to Australian stakeholders:

  • Optical Character Recognition (OCR) of image data
  • Searching in international databases and the languages that they contain
  • The specialised art of searching on the basis of certain types of claims to determine the coverage of patents. This section has a special focus on searching some of the types of matter claimed in many life sciences patents, such as biological sequences and chemical structures, because of the prominence of these types of claims and these art areas in Australian patenting.

One of the key factors driving the development of IR technologies has been the explosion in the quantity and diversity of machine-readable information published on-line. Another factor has been the increase in the power and storage capacity of computing hardware relative to its cost. For example, Google allows sub-second searching across a claimed eight billion web pages. This scalability, together with reliability, has been achieved by harnessing the power of thousands of cheap Linux PCs rather than high-end servers or mainframes.
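
The idea of spreading a search across many inexpensive machines can be illustrated with a minimal sketch. It assumes the collection is partitioned into shards, each holding its own inverted index, with a query sent to every shard and the partial results merged. The class names `Shard` and `ShardedIndex`, the modulo routing rule and the raw term-frequency scoring are hypothetical simplifications for illustration only. They do not describe how Google or any other production engine is actually built, and the shards here are queried in sequence rather than in parallel on separate machines.

```python
from collections import Counter, defaultdict
import heapq


class Shard:
    """One partition of the collection, standing in for a single cheap machine."""

    def __init__(self):
        # Inverted index: term -> Counter mapping doc_id to term frequency.
        self.postings = defaultdict(Counter)

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term][doc_id] += 1

    def search(self, query, k):
        # Score each document by summed term frequency (a deliberate simplification).
        scores = Counter()
        for term in query.lower().split():
            for doc_id, tf in self.postings.get(term, {}).items():
                scores[doc_id] += tf
        # Return this shard's k best hits as (score, doc_id) pairs.
        return heapq.nlargest(k, ((score, doc_id) for doc_id, score in scores.items()))


class ShardedIndex:
    """Routes documents to shards and merges per-shard results at query time."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def add(self, doc_id, text):
        # Hypothetical routing rule: integer doc_id modulo the number of shards.
        self.shards[doc_id % len(self.shards)].add(doc_id, text)

    def search(self, query, k=3):
        # Scatter the query to every shard, then gather and merge the partial lists.
        partial = [hit for shard in self.shards for hit in shard.search(query, k)]
        return [doc_id for _score, doc_id in heapq.nlargest(k, partial)]


index = ShardedIndex(num_shards=3)
index.add(1, "patent claim for a dna sequence")
index.add(2, "chemical structure patent")
index.add(3, "trade mark search")
index.add(4, "dna sequence search sequence alignment")
print(index.search("dna sequence", k=2))
```

Because each shard only ever scores its own documents, adding machines increases capacity without changing the query logic, which is the property the paragraph above attributes to clusters of commodity hardware.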

Intellectual property searching covers a wide range of activities. A scientist interested in examples of how to use the latest innovations in a field will have needs very different from those of an examiner charged with determining patentability. A trade mark examiner searching for evidence of use will face very different challenges from an examiner searching for deceptively similar marks. An IP officer from a biotech start-up conducting a freedom-to-operate (FTO) search needs access to data from the jurisdictions of the company's key markets as well as the jurisdictions in which its research and production take place.

The way in which IP information is accessed by IP professionals and the public has changed greatly over the years. Card indexes, microfiche, mainframe-based systems and midrange systems with dedicated-line or dial-up access have largely been superseded by web-based search interfaces. “Web services”, such as the European Patent Office (EPO) Open Patent Services (OPS), are increasingly allowing organisations and even individuals to design their own interfaces to remote databases.