Strengths and Weaknesses of Search Sites

The following is an analysis of different IP search sites currently available.

Selected National Patent Offices

IP Australia

The site is difficult to navigate because it is in multiple sections with patents from different groupings available either in the web interface or via a downloaded software client:

  • Patsearch (running on the Patent Administration and Management System – PAMS database): This database is accessible online, containing bibliographic and legal status information on innovation patent applications filed from 24/05/2001 and complete and provisional patent applications filed from 05/07/2002. Search fields for Patsearch include:
    • Patent number
    • Applicant/inventor name
    • Patent title
    • Option to exclude lapsed/withdrawn/ceased/expired documents
    • Option to exclude PCT applications and documents with non-Australian priority data
    • Option to restrict publication/filing date range
    • IPC
  • Patent Administration System
    PatAdmin, part of the Patents Mainframe Databases.

    This database is accessible via a software client that must be downloaded from a different page on the IP Australia website and installed on a local machine. PatAdmin provides bibliographic and legal status information on patent documents (applications and granted patents) filed between 01/1979 and 04/07/2002. Search fields for PatAdmin (via the software client) include:

    • Application ID
    • Patent number
    • Provisional number
    • PCT number
    • WIPO number
    • Applicant/inventor name
    • Option to restrict date range

  • Patent Indexing System

    PatIndex, part of the Patents Mainframe Databases

    This database is also accessible via a software client downloadable from the IP Australia website. It provides the title and number of patent documents (applications and granted patents) filed between 01/1979 and 04/07/2002, and seems to cater particularly for searchers looking for prior art in these patent documents by IPC code. Patent titles and corresponding document numbers are searched by one or at most two codes, connected by AND, OR or NOT. The resulting list can be narrowed down further by conducting more IPC code searches, restricting the application year, excluding lapsed/ceased patent documents, or using keywords in the bibliographic information (e.g. word in title or applicant).

  • AU published patent data (AAPS) searching main page: This database is accessible online, containing full text images of published AU patent documents from 1975 and bibliographic information of AU patent documents between 1920 and 1974 (data incomplete). A ‘quick’ search for patent documents can be done by:
    • Application number
    • Patent number
    • Title

    The ‘Advanced’ search provides searching for terms, dates, and numbers stated in the bibliographic information.  Two terms/dates/numbers are available per search.

  • Patent Specifications Main Page: This database is accessible online, containing full text images of AU-A and AU-B patent documents from 17/12/1998 (these documents have navigation to the abstract, description, drawings and claims) and AU documents that did not enter via the two international routes (Paris and PCT) from 1975 (these do not have navigation to the separate sections of a patent document). Search for patent documents can be done by:
    • Application number
    • Patent number

  • Application/serial number concordance main page

    This search finds corresponding numbers of patent documents that have been renumbered from the Patent Mainframe database system when they were moved to the PAMS database, using the application number (old or new system) or the patent serial number.

  • Conclusion

    None of the patent search/retrieval engines offer a term search within particular fields of document text, i.e. claims, description, full text. Searching for information on the Patents mainframe can be very difficult for a public, non-examiner user because of the requirement for knowledge of the commands used on the telnet interface.   Full understanding of the coverage of each database (Patents mainframe and PAMS) is needed for the searcher to be able to retrieve information on particular documents.

    IP Australia’s website acknowledges that there are issues with data completeness and quality and a need to consult several distinct databases to conduct basic searches.   It is further necessary to go to INPADOC for much of the status information, and we found many issues with accuracy and ongoing updates in the INPADOC data.

    However, the mere fact that data are distributed over several systems during periods of redevelopment need not in itself impact negatively on public use of that information. For example, the Australian Trade Mark Online Search System (ATMOSS) has enhanced public access while making use of data that are input and maintained on a legacy mainframe system as well as a more recent web-based on-line filing system.
    From the perspective of Australian innovators, it would be desirable to see:

    • A single integrated interface that allows “one stop” authoritative searching of Australian patent documents, as well as legal status, and file wrappers
    • Full text searching of the claims and specifications sections of Australian patents

European Patent Office (EPO)

The EPO site provides wide data coverage with legal status and family data (INPADOC) of patent documents for over 40 jurisdictions. INPADOC patent family documents are summarised by patent number and not presented separately by publication (e.g. EP123456-A1 and EP123456-A2 are summarised as one patent document with A1 and A2 information contained in a single file).

A major drawback of the EPO public search site is that it doesn’t support full text searching. Searches can only be done for one database at a time and no term searches can be done in claims or full text.  For non-English patent documents, searching for terms in the title or abstract cannot be done and no manual Boolean search options are available.  Furthermore, ‘description’ and ‘claims’ fields in the individual document information page are sometimes replaced by a corresponding WO or EP document if the information from the actual jurisdiction is not available. This can be misleading if you want to conduct claims analysis on a granted patent for a particular jurisdiction.

The EPO site does allow bibliographic searching of patent documents from a number of jurisdictions, as well as access to patent family and legal status data through INPADOC. However, the EPO states that the INPADOC legal status is incomplete or not up to date, and therefore the information provided on the document information page may not be accurate. This same limitation applies to all providers that use the INPADOC data purchased from the EPO.

The epoline® Online Public File Inspection service  gives access to prosecution history and status.
In addition, the EPO has been active in the development of software, such as EPOQUE Net, for the internal use of national offices as a revenue generation mechanism set up in co-operation with the private sector.5

5 http://www.empolis.com/en/20D6DCCC63D14F7F80C2163B990DBFA5.php

United States Patent and Trademark Office (USPTO)

The USPTO was one of the first offices to improve their public search functionality through initiatives such as web based full text searching.  The USPTO also allows subscriber based FTP access to a comprehensive range of timely, high quality machine readable IP data, enabling third parties to offer innovative services that add functionality beyond that offered by the USPTO public search site.   For example, this is the mechanism by which CAMBIA obtains full US patent and US application data.

The USPTO site is best for quick viewing or capture of text, although only US data is available here. The search interface is simplistic, with sophisticated searches being difficult, if not impossible. Viewing images requires a specialised TIFF viewer. The output is a long list of patent numbers and titles, with the corresponding links leading to full text.

The USPTO provides online access to status information through the Patent Application Information Retrieval (PAIR) system, and there is a database of assignees that is updated regularly.

Japan Patent Office (JPO)

The JPO has an effective public search interface for users of Japanese, but provides only limited English language support. As an example, for design patents, the English interface only provides design searches by registration number or by application number of rejected designs, whereas the Japanese interface provides additional search options, including the text search option that was introduced with the incorporation of the electronic publication of registered designs from January 2000.

The JPO has been very active in the area of machine translation (MT) with a view to facilitating access in Japanese to patent documents published in European languages.  There is less accessibility in the other direction. Electronic documents (1993 onwards) have machine translations of Japanese patent documents into English, which are of incomplete quality and still under development but sufficient for a person skilled in the art to be able to understand the content.

The JPO provides term searches within claims, but search terms cannot be easily refined. For example, returning to the search page with the browser’s ‘back’ button and modifying the search term does not allow another search on that page. Instead, the page must be refreshed and all terms and fields redefined and/or re-entered.

Patent documents before 1993 only exist as scanned images, so the claims cannot be accessed without obtaining the entire document. Searching for terms in full text is not available (possibly because the pre-1993 documents have not been converted into electronic format). Manual Boolean expression search (to create a personalised Boolean expression) is not available.

We found that inventor names were missing from certain PCT applications (possibly all PCT applications published before 1993) that entered national phase in Japan, making it difficult to search for patents using inventor names (PDF files of PCT applications published after 1993 also do not seem to have the inventor name stated in the national phase application).

Brief legal status is provided for Patent Abstracts of Japan (PAJ) documents where available.

State Intellectual Property Office (SIPO) China

Use of this site is free for downloading patent documents, with images available as TIFF files, and legal status can be obtained, as at the CNIPR site (see below). However, users may need to download a software navigator to view Chinese patent specifications, and the site is slow in retrieving data. SIPO also offers China Intellectual Property Net (CNIPR), which provides a Boolean search option with capabilities to refine the search. It is also possible to search legal status by application number or publication date.

CNIPR provides information about whether a patent is granted or lapsed, and about whether or when a patent application has been requested for examination, abandoned, or deemed withdrawn, etc. A synonym search for words with the same or similar meaning is also available.

However, the site is not free: a patent search at the CNIPR site requires registration and the purchase of reading cards, although searching (without downloading) for Chinese and international patents is free for cardholders. Reading cards can be purchased at three levels: cards valued at 500, 1000 and 3000 Chinese Yuan (approximately 80, 165 and 485 Australian dollars respectively). These cards allow users to download 500, 1250 and 6000 pages of Chinese patent specifications respectively, or 0, 5000 and 20,000 international patent documents respectively.
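The effective price per page at each card level follows directly from these figures. The short sketch below is illustrative arithmetic only: the prices and page allowances are those quoted above, while the data layout and function name are our own assumptions and not part of the CNIPR service.

```python
# Reading-card tiers as quoted in the text: card price in Chinese Yuan and
# the number of pages of Chinese patent specifications the card allows.
# The structure and names are illustrative, not CNIPR's own.
CARD_TIERS = [
    {"price_cny": 500, "cn_pages": 500},
    {"price_cny": 1000, "cn_pages": 1250},
    {"price_cny": 3000, "cn_pages": 6000},
]


def cost_per_page(price_cny: float, pages: int) -> float:
    """Return the price per downloadable page in Chinese Yuan."""
    return price_cny / pages


for tier in CARD_TIERS:
    rate = cost_per_page(tier["price_cny"], tier["cn_pages"])
    print(f"{tier['price_cny']} CNY card: {rate:.2f} CNY per page")
```

On these figures the per-page price falls from 1.00 Yuan on the smallest card to 0.50 Yuan on the largest.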

The searchable parts of a patent at both sites are limited to the bibliography page including:

  • Publication Number
  • Publication Date
  • Title, Address
  • Application Number
  • Application Date
  • Inventor Name
  • Abstract
  • International Classification
  • Applicant(s) name
  • Attorney/Agent.

Korean Intellectual Property Office (KIPO)

A search facility in English is offered via the Korean Intellectual Property Institute. At this site, only the interface for “Patent & Utility Model Search” is translated into English, while more is available for users of the Korean language to perform keyword searches. It is possible to retrieve documents by using English search terms only if the title or abstract contains English words.

There is also a Korean Patent Abstract (KPA) search with the interface in English, but only for Korean patents published with English abstracts, from 1979 for examined patent publications and 2000 for unexamined patent publications.   The latter does, however, provide legal status information in English.

World Intellectual Property Organization (WIPO)

WIPO’s data coverage is small, with PCT documents available only from 1997, severely limiting its usefulness. One advantage, though, is the links to other documents such as the examination report and priority application(s). However, international reports can only be accessed as PDF files, which are less readily searchable.

WIPO recognises the limitations of its data as a resource for finding prior art and improving the quality of patents internationally, and is working to improve this. WIPO has added full text search functionality for recent PCT applications to its on-line site and further improvements are promised. For the subset of applications that have this search functionality, terms can be searched within full text and claims. Terms can also be searched in French.

WIPO employees have been cooperating with CAMBIA and recognise consonance with the goals of CAMBIA’s BiOS Initiative.6  On the WIPO site, as on CAMBIA’s Patent Lens, it is possible to manually create a Boolean expression and it has the Boolean operator ‘NEAR’. The period range can be specified (by weekly increments only). Electronic descriptions are available and the claims can be readily accessed.
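To make the idea of a proximity operator such as ‘NEAR’ concrete, the following is a minimal sketch of one way such a test could work. It is not WIPO’s or CAMBIA’s implementation: the function name, the word-distance window and the sample sentence are all assumptions chosen for illustration.

```python
def near(text: str, term_a: str, term_b: str, max_distance: int = 5) -> bool:
    """True if term_a and term_b occur within max_distance words of each other."""
    # Crude tokenisation: split on whitespace and strip common punctuation.
    words = [w.strip(".,;:()").lower() for w in text.split()]
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(i - j) <= max_distance for i in pos_a for j in pos_b)


# Invented example sentence in the style of a patent claim.
claim = "A transgenic plant comprising a promoter operably linked to a gene"
print(near(claim, "plant", "promoter", max_distance=3))  # True
print(near(claim, "plant", "gene", max_distance=3))      # False
```

A real engine would evaluate this against a positional index rather than rescanning the text, and the size of the window is a design choice that each provider sets differently.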

For status information, each individual document has a link to a chart describing the national phase entry deadline for PCT member States.


No-Charge Providers

Espacenet

While this site has improved considerably over the years, the extent of coverage has not increased significantly and substantial disadvantages limit its usefulness.  Notably, the results list is disorganised; no obvious parameters dictate the order of display, nor is there any ability to impose some order on it.  Thus, unless the search is limited to just the EP data, it is a formidable task to wade through the results.  In addition, full PDFs are not yet available for download.

Surf IP

This site, which has both a free-of-charge and a “premium” fee-based component, is maintained by the Singapore government and advertises a large data set with access to at least nine national patent office databases and other technical, business and search engine sources. A few datasets are not found elsewhere (Singapore, Taipei and Thailand). It was also of particular interest for access to the Korean and Japanese patent office databases, but no documents from those databases could be retrieved during our trials (repeated error messages on the results page). The information retrieval speed of this site is extremely slow (over a minute per search), and in some sessions every search attempted resulted in either a time-out or no results.

Although every query retrieved irrelevant patent documents, our analysis of the sources of these problems may be instructive for IP Australia if consideration is being given to Boolean interface design. We found that many such errors were due to a failure of the Boolean bracket function, which made it difficult to specify combinations of search terms. Also, unlike in Google’s default search, structured search terms do not seem to be connected by Boolean AND but rather by OR, which pulls out documents that lack one or the other term, particularly among documents from the EPO and UK Patent Office. There was, unfortunately, also no wildcard option available.
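The practical difference between connecting terms with AND and with OR can be shown on a toy index. The sketch below is purely illustrative: the three documents, their identifiers and the function names are invented, and it makes no claim about how Surf IP is actually implemented.

```python
# Toy collection: document id -> text. Ids and texts are invented.
DOCS = {
    "EP-1": "transgenic plant promoter",
    "GB-2": "plant irrigation system",
    "US-3": "promoter sequence in yeast",
}


def build_index(docs):
    """Map each lower-cased word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


def search_and(index, terms):
    """Documents containing every term (Boolean AND)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, terms):
    """Documents containing at least one term (Boolean OR)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.union(*sets) if sets else set()


index = build_index(DOCS)
print(sorted(search_and(index, ["plant", "promoter"])))  # ['EP-1']
print(sorted(search_or(index, ["plant", "promoter"])))   # ['EP-1', 'GB-2', 'US-3']
```

With OR as the connector, two of the three hits contain only one of the two terms, which is the pattern of irrelevant results described above.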

CAMBIA Patent Lens

For the purposes of this report we sub-contracted an expert US patent attorney to review the CAMBIA Patent Lens. The consultant’s opinion was that compared to other no-cost providers, this site is easier to use and better organised, consisting of a well laid out input page, with search capabilities of multiple fields, and a clean results page that can be sorted on a number of parameters.

The consultant remarked that the coverage of countries is good and the ability to search full text documents is excellent – although clearly it would be desirable to extend the coverage of countries.  Notably, Australian data was only available for a short period, a shame because the site is one of very few that can provide full text searching of Australian data.

Currently the site is also limited (by IPC classification) to life science documents, but OCR is underway to provide patent documents in all fields within the next six months.

CAMBIA’s Patent Lens uses a fast and powerful query language developed in house for the specific purpose of full text patent searching.  The same query language is used in the modification that was made earlier this year to incorporate INPADOC data.  Due to proprietary restrictions around many other search engines, we are able to give a high level of detail only about this Canberra-developed language, Dekko, in Chapter 2 of this report.

A significant advantage of the Patent Lens is the ability to view the full patent document, which can be seen at the user’s option as either text with highlighted search terms or PDF image.  Giving the user options is part of the site ethos, and the developers are contactable by users if bugs or suggestions are to be communicated.  Relevance ranking is in development and this also will be user-configurable.


Commercial Providers

Thomson-Derwent Providers

The Thomson Corporation, a Canada-based electronic publishing conglomerate, has consolidated its position as a key provider of patent information through the acquisition of its competitors. In 1984, it purchased all remaining shares in Derwent, a key provider of patent databases. In 2000, it purchased Dialog, in 2002 it purchased Delphion, and in 2004 it acquired Micropatent, one of its main competitors. There have also been numerous smaller acquisitions along the way. As well as consolidating its Intellectual Property services, Thomson has continued to expand into the sciences, health and legal publishing and online databases.

Although Thomson-Derwent has acquired several of these larger data providers in recent years, thus far they have not unified many of the sites. Each of the individual sites has a different format, different set of data, different interface and different pricing.  We chose the major sites mentioned above for our analysis.

Delphion

Delphion’s speed has dramatically improved but the search interface is still clunky, and searching for a document by its number is a lesson in all the ways one can make formatting “mistakes”. It is possible to limit a search by date range, use wildcards, and search for terms near each other. The fields shown in the results list are highly configurable and there is standardization of assignee (applicant) names. A limitation is that the results list is truncated at 500 results. When a document is viewed, post-grant status is available. Some analytical tools of varying utility are available, though some (e.g. graphical citation tools) are at extra cost.

Delphion does carry the human-edited Derwent titles and abstracts database, and features forward and backward citation linking. It also clusters results by subject, but it is not clear on what criteria this clustering is based.

DialogPro and DialogWeb

The “Web” version is actually more sophisticated; for example, complex Boolean search strategy can only be implemented on the web version.  DialogWeb also has many more non-patent databases that can be accessed for a price.  Data coverage in the Web version includes China (bibliographic data and abstracts in English), as well as post-grant status and litigation information.  Several useful options for sorting the output are found in the Web version.  Some may find its command driven interface challenging.

Micropatent

Only the Patent Web portion was assessed, because each area requires a separate subscription.  The search interface allows for some rather sophisticated search strategies, such as using ECLA codes, complex Boolean structure, limiting searches by publication date range and using a wildcard for truncation.  A command-driven interface that would allow increased flexibility is available at a higher subscription fee.  Results are returned quickly and refining searches is easily done from the search history interface.  The site annoyingly opens new windows for everything, making navigation difficult.


Independent Fee-Based or Subscription-Based Providers

Get-the-patent

This site was set up mainly to provide document images at a low price on input of a patent number, but it offers some searching capabilities as well. The interface allows complex Boolean structure but limits a search to a single year, and the search engine query language produced bugs in many of our searches. Searching can be done by IPC codes, but not in all databases. The datasets are fairly extensive, but only the US is full text. Its strength remains bulk downloads of patent documents. The bugs take this site out of the race when it comes to serious searching, though it could still be useful for downloading full images. These are provided in a proprietary format (not PDF) that is 1/6th the size of PDFs but requires its own viewer.

Lexis

This service has an exemplary interface for inputting Boolean searches.  Search terms can be applied to just about every field of a patent document, though it is challenging even for an expert patent searcher to find the help for formatting of some fields like IPC.

Unfortunately the output of results was not exemplary, with no formatting of the results list possible, and access is expensive.

Minesoft PatBase

This engine was originally developed for a specific client and is now available to others on a fee basis. It is a powerful search site, though unfortunately navigation of the site is quite challenging and there are some glitches and slowness.

As well as allowing wildcards, complex search structure, etc., it has a very useful browse function on names of inventors and assignees. Using this function, it is possible to capture documents by a group or individual regardless of misspellings, typos and inconsistencies in the name (e.g. IBM = International Business Machines; Du Pont = DuPont).7
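One simple way to group such name variants is to reduce each name to a normalised key. The sketch below is an assumption about how this could be approached and is not a description of PatBase: the alias table and function names are hypothetical, and the approach handles spacing, case and known abbreviations but not genuine typos, which would need fuzzy matching or a browsable name index of the kind described above.

```python
import re
from collections import defaultdict

# Hypothetical alias table mapping a known abbreviation to a canonical key.
ALIASES = {"ibm": "internationalbusinessmachines"}


def name_key(name: str) -> str:
    """Collapse case, spacing and punctuation so variants share one key."""
    key = re.sub(r"[^a-z0-9]", "", name.lower())
    return ALIASES.get(key, key)


def group_names(names):
    """Group raw assignee names by their normalised key."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return dict(groups)


names = ["Du Pont", "DuPont", "IBM", "International Business Machines"]
for key, variants in group_names(names).items():
    print(key, "->", variants)
```

Here the four raw names fall into two groups, one per company.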

Like the step searches of WIPS, initially the results come back as the number of results.  A quick view of titles reveals if the search is on the mark or not.  The output is thorough, with displayed fields highly controllable.

7 This would be quite valuable for a public user interested in freedom-to-operate considerations, and CAMBIA is looking into a similar function for the Patent Lens.

PatentCafe

The site uses latent semantic indexing searches as an alternative to Boolean searches. In the latent semantic search, a query consists of a sentence or a paragraph, from which the concept of the sentence or paragraph is extracted to search for relevant patent documents. When our patent searching experts used text from claims to search, we found that every search tended to return a very large number of documents, the relevance of which was sometimes dubious.  It also missed some of the known relevant patents where these used a somewhat different claims vocabulary.

However, the public user can benefit by a feature of the latent semantic indexing search that the patent results list page provides a set of ‘related terms’ (synonyms) generated from the context around the initial query words for each patent document that has been retrieved. This can be useful to further exclude/include patent documents on the list.  Unfortunately, it doesn’t also return the original query for editing, so refining search queries is less straightforward than for typical Boolean expressions.

Another useful feature is that the individual document information provides ‘patent references cited’ and ‘cited by’ (this seems only available for US patent documents and does not guide properly to WO documents, probably because of the metadata marking).8

The website has some glitches, for example optional fields that proved to be mandatory.  It doesn’t work well with all browsers and all operating systems, and there is no provision of INPADOC family or status information, or prosecution history.

8 As such marking is not greatly different from other features the Patent Lens already incorporates, this is a feature that CAMBIA is exploring.

Questel-Orbit

The search input on this French company’s site is quite good; searches by IPC, ECLA and date ranges are easily set up, as are complex Boolean structure searches. Search terms are readily limited to be near each other or ordered. A search history allows previous searches to be combined using Boolean connectors. This site also has more extensive data coverage (countries) than the others, legal status data, and value-added patent family data.

The weakness of this site is the results output; it is fixed order by patent number, although the results can be downloaded in a variety of formats.

One very nice feature in the output comprises links that allow the searcher to move through the database forward or backward in time following patent citations, a feature that CAMBIA is considering for incorporation into the Patent Lens.

Software for Intellectual Property (SIP)

This is a German site that has progressed over time from specializing in patent family data to a search engine with several features (e.g. ability to search using IPC codes, “near” searches, and even “fuzzy” searches for similar sounding words).

The user interface offers Boolean and structured query facilities that simply do not work, so we weren’t able to perform our test queries.

The output format is not malleable, but it integrates well the family and legal data with the patents.

PatentWarehouse

Univentio is a company with a long history in the IP information industry. For example, Univentio produced the “WIPO/PCT patents fulltext database” used by Thomson Dialog. Univentio has recently come under the control of LexisNexis, which has not yet been able to provide new subscriptions to the Univentio patent search site, the PatentWarehouse, although this is probably a transitory problem which will soon be addressed under the new ownership arrangements. Accordingly, the following information was obtained from documentation rather than actual use.

The PatentWarehouse database contains intellectual property information, including full text of patent applications and granted patents, and bibliographic information on patents, utility models and designs. Full text of patent documents is covered for US, EP, GB, JP, FR, DE, AU, AT, BE, CA, DK, LU, MC, NL, NO, PT, ES, SE and CH, and bibliographic information is covered for 70 jurisdictions. Five search options are available, including searches by complex Boolean expressions. The results page provides a list of retrieved patent documents with an indication of legal status using ‘Trafficlight’, a unique feature provided by PatentWarehouse. Patents that are in force (red), withdrawn/lapsed (green), or possibly in force (yellow) can be identified for each patent document by the three different colour codings. The claimed coverage of full text Australian patents is noteworthy.

WIPS

This newcomer is run by a Korean company and uses an interface and output reminiscent of Delphion but with improvements. The search input is flexible and allows wildcards, near searching, date ranges, and assignee name standardization. The output features a “step search”, which outputs only the number of results; when a desired number is returned, the list of results is easily viewed. Data coverage is standard, with only US and EP documents provided as full text. The ability to modify the results lists is complemented by the ability to download in txt, xls, or mdb file types. Clustering of results by IPC, applicant, application date, patent date and keyword is helpful.

One downside to this site is that to view images, proprietary software is required, which works only with certain browsers, and some graphical presentations require a Java Virtual Machine to be installed in the browser.  We also found a number of bugs in the search. For example, full text search for words within a claim appears not to have worked.


Non-Patent Search Sites (for prior art)

Non-Patent Searches

For any type of prior art search, whether it be an FTO search for a particular technology or a patentability search for an invention, databases that contain non-patent or scientific literature are important.  This is because a vast amount of technology development is published in scientific journals  and in many art areas there is a substantial amount of publication that predates patent literature.

The examiners we spoke to indicated considerable use of general search engines such as Google for this purpose. These can certainly deliver large amounts of information, but the results are not relevance ranked for patent searchers’ needs, and extremely relevant documents can be missed. Citation databases, by contrast, contain literature that has been selected for publication by relevant bodies, the selection having been performed at least ostensibly by “persons skilled in the art”.

To represent the many available free and paid databases, we chose one of each.  Here PubMed and Current Contents Connect are briefly explained and compared.  However, we note that many other databases are indeed available, and our comparison in this report concentrates on typical features.

Before entering into this comparison, we also mention WIPO’s efforts to create a common dataset of journals that can be used as sources of prior art in specialised fields, the Journal of Patent Associated Literature, JOPAL.  Currently this supplies only bibliographic data such as titles, authors and dates, but is the closest non-paying alternative to Current Contents Connect and is set up specifically for patent searchers.

PubMed

PubMed is a free search engine maintained by the National Library of Medicine (NLM) at the National Institutes of Health (NIH), and is accessible through the National Center for Biotechnology Information (NCBI) website. The databases that are covered by PubMed are:

  • MEDLINE – database with over 12 million citations from over 4800 journals in the biomedical field from the mid 1960s to current.
  • OLDMEDLINE – database containing 2 million citations from biomedical journals around the world between 1950 and 1965.
  • In Process Citations – bibliographic information and abstract of articles not yet added to MEDLINE (record updated daily between Tuesday and Saturday).
  • Publisher Supplied Citations – citations received in electronic format from publishers.
  • ‘Out-of-scope’ citations – from general science and chemistry journals whose life science articles are indexed for MEDLINE, together with full text articles submitted to PubMed Central by additional journals in the life science field.
  • MeSH: Medical Subject Headings, a database containing a vocabulary thesaurus that is used for indexing articles in PubMed (similar to the IPC, but without codes). The MeSH database link finds the MeSH term based on a search, which then can be used as ‘term [MeSH]’ to search for articles in that particular MeSH category.  MeSH terms can be combined, as well as narrowed or broadened from the initial MeSH term that was found from a term search.
  • Journals Database

A basic search for retrieving citations is conducted with a single search box.  Words in the following fields can be searched, often without the need of defining the field:

  • Key concepts
  • Author names (if the name is also a subject word, a search tag [au] can restrict the search to within authors)
  • Journal title

Searches can be restricted in the ‘Limits’ option by using criteria such as language, publication type, publication date, and MeSH terms.  Other options that are available include:

  • Search for phrases
  • Wildcard ‘*’
  • Boolean operators AND, OR, NOT (default is AND)
  • ‘History’ – combines the results of earlier searches
  • ‘Preview/Index’ – displays a quick count of the hits from a particular search, or provides an index of terms within a field based on a term search
  • ‘Single Citation Matcher’ – identifies a specific article using bibliographic information
  • ‘Related Articles’ link – retrieves a set of articles relevant to the area of the article of interest, listed in order of relevance
  • Special search categories and queries designed for medical practitioners and health service researchers
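The field tags, wildcard and Boolean operators described above can be combined into a single query string. The following is a minimal sketch only: the helper names (`tagged`, `combine`, `esearch_url`) and the search terms are illustrative, and the use of the NCBI E-utilities `esearch` address is an assumption that is not discussed in this report. The sketch builds the request address but does not send it.

```python
from urllib.parse import urlencode

# NCBI E-utilities search address (an assumption; not covered in the text above).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def tagged(term, tag=None):
    """Quote multi-word phrases and append an optional field tag such as au or MeSH."""
    text = f'"{term}"' if " " in term else term
    return f"{text}[{tag}]" if tag else text


def combine(operator, *parts):
    """Join query parts with AND, OR or NOT and bracket the result."""
    if operator not in {"AND", "OR", "NOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    return "(" + f" {operator} ".join(parts) + ")"


def esearch_url(query, retmax=20):
    """Build (but do not send) a PubMed search request for the query."""
    return ESEARCH_URL + "?" + urlencode({"db": "pubmed", "term": query, "retmax": retmax})


# Illustrative terms only: a MeSH-restricted term, a wildcard term and an author.
query = combine(
    "AND",
    tagged("Agrobacterium", "MeSH"),
    tagged("transform*"),
    tagged("Smith J", "au"),
)
print(query)
print(esearch_url(query))
```

Spelling out AND explicitly, although it is the default, keeps the query readable when it is later combined with OR or NOT groups.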

Search results are displayed with the author name(s), article title, journal name, issue and page, and the PubMed ID.  The author names are linked to the full bibliographic information, including the abstract where available (1975 onwards).  Each hit has a link to the abstract, to free full text from the journal or to free full text in PubMed Central (PMC); a ‘Related Articles’ link; and ‘Links’ to various other resources or NCBI databases.  The ‘Link out’ button in the ‘Links’ section provides links outside of NCBI that are related to the article.

ISI Web of Knowledge

General
Current Contents Connect is a paid citation search engine that forms part of ISI Web of Knowledge, provided by Thomson Scientific (a Thomson Corporation group).  The database contains bibliographic information on articles from approximately 7600 journals and 2000 books published since 1998.  The areas of research covered are agriculture, biology and environmental sciences, social and behavioural sciences, clinical medicine, life sciences, physical, chemical and earth sciences, engineering, computing and technology, arts and humanities, business collection, and electronics and telecommunications collection.  The search interfaces are user-friendly: tag keys are provided on all search pages, and the Advanced search page includes a table of available field tags and Boolean operators.  Results are compiled in a list showing the number of hits for each search, and the information from the retrieved articles (titles, authors, abstracts, and full text where available) is easily obtained.

ISI Web of Knowledge contains the following main search engines:

  • Web of Science – search engine with database containing articles in life sciences (Science Citation Index Expanded®, 1945-), social sciences (Social Sciences Citation Index®, 1956-), arts and humanities (Arts and Humanities Citation Index®, 1975-) and chemistry (Index Chemicus®, 1993-; Current Chemical Reactions®, 1986-, with INPI (Institut National de la Propriété Industrielle) archives from 1840 to 1985), together with the Century of Science initiative, whose files date back to 1900.
  • Current Contents Connect – search engine with database containing information on articles from 7600 journals and 2000 books published since 1998.
  • ISI Proceedings – database containing conference proceedings in the science, social science and humanities fields.
  • Derwent Innovations Index – database containing over 23 million patents from 40 jurisdictions, dating from 1963.

In Current Contents Connect, information on publications from the following areas of research is available10 (upon subscription to each area) for searching from 1998 to present:

  • Agriculture, biology and environmental sciences
  • Social and behavioural sciences
  • Clinical medicine
  • Life sciences
  • Physical, chemical and earth sciences
  • Engineering, computing and technology
  • Arts and humanities
  • Business collection
  • Electronics and telecommunications collection

Search options

There are four search options:

    1. Classic search: Search by terms or phrases.  Boolean operators AND, OR, NOT can be used to combine terms within a particular field (selected from a range of bibliographic information, or topic subject).  Search results, compiled as a list with the number of hits, can be combined with AND or OR to further narrow or broaden the search.
    2. General search

Search by terms or phrases in the following fields (an index is provided for each field for browsing):

      • Topic/subject
      • Author/editor
      • Group author
      • Source title (journal title)
      • Address (author’s institution)

Terms can be connected with the Boolean operators AND, OR, NOT and SAME, and language and document type can be restricted. This is the only search option from which the search history cannot be accessed and edited, although a search conducted with this option will be included in the history.

    3. Advanced search: Search by creating a Boolean expression.  Operators AND, OR, NOT, SAME can be used to connect terms, and field tags can be added to each term to define the field (list of tags provided on the search page).  Terms can be bracketed to indicate priority searching.
    4. Browse: Search for articles by browsing by journal titles (linked in alphabetical order) or by research area.
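The way combining earlier searches narrows or broadens a result list can be illustrated with ordinary set operations. This is a sketch under stated assumptions, not a description of how the service is implemented: the `history` dictionary, the record identifiers and the `combine_sets` helper are all hypothetical, and the SAME operator is not modelled.

```python
# Hypothetical search history: search number -> set of retrieved record IDs.
history = {
    1: {"A1", "A2", "A3", "A4"},  # e.g. a topic search
    2: {"A3", "A4", "A5"},        # e.g. an author search
}


def combine_sets(history, left, operator, right):
    """Combine two earlier searches from the history by their search numbers."""
    a, b = history[left], history[right]
    if operator == "AND":   # records found by both searches (narrows)
        return a & b
    if operator == "OR":    # records found by either search (broadens)
        return a | b
    if operator == "NOT":   # records in the first search but not the second
        return a - b
    raise ValueError(f"unsupported operator: {operator}")


print(sorted(combine_sets(history, 1, "AND", 2)))
print(sorted(combine_sets(history, 1, "OR", 2)))
print(sorted(combine_sets(history, 1, "NOT", 2)))
```

The number of hits shown for a combined search corresponds to the size of the resulting set, so an AND combination can never return more hits than either of its inputs.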

Results page

The results of searches are listed in ‘history’, from which the articles retrieved by a particular search can be viewed by clicking on the number of hits in the search results list.  Retrieved articles are presented with the author name/s, title, abbreviated journal name, issue, page and publication year.  Each article title is linked to detailed bibliographic information on the article, including the abstract.  Access to full text is indicated under each article.  Bibliographic information on selected articles can be retrieved and exported to personal citation management software such as EndNote and Reference Manager (both by Thomson).

Traditional Knowledge

Although there has been heated discussion in the intellectual property community of the importance of averting IP protection awards that cover genetic material and processes that have long been known to traditional communities, at this point there is little access to traditional knowledge databases that may contain substantial prior art.  Efforts to overcome this deficiency are being made at SIPO, the USPTO and WIPO.

WIPO has recently reviewed a wide range of traditional knowledge-related periodicals11 and agreed that thirteen of these should be added to the list of published items of non-patent literature forming part of the PCT minimum documentation under Rule 34. The Meeting also recognised the importance of identifying further traditional knowledge-related databases suitable for use in international searches and agreed that the issues involved should be considered as part of a comprehensive review.  WIPO is developing a Search Guidance Intellectual Property Digital Library (SGIPDL), currently under review by a task force of representatives from the International Searching Authorities, which could be of assistance when it becomes available.

“Value-added” Information Summary

Various patent information providers claim to add value to patent information in the following ways:

  • Some providers facilitate searching by inventors and assignees through normalisation of variations in a person’s or company’s name (e.g. CSIRO = C.S.I.R.O.) to a single standardised form.
  • Several sites use a common source of revised English titles and abstracts for patent documents in English as well as many other languages, on sale from Derwent. The Derwent titles and abstracts can be more informative than the originals, since the original titles are often minimal and the abstracts often don’t represent the claimed matter well.  Skilled users of IP data should not rely on title and abstract searches, but given that much searching is still performed on titles and abstracts rather than full text, this service to make at least some of them somewhat more informative can be viewed as adding some value.
  • ECLA and IPC classifications are technology-based and designed primarily for patent examiner needs, and services tend to use one or the other or both, although some services provide neither. Derwent provides an industry based classification system to complement them, intended to satisfy the needs of patent information users who deal extensively with industry classification, primarily sophisticated patent attorneys;  it seems to be used less often by examiners.  We found that none of the three classification systems would be used extensively by the public technology searcher because of the breadth of the categories;  full-text searching with user-configurable queries was preferred.
  • Some providers allow searching of patent and non-patent technical literature at the same site, though few do this in a very integrated manner because the non-patent literature does not have the same formalised metadata.12 As much non-patent literature is available via the Internet, some searchers use Google to find it, but a lack of user-configurable relevance ranking can lead to failure to find the most relevant prior art;  use of specialist non-patent literature compilations such as Medline is more likely to find material that those skilled in the art would identify as prior art.  For occasional searching it is not inconvenient to use separate specialist citation databases via internet links, but ideally APIs into specialist citation databases such as Medline would be developed.
  • Very few sites provide chemical and biological sequence searching (discussed in Ch. 2).
  • At this point, there is little access to traditional knowledge databases that may contain substantial prior art, although efforts to overcome this deficiency are being made at SIPO, the USPTO and WIPO.  Public good providers will be following this with the same degree of interest as pay providers, but may be met with less suspicion.
  • Several providers use enhanced family data and legal status, based on INPADOC but extracting or collating other information for presentation.  It is possible, for example, to cluster results by family using INPADOC and present this visually in a variety of ways.13
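The name normalisation mentioned in the first point of the list above can be sketched as follows. This is a minimal illustration assuming a very simple rule (remove punctuation, collapse spaces, upper-case the result); it is not any provider's actual method, and the function names are hypothetical. Real services would also need to handle spaced initials, abbreviations and legal suffixes, which this sketch does not attempt.

```python
import re
from collections import defaultdict


def normalise_name(name):
    """Reduce a name to an upper-case key without punctuation or extra spaces."""
    key = re.sub(r"[^\w\s]", "", name)       # drop punctuation such as full stops
    key = re.sub(r"\s+", " ", key).strip()   # collapse runs of whitespace
    return key.upper()


def group_by_normalised_name(names):
    """Group the original spellings under their standardised form."""
    groups = defaultdict(list)
    for name in names:
        groups[normalise_name(name)].append(name)
    return dict(groups)


variants = ["CSIRO", "C.S.I.R.O.", "Cambia", "CAMBIA"]
print(group_by_normalised_name(variants))
```

A search on the standardised form can then return documents filed under any of the grouped spellings.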

12 One approach to doing this, being explored by CAMBIA, would be to mark and link literature citations from within patent documents, which would be easiest for USPTO-sourced patent documents because of the standard field for such citations. Forward and backward navigation via citations can be very useful for freedom-to-operate identification of dominating patents, though it was not identified as crucial for examiners.

13 For example, CAMBIA uses tables identifying the priority documents cited.