The goal of this workshop is to foster research and development of the technology for patent corpus processing, by providing a forum in which researchers and practitioners can exchange and share their ideas, approaches, perspectives, and experiences from their work in progress.
The processing of intellectual property (IP) documents, including patents, is important in the scientific, business, and legal communities. Most work on patent and IP processing has come from the database and information retrieval communities, with little from the computational linguistics (CL) and natural language processing (NLP) communities.
In 2000, the ACM SIGIR 2000 Workshop on Patent Retrieval, the first of its kind, was held. At this workshop, patent retrieval systems in use at the EPO (European Patent Office) and JAPIO (Japanese Patent Information Organization) were introduced, and a number of issues related to patent retrieval (e.g., producing ontologies, cross-language retrieval, and evaluation methods) were raised and discussed.
In 2001-2002, the NTCIR workshop (the National Institute of Informatics, Japan), a TREC-style evaluation forum for research and development on IR/NLP, held its first patent retrieval task. Two years of Japanese patents (approximately 7M documents published in 1998-1999; 18GB) were used to evaluate monolingual and cross-lingual patent retrieval systems. In addition, approximately 17M Japanese/English parallel patent abstracts were used to evaluate the effectiveness of extracting translation lexicons.
Patent corpora have a number of interesting characteristics, and various CL/NLP techniques show promise for improving the quality of patent processing.
9:10-9:20 Welcome
9:20-9:50 A Patent Document Retrieval System Addressing Both Semantic and Syntactic Properties
Liang Chen, Naoyuki Tokuda and Hisahiro Adachi
9:50-10:20 Intelligent patent analysis through the use of a neural network: experiment of multi-viewpoint analysis with the MultiSOM model
Jean-Charles Lamirel, Shadi Al Shehabi, Martial Hoffmann and Claire Francois
10:20-10:40 Break
10:40-11:10 Overview of Patent Retrieval Task at NTCIR-3
Makoto Iwayama, Atsushi Fujii, Noriko Kando and Akihiko Takano
11:10-11:40 Pseudo Relevance Feedback Method based on Taylor Expansion of Retrieval Function in NTCIR-3 Patent Retrieval Task
Kazuaki Kishida
11:40-12:10 Term Distillation in Patent Retrieval
Hideo Itoh, Hiroko Mano and Yasushi Ogawa
12:10-13:30 Lunch
13:30-14:00 Can Text Analysis Tell us Something about Technology Progress?
Khurshid Ahmad and AbdulMohsen Al-Thubaity
14:00-14:30 Patent Claim Processing for Readability - Structure Analysis and Term Explanation -
Akihiro Shinmori, Manabu Okumura, Yuzo Marukawa and Makoto Iwayama
14:30-15:00 Natural Language Analysis of Patent Claims
Svetlana Sheremetyeva
15:00-15:30 Wrap-up
Papers should be submitted electronically in PostScript or PDF format to: fujii@slis.tsukuba.ac.jp. Submissions should conform to the two-column format of ACL proceedings and should not exceed eight (8) pages, including references. We strongly recommend the use of the ACL-2003 style files, which are available from the ACL-2003 website.
The subject line of the submission email should be "ACL2003 WORKSHOP PAPER SUBMISSION". As reviewing will be blind, the body of the paper should not include the names or affiliations of the authors. The following identification information should be sent in a separate email with the subject line "ACL2003 WORKSHOP ID PAGE":
Title: title of paper
Authors: list of all authors
Keywords: up to five topic keywords
Contact author: email address of author of record (for correspondence)
Abstract: abstract of paper (not more than 5 lines)
Notification of receipt will be emailed to the contact author.
Submission deadline: 18 April 2003
Acceptance notification: 12 May 2003
Final version deadline: 30 May 2003
Workshop date: 12 July 2003
Makoto Iwayama Tokyo Institute of Technology / Hitachi Ltd., Japan
Atsushi Fujii University of Tsukuba, Japan
Aitao Chen University of California at Berkeley, USA
Hsin-Hsi Chen National Taiwan University, Taiwan
Sumio Fujita Patolis Corporation, Japan
Fredric Gey University of California at Berkeley, USA
Preben Hansen Swedish Institute of Computer Science, Sweden
Toshihiro Kamishima National Institute of Advanced Industrial Science and Technology, Japan
Noriko Kando National Institute of Informatics, Japan
Jong-Hyeok Lee Pohang University of Science & Technology, Korea
Mun-Kew Leong Institute for Infocomm Research, Singapore
Liz Liddy Syracuse University, USA
Isabelle Moulinier Thomson Legal & Regulatory, R & D Group, USA
Manabu Okumura Tokyo Institute of Technology, Japan
Paul Thompson Dartmouth College, USA