Autonomous citation matching

Steve Lawrence, C. Lee Giles, Kurt D. Bollacker

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

71 Scopus citations

Abstract

Scientific literature is increasingly becoming available on the World Wide Web. This paper considers the matching of citations found in different papers in order to autonomously construct a citation index from papers in electronic format. Citation indices of scientific literature have traditionally been constructed manually, partly because it can be difficult to autonomously determine if two citations refer to the same paper (citations can be written in many different formats). We present four algorithms for autonomous citation matching. The algorithms are based on edit-distance computation, word matching, word and phrase matching, and subfield extraction. The word and phrase matching algorithm obtains the lowest error rate, and the subfield algorithm is the most computationally efficient. We quantitatively compare the accuracy and efficiency of the algorithms on a number of datasets.
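As a rough illustration of the edit-distance approach mentioned in the abstract, the sketch below groups citation strings whose normalized Levenshtein distance falls under a threshold. It is not the authors' algorithm: the function names (`edit_distance`, `normalize`, `same_paper`, `group_citations`), the normalization step, the greedy grouping strategy, and the 0.3 threshold are all assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the standard two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def normalize(citation: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace (assumed preprocessing)."""
    kept = "".join(ch.lower() if ch.isalnum() else " " for ch in citation)
    return " ".join(kept.split())


def same_paper(c1: str, c2: str, threshold: float = 0.3) -> bool:
    """Treat two citations as the same paper if their length-normalized
    edit distance is at most `threshold` (a hypothetical cutoff)."""
    a, b = normalize(c1), normalize(c2)
    longest = max(len(a), len(b))
    if longest == 0:
        return True
    return edit_distance(a, b) / longest <= threshold


def group_citations(citations, threshold: float = 0.3):
    """Greedy grouping: attach each citation to the first group whose
    representative (first member) it matches, else start a new group."""
    groups = []
    for c in citations:
        for g in groups:
            if same_paper(c, g[0], threshold):
                g.append(c)
                break
        else:
            groups.append([c])
    return groups
```

Calling `group_citations` on a list of raw citation strings returns a list of groups, each intended to hold variants of one cited paper. Whole-string edit distance is sensitive to reordered fields such as author names, which is one plausible reason the abstract reports word and phrase matching as more accurate.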

Original language: English (US)
Title of host publication: Proceedings of the International Conference on Autonomous Agents
Pages: 392-393
Number of pages: 2
State: Published - 1999
Event: Proceedings of the 1999 3rd International Conference on Autonomous Agents - Seattle, WA, USA
Duration: May 1 1999 to May 5 1999

Other

Other: Proceedings of the 1999 3rd International Conference on Autonomous Agents
City: Seattle, WA, USA
Period: 5/1/99 to 5/5/99

All Science Journal Classification (ASJC) codes

  • General Engineering
