Comparative study of name disambiguation problem using a scalable blocking-based framework

Byung Won On, Jaewoo Kang, Dongwon Lee, Prasenjit Mitra

Research output: Contribution to journal › Conference article › peer-review

87 Scopus citations

Abstract

In this paper, we consider the problem of ambiguous author names in bibliographic citations, and comparatively study alternative approaches to identify and correct such name variants (e.g., "Vannevar Bush" and "V. Bush"). Our study is based on a scalable two-step framework, where step 1 substantially reduces the number of candidates via blocking, and step 2 measures the distance between two names via coauthor information. Combining four blocking methods and seven distance measures on four data sets, we present extensive experimental results, and identify combinations that are scalable and effective for disambiguating author names in citations.
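
The two-step framework described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the four blocking methods or the seven distance measures, so the blocking key (first initial plus last name) and the Jaccard distance over coauthor sets are assumptions chosen for simplicity. The function names, the threshold, and the toy data are likewise hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def block_key(name):
    """Hypothetical blocking key: first initial plus lowercased last token."""
    tokens = name.replace(".", " ").split()
    return (tokens[0][0].lower(), tokens[-1].lower())


def jaccard_distance(a, b):
    """1 - |intersection| / |union|; taken as 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def candidate_variants(coauthors, threshold=0.5):
    """Step 1: group names into blocks. Step 2: compare only within a block."""
    blocks = defaultdict(list)
    for name in coauthors:
        blocks[block_key(name)].append(name)

    pairs = []
    for names in blocks.values():
        for a, b in combinations(sorted(names), 2):
            d = jaccard_distance(coauthors[a], coauthors[b])
            if d <= threshold:  # assumed cutoff, not taken from the paper
                pairs.append((a, b, d))
    return pairs


# Toy data invented for illustration: author name -> set of coauthor names.
coauthors = {
    "Jane Smith": {"A. Lee", "B. Chen", "C. Diaz"},
    "J. Smith": {"A. Lee", "B. Chen"},
    "John Smith": {"X. Wu"},
    "Jane Smyth": {"A. Lee", "B. Chen", "C. Diaz"},
}
result = candidate_variants(coauthors)
```

In this toy run, only "J. Smith" and "Jane Smith" are reported as likely variants. "Jane Smyth" falls into a different block and is never compared, even though its coauthor set is identical, which shows the kind of trade-off between scalability and effectiveness that a choice of blocking method involves.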

Original language: English (US)
Pages (from-to): 344-353
Number of pages: 10
Journal: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
State: Published - 2005
Event: 5th ACM/IEEE Joint Conference on Digital Libraries - Digital Libraries: Cyberinfrastructure for Research and Education - Denver, CO, United States
Duration: Jun 7 2005 → Jun 11 2005

All Science Journal Classification (ASJC) codes

  • General Engineering
