Phrase pair classification for identifying subtopics

Sujatha Das, Prasenjit Mitra, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

Automatic identification of subtopics for a given topic is desirable because it eliminates the need for manual construction of domain-specific topic hierarchies. In this paper, we use features based on corpus statistics to build a classifier that identifies (subtopic, topic) links between phrase pairs. We combine these features with commonly used syntactic patterns to classify phrase pairs from datasets in Computer Science and WordNet. In addition, we show a novel application of our is-a-subtopic-of classifier to query expansion in Expert Search and compare it with pseudo-relevance feedback.
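
The abstract does not list the features themselves, so the following is only an illustrative sketch of what corpus-statistic and syntactic-pattern features for a (subtopic, topic) phrase pair might look like. The function name pair_features, the conditional document-frequency ratios, and the single "such as" pattern are all assumptions made for this example and are not taken from the paper.

```python
import re
from typing import Dict, List


def pair_features(subtopic: str, topic: str, docs: List[str]) -> Dict[str, float]:
    """Hypothetical features for a candidate (subtopic, topic) phrase pair.

    These are illustrative stand-ins, not the features used in the paper.
    """
    sub, top = subtopic.lower(), topic.lower()
    lowered = [d.lower() for d in docs]

    # Corpus statistics: document frequencies by plain substring matching.
    df_sub = sum(sub in d for d in lowered)
    df_top = sum(top in d for d in lowered)
    df_both = sum(sub in d and top in d for d in lowered)

    # Syntactic pattern (assumed): "<topic> [one word] such as <subtopic>".
    pattern = re.compile(
        re.escape(top) + r"(?:\s+\w+)?\s+such as\s+" + re.escape(sub)
    )
    pattern_hit = any(pattern.search(d) for d in lowered)

    return {
        # Fraction of documents mentioning the subtopic that also mention the topic.
        "p_topic_given_sub": df_both / df_sub if df_sub else 0.0,
        # Fraction of documents mentioning the topic that also mention the subtopic.
        "p_sub_given_topic": df_both / df_top if df_top else 0.0,
        # 1.0 if the "such as" pattern fires anywhere in the corpus.
        "such_as_pattern": 1.0 if pattern_hit else 0.0,
    }
```

Feature dictionaries of this kind could be passed to any standard binary classifier. The asymmetry between the two conditional ratios is one plausible signal for the direction of the link, since a subtopic would be expected to co-occur with its topic more often than the reverse.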

Original language: English (US)
Title of host publication: Advances in Information Retrieval - 34th European Conference on IR Research, ECIR 2012, Proceedings
Pages: 489-493
Number of pages: 5
State: Published - 2012
Event: 34th European Conference on Information Retrieval, ECIR 2012 - Barcelona, Spain
Duration: Apr 1 2012 → Apr 5 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7224 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 34th European Conference on Information Retrieval, ECIR 2012
Country/Territory: Spain
City: Barcelona
Period: 4/1/12 → 4/5/12

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
