SEERLAB: A system for extracting key phrases from scholarly documents

Pucktada Treeratpituk, Pradeep Teregowda, Jian Huang, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

18 Scopus citations

Abstract

We describe the SEERLAB system, which participated in the SemEval 2010 Keyphrase Extraction Task. SEERLAB uses the DBLP corpus to generate a set of candidate keyphrases from a document. Random Forest, a supervised ensemble classifier, is then used to select the top keyphrases from the candidate set. SEERLAB achieved an F-score of 0.24 in generating the top 15 keyphrases, which places it sixth among 19 participating systems. SEERLAB performed particularly well in generating the top 5 keyphrases, with an F-score that ranked third.
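To make the top-15 and top-5 figures above concrete, the following is a minimal sketch of how an F-score over the top k keyphrases of a single document can be computed. It is an illustration only, not the task's official scorer: the function name `f_score_at_k`, the lowercase exact-match comparison, and the sample phrases are all assumptions made for this example, and the abstract does not say how matching or averaging across documents was done.

```python
def f_score_at_k(predicted, gold, k):
    """Return (precision, recall, F1) for the top-k predictions of one document.

    Hypothetical helper: matching is a plain lowercase exact match, which is
    an assumption of this sketch rather than a detail taken from the paper.
    """
    # Keep the k highest-ranked predictions, normalised and de-duplicated.
    top_k = {p.lower().strip() for p in predicted[:k]}
    gold_set = {g.lower().strip() for g in gold}

    correct = len(top_k & gold_set)
    precision = correct / len(top_k) if top_k else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up ranked predictions and gold keyphrases, for illustration only.
predicted = [
    "keyphrase extraction",
    "random forest",
    "dblp",
    "semantic evaluation",
    "neural network",
]
gold = ["keyphrase extraction", "random forest", "scholarly documents", "dblp"]

precision, recall, f1 = f_score_at_k(predicted, gold, k=5)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this made-up case three of the five predictions match the four gold phrases, so precision is 3/5 and recall is 3/4. Lowering k typically trades recall for precision, which is why a system can rank differently at top 5 than at top 15.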

Original language: English (US)
Title of host publication: ACL 2010 - SemEval 2010 - 5th International Workshop on Semantic Evaluation, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 182-185
Number of pages: 4
ISBN (Electronic): 1932432701, 9781932432701
State: Published - 2010
Event: 5th International Workshop on Semantic Evaluation, SemEval 2010 - Uppsala, Sweden
Duration: Jul 15 2010 to Jul 16 2010

Publication series

Name: ACL 2010 - SemEval 2010 - 5th International Workshop on Semantic Evaluation, Proceedings

Other

Other: 5th International Workshop on Semantic Evaluation, SemEval 2010
Country/Territory: Sweden
City: Uppsala
Period: 7/15/10 to 7/16/10

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Theoretical Computer Science
