Bi-LSTM-CRF sequence labeling for keyphrase extraction from scholarly documents

Rabah A. Al-Zaidy, Cornelia Caragea, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

164 Scopus citations

Abstract

In this paper, we address the keyphrase extraction problem as sequence labeling and propose a model that jointly exploits the complementary strengths of Conditional Random Fields, which capture label dependencies through a transition parameter matrix of transition probabilities from each label to its neighboring label, and Bidirectional Long Short-Term Memory networks, which capture hidden semantics in text through long-distance dependencies. Our results on three datasets of scholarly documents show that the proposed model substantially outperforms strong baselines and previous approaches for keyphrase extraction.
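The way a transition matrix interacts with per-token scores can be illustrated with a minimal Viterbi decoding sketch. This is not the authors' implementation: the label set (`O`, `B-KP`, `I-KP`), the helper names, and all numeric scores below are hypothetical, and the emission scores are hand-written stand-ins for what a Bi-LSTM would produce.

```python
# Hypothetical label set; the abstract does not state the tagging scheme.
LABELS = ["O", "B-KP", "I-KP"]


def viterbi_decode(emissions, transitions):
    """Return the highest-scoring sequence of label indices.

    emissions[t][j]   : score of label j at token t (made-up values here;
                        in a Bi-LSTM-CRF they would come from the Bi-LSTM).
    transitions[i][j] : score of moving from label i to label j
                        (the CRF transition matrix).
    """
    n_labels = len(transitions)
    # best[j] = best score of any path that ends in label j at the current token
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(n_labels):
            # Choose the previous label that maximizes the path score into j.
            prev = max(range(n_labels), key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label.
    path = [max(range(n_labels), key=lambda j: best[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def extract_phrases(tokens, label_ids):
    """Join each B-KP token with the I-KP tokens that directly follow it."""
    phrases, current = [], []
    for token, idx in zip(tokens, label_ids):
        label = LABELS[idx]
        if label == "B-KP":
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif label == "I-KP" and current:
            current.append(token)
        else:
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


tokens = ["we", "use", "conditional", "random", "fields"]
# Made-up emission scores, columns ordered as LABELS.
emissions = [
    [2.0, 0.1, 0.0],
    [2.0, 0.2, 0.0],
    [0.5, 1.0, 1.2],  # taken alone, this token would be tagged I-KP
    [0.3, 0.4, 1.5],
    [0.4, 0.3, 1.4],
]
# Made-up transition scores; O -> I-KP is heavily penalized.
transitions = [
    [0.5, 0.0, -10.0],  # from O
    [-0.5, -1.0, 1.0],  # from B-KP
    [0.0, -1.0, 0.8],   # from I-KP
]

path = viterbi_decode(emissions, transitions)
phrases = extract_phrases(tokens, path)
print([LABELS[i] for i in path])  # ['O', 'O', 'B-KP', 'I-KP', 'I-KP']
print(phrases)                    # ['conditional random fields']
```

With these illustrative numbers, a per-token argmax would tag "conditional" as `I-KP` directly after `O`, which is not a valid phrase start. The penalty on the `O` to `I-KP` transition steers the decoder to `B-KP` instead, which is the kind of label dependency the abstract attributes to the CRF layer.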

Original language: English (US)
Title of host publication: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019
Publisher: Association for Computing Machinery, Inc
Pages: 2551-2557
Number of pages: 7
ISBN (Electronic): 9781450366748
State: Published - May 13 2019
Event: 2019 World Wide Web Conference, WWW 2019 - San Francisco, United States
Duration: May 13 2019 to May 17 2019

Publication series

Name: The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019

Conference

Conference: 2019 World Wide Web Conference, WWW 2019
Country/Territory: United States
City: San Francisco
Period: 5/13/19 to 5/17/19

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Software
