Abstract
Extractive multi-document summarization is mostly treated as a sentence ranking problem. Existing graph-based ranking methods for key-sentence extraction usually compute a global importance score for each sentence under a single relation. Motivated by the fact that both documents and sentences can be represented as mixtures of semantic topics detected by Latent Dirichlet Allocation (LDA), we propose SentTopic-MultiRank, a novel ranking model for multi-document summarization. The model treats the topics as heterogeneous relations and the sentence connections under those topics as a heterogeneous network, in which sentences and topics (relations) are linked together. The iterative MultiRank algorithm is then applied to determine the importance of sentences and topics simultaneously. Experimental results demonstrate that the model improves performance on both generic and query-biased multi-document summarization tasks.
Original language | English (US) |
---|---|
Pages | 2977-2992 |
Number of pages | 16 |
State | Published - 2012 |
Event | 24th International Conference on Computational Linguistics, COLING 2012, Mumbai, India (Dec 8 2012 → Dec 15 2012) |
All Science Journal Classification (ASJC) codes
- Computational Theory and Mathematics
- Language and Linguistics
- Linguistics and Language