WikiWrite: Generating wikipedia articles automatically

Siddhartha Banerjee, Prasenjit Mitra

Research output: Contribution to journal › Conference article › peer-review

17 Scopus citations

Abstract

The growth of Wikipedia, limited by the availability of knowledgeable authors, cannot keep pace with the ever-increasing requirements and demands of the readers. In this work, we propose WikiWrite, a system capable of generating content for new Wikipedia articles automatically. First, our technique obtains feature representations of entities on Wikipedia. We adapt an existing work on document embeddings to obtain vector representations of words and paragraphs. Using the representations, we identify articles that are very similar to the new entity on Wikipedia. We train machine learning classifiers using content from the similar articles to assign web-retrieved content on the new entity into relevant sections in the Wikipedia article. Second, we propose a novel abstractive summarization technique that uses a two-step integer linear programming (ILP) model to synthesize the assigned content in each section and rewrite the content to produce a well-formed informative summary. Our experiments show that our technique is able to reconstruct existing articles in Wikipedia with high accuracies. We also create several articles using our approach in the English Wikipedia, most of which have been retained in the online encyclopedia.
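The similar-article step described in the abstract can be illustrated with a minimal sketch. It assumes that each article and the new entity are already represented as fixed-length vectors and that similarity is measured by cosine similarity. The abstract does not state which similarity measure the authors use, so cosine similarity is an assumption here, and the function names and toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(query_vec, article_vecs, k=2):
    """Return the k article titles whose vectors are closest to query_vec."""
    ranked = sorted(
        article_vecs,
        key=lambda title: cosine(query_vec, article_vecs[title]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy embeddings; real ones would come from a trained
# document-embedding model, which is not reproduced here.
articles = {
    "Article A": [0.9, 0.1, 0.0],
    "Article B": [0.8, 0.2, 0.1],
    "Article C": [0.0, 0.1, 0.9],
}
new_entity = [1.0, 0.0, 0.0]

similar = most_similar(new_entity, articles, k=2)
print(similar)
```

In a pipeline like the one described, the sections of the retrieved articles would then supply training data for the section classifiers; that step and the ILP summarizer are outside the scope of this sketch.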

Original language: English (US)
Pages (from-to): 2740-2746
Number of pages: 7
Journal: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2016-January
State: Published - Jan 1 2016
Event: 25th International Joint Conference on Artificial Intelligence, IJCAI 2016 - New York, United States
Duration: Jul 9 2016 → Jul 15 2016

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
