Sense-aware semantic analysis: A multi-prototype word representation model using Wikipedia

Zhaohui Wu, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

32 Scopus citations

Abstract

Human languages are naturally ambiguous, which makes it difficult to automatically understand the semantics of text. Most vector space models (VSMs) treat all occurrences of a word as the same and build a single vector to represent its meaning, which fails to capture any ambiguity. We present sense-aware semantic analysis (SaSA), a multi-prototype VSM for word representation based on Wikipedia, which can account for homonymy and polysemy. The "sense-specific" prototypes of a word are produced by clustering Wikipedia pages based on both the local and global contexts of the word in Wikipedia. Experimental evaluation on semantic relatedness, for both isolated words and words in sentential contexts, and on word sense induction demonstrates its effectiveness.
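The multi-prototype idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: this record does not describe the features or clustering procedure used in the paper, so the sketch assumes plain bag-of-words context vectors and a small k-means-style loop with cosine similarity in place of Wikipedia-based representations. The function names (`sense_prototypes`, `nearest_sense`) and the toy contexts are hypothetical.

```python
from collections import Counter
import math


def bow(text):
    """Bag-of-words vector for a context string (lowercased, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def summed(vectors):
    """Sum of the vectors; cosine ignores scale, so this stands in for the mean."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def sense_prototypes(contexts, k, iterations=10):
    """Cluster the contexts of one word into k groups; return one prototype per group.

    Assumption: simple k-means-style clustering, not the paper's procedure.
    """
    vectors = [bow(c) for c in contexts]
    # Deterministic farthest-first initialisation: start from the first context,
    # then repeatedly add the context least similar to the centers chosen so far.
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(min(vectors, key=lambda v: max(cosine(v, c) for c in centers)))
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            best = max(range(k), key=lambda i: cosine(v, centers[i]))
            groups[best].append(v)
        # Keep the previous center if a group ends up empty.
        centers = [summed(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers


def nearest_sense(prototypes, context):
    """Index of the prototype most similar to a new context of the word."""
    v = bow(context)
    return max(range(len(prototypes)), key=lambda i: cosine(v, prototypes[i]))


# Toy contexts for the ambiguous word "bank" (illustrative data only).
contexts = [
    "bank loan money deposit interest",
    "bank account money credit loan",
    "river bank water shore fishing",
    "bank of the river water flood",
]
prototypes = sense_prototypes(contexts, k=2)
print(nearest_sense(prototypes, "money deposit at the bank"))
print(nearest_sense(prototypes, "fishing by the river"))
```

A single-prototype model would merge all four contexts into one vector for "bank"; keeping one prototype per cluster lets a new occurrence be matched against the sense whose contexts it most resembles.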

Original language: English (US)
Title of host publication: Proceedings of the 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015
Publisher: AI Access Foundation
Pages: 2188-2194
Number of pages: 7
ISBN (Electronic): 9781577357018
State: Published - Jun 1 2015
Event: 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015 - Austin, United States
Duration: Jan 25 2015 → Jan 30 2015

Publication series

Name: Proceedings of the National Conference on Artificial Intelligence
Volume: 3

Other

Other: 29th AAAI Conference on Artificial Intelligence, AAAI 2015 and the 27th Innovative Applications of Artificial Intelligence Conference, IAAI 2015
Country/Territory: United States
City: Austin
Period: 1/25/15 → 1/30/15

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
