The MASC word sense sentence corpus

Rebecca J. Passonneau, Collin Baker, Christiane Fellbaum, Nancy Ide

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

26 Scopus citations

Abstract

The MASC project has produced a multi-genre corpus with multiple layers of linguistic annotation, together with a sentence corpus containing WordNet 3.1 sense tags for 1000 occurrences of each of 100 words produced by multiple annotators, accompanied by in-depth inter-annotator agreement data. Here we give an overview of the contents of MASC and then focus on the word sense sentence corpus, describing the characteristics that differentiate it from other word sense corpora and detailing the inter-annotator agreement studies that have been performed on the annotations. Finally, we discuss the potential to grow the word sense sentence corpus through crowdsourcing and the plan to enhance the content and annotations of MASC through a community-based collaborative effort.

Original language: English (US)
Title of host publication: Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012
Editors: Mehmet Ugur Dogan, Joseph Mariani, Asuncion Moreno, Sara Goggi, Khalid Choukri, Nicoletta Calzolari, Jan Odijk, Thierry Declerck, Bente Maegaard, Stelios Piperidis, Helene Mazo, Olivier Hamon
Publisher: European Language Resources Association (ELRA)
Pages: 3025-3030
Number of pages: 6
ISBN (Electronic): 9782951740877
State: Published - 2012
Event: 8th International Conference on Language Resources and Evaluation, LREC 2012 - Istanbul, Turkey
Duration: May 21 2012 to May 27 2012

Publication series

Name: Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012

Other

Other: 8th International Conference on Language Resources and Evaluation, LREC 2012
Country/Territory: Turkey
City: Istanbul
Period: 5/21/12 to 5/27/12

All Science Journal Classification (ASJC) codes

  • Linguistics and Language
  • Language and Linguistics
  • Education
  • Library and Information Sciences
