TY - GEN
T1 - The MASC word sense sentence corpus
AU - Passonneau, Rebecca J.
AU - Baker, Collin
AU - Fellbaum, Christiane
AU - Ide, Nancy
N1 - Funding Information: This work was supported by National Science Foundation grants CRI-0708952 and CRI-1059312.
PY - 2012
Y1 - 2012
N2 - The MASC project has produced a multi-genre corpus with multiple layers of linguistic annotation, together with a sentence corpus containing WordNet 3.1 sense tags for 1000 occurrences of each of 100 words produced by multiple annotators, accompanied by in-depth inter-annotator agreement data. Here we give an overview of the contents of MASC and then focus on the word sense sentence corpus, describing the characteristics that differentiate it from other word sense corpora and detailing the inter-annotator agreement studies that have been performed on the annotations. Finally, we discuss the potential to grow the word sense sentence corpus through crowdsourcing and the plan to enhance the content and annotations of MASC through a community-based collaborative effort.
AB - The MASC project has produced a multi-genre corpus with multiple layers of linguistic annotation, together with a sentence corpus containing WordNet 3.1 sense tags for 1000 occurrences of each of 100 words produced by multiple annotators, accompanied by in-depth inter-annotator agreement data. Here we give an overview of the contents of MASC and then focus on the word sense sentence corpus, describing the characteristics that differentiate it from other word sense corpora and detailing the inter-annotator agreement studies that have been performed on the annotations. Finally, we discuss the potential to grow the word sense sentence corpus through crowdsourcing and the plan to enhance the content and annotations of MASC through a community-based collaborative effort.
UR - http://www.scopus.com/inward/record.url?scp=84906806081&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84906806081&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84906806081
T3 - Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012
SP - 3025
EP - 3030
BT - Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012
A2 - Dogan, Mehmet Ugur
A2 - Mariani, Joseph
A2 - Moreno, Asuncion
A2 - Goggi, Sara
A2 - Choukri, Khalid
A2 - Calzolari, Nicoletta
A2 - Odijk, Jan
A2 - Declerck, Thierry
A2 - Maegaard, Bente
A2 - Piperidis, Stelios
A2 - Mazo, Helene
A2 - Hamon, Olivier
PB - European Language Resources Association (ELRA)
T2 - 8th International Conference on Language Resources and Evaluation, LREC 2012
Y2 - 21 May 2012 through 27 May 2012
ER -