TY - GEN
T1 - Estimation of discourse segmentation labels from crowd data
AU - Huang, Ziheng
AU - Zhong, Jialu
AU - Passonneau, Rebecca J.
N1 - Publisher Copyright:
© 2015 Association for Computational Linguistics.
PY - 2015
Y1 - 2015
AB - For annotation tasks involving independent judgments, probabilistic models have been used to infer ground truth labels from data where a crowd of many annotators labels the same items. Such models have been shown to produce results superior to taking the majority vote, but have not been applied to sequential data. We present two methods to infer ground truth labels from sequential annotations where we assume judgments are not independent, based on the observation that an annotator's segments all tend to be several utterances long. The data consists of crowd labels for annotation of discourse segment boundaries. The new methods extend Hidden Markov Models to relax the independence assumption. The two methods are distinct, so positive labels proposed by both are taken to be ground truth. In addition, results of the models are checked using metrics that test whether an annotator's accuracy relative to a given model remains consistent across different conversations.
UR - http://www.scopus.com/inward/record.url?scp=84959922810&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959922810&partnerID=8YFLogxK
U2 - 10.18653/v1/d15-1261
DO - 10.18653/v1/d15-1261
M3 - Conference contribution
AN - SCOPUS:84959922810
T3 - Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing
SP - 2190
EP - 2200
BT - Conference Proceedings - EMNLP 2015
PB - Association for Computational Linguistics (ACL)
T2 - Conference on Empirical Methods in Natural Language Processing, EMNLP 2015
Y2 - 17 September 2015 through 21 September 2015
ER -