TY - GEN
T1 - Defeating tyranny of the masses in crowdsourcing
T2 - 4th International Conference on Decision and Game Theory for Security, GameSec 2013
AU - Kurve, Aditya
AU - Miller, David J.
AU - Kesidis, George
PY - 2013
Y1 - 2013
N2 - Crowdsourcing has emerged as a useful learning paradigm which allows us to instantly recruit workers on the web to solve large scale problems, such as quick annotation of image, web page, or document databases. Automated inference engines that fuse the answers or opinions from the crowd to make critical decisions are susceptible to unreliable, low-skilled and malicious workers who tend to mislead the system towards inaccurate inferences. We present a probabilistic generative framework to model worker responses for multicategory crowdsourcing tasks based on two novel paradigms. First, we decompose worker reliability into skill level and intention. Second, we introduce a stochastic model for answer generation that plausibly captures the interplay between worker skills, intentions, and task difficulties. This framework allows us to model and estimate a broad range of worker "types". A generalized Expectation Maximization algorithm is presented to jointly estimate the unknown ground truth answers along with worker and task parameters. As supported experimentally, the proposed scheme de-emphasizes answers from low skilled workers and leverages malicious workers to, in fact, improve crowd aggregation. Moreover, our approach is especially advantageous when there is an (a priori unknown) majority of low-skilled and/or malicious workers in the crowd.
AB - Crowdsourcing has emerged as a useful learning paradigm which allows us to instantly recruit workers on the web to solve large scale problems, such as quick annotation of image, web page, or document databases. Automated inference engines that fuse the answers or opinions from the crowd to make critical decisions are susceptible to unreliable, low-skilled and malicious workers who tend to mislead the system towards inaccurate inferences. We present a probabilistic generative framework to model worker responses for multicategory crowdsourcing tasks based on two novel paradigms. First, we decompose worker reliability into skill level and intention. Second, we introduce a stochastic model for answer generation that plausibly captures the interplay between worker skills, intentions, and task difficulties. This framework allows us to model and estimate a broad range of worker "types". A generalized Expectation Maximization algorithm is presented to jointly estimate the unknown ground truth answers along with worker and task parameters. As supported experimentally, the proposed scheme de-emphasizes answers from low skilled workers and leverages malicious workers to, in fact, improve crowd aggregation. Moreover, our approach is especially advantageous when there is an (a priori unknown) majority of low-skilled and/or malicious workers in the crowd.
UR - http://www.scopus.com/inward/record.url?scp=84893346449&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893346449&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-02786-9_9
DO - 10.1007/978-3-319-02786-9_9
M3 - Conference contribution
AN - SCOPUS:84893346449
SN - 9783319027852
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 140
EP - 153
BT - Decision and Game Theory for Security - 4th International Conference, GameSec 2013, Proceedings
PB - Springer Verlag
Y2 - 11 November 2013 through 12 November 2013
ER -