TY - GEN
T1 - Playscript classification and automatic Wikipedia play articles generation
AU - Banerjee, Siddhartha
AU - Caragea, Cornelia
AU - Mitra, Prasenjit
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/12/4
Y1 - 2014/12/4
AB - In this work, we aim to create Wikipedia pages on plays automatically by extracting relevant information from various web sources. Our approach involves building an efficient classifier that can classify web documents as play scripts. From the set of correctly classified instances of play scripts, we extract relevant play-related information from the documents and use it to obtain additional information from various sources on the web. This information is aggregated, and human-readable Wikipedia pages are created using a bot. The results of our experiments show that classifiers trained by combining our designed features with 'bag-of-words' (BoW) features outperform classifiers trained using only BoW features. Our approach further shows that good-quality human-readable pages can be created using our bot. Such an automatic page generation process can eventually ensure a more complete Wikipedia.
UR - http://www.scopus.com/inward/record.url?scp=84919933198&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84919933198&partnerID=8YFLogxK
U2 - 10.1109/ICPR.2014.624
DO - 10.1109/ICPR.2014.624
M3 - Conference contribution
AN - SCOPUS:84919933198
T3 - Proceedings - International Conference on Pattern Recognition
SP - 3630
EP - 3635
BT - Proceedings - International Conference on Pattern Recognition
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 22nd International Conference on Pattern Recognition, ICPR 2014
Y2 - 24 August 2014 through 28 August 2014
ER -