TY - GEN
T1 - Functional semantic categories for art history text - Human labeling and preliminary machine learning
AU - Passonneau, Rebecca J.
AU - Yano, Tae
AU - Lippincott, Tom
AU - Klavans, Judith
PY - 2008
Y1 - 2008
N2 - The CLiMB project investigates semi-automatic methods to extract descriptive metadata from texts for indexing digital image collections. We developed a set of functional semantic categories to classify text extracts that describe images. Each semantic category names a functional relation between an image depicting a work of art historical significance, and expository text associated with the image. This includes description of the image, discussion of the historical context in which the work was created, and so on. We present interannotator agreement results on human classification of text extracts, and accuracy results from initial machine learning experiments. In our pilot studies, human agreement varied widely, depending on the labeler's expertise, the image-text pair under consideration, the number of labels that could be assigned to one text, and the type of training, if any, we gave labelers. Initial machine learning results indicate the three most relevant categories are machine learnable. Based on our pilot work, we implemented a labeling interface that we are currently using to collect a large dataset of text that will be used in training and testing machine classifiers.
AB - The CLiMB project investigates semi-automatic methods to extract descriptive metadata from texts for indexing digital image collections. We developed a set of functional semantic categories to classify text extracts that describe images. Each semantic category names a functional relation between an image depicting a work of art historical significance, and expository text associated with the image. This includes description of the image, discussion of the historical context in which the work was created, and so on. We present interannotator agreement results on human classification of text extracts, and accuracy results from initial machine learning experiments. In our pilot studies, human agreement varied widely, depending on the labeler's expertise, the image-text pair under consideration, the number of labels that could be assigned to one text, and the type of training, if any, we gave labelers. Initial machine learning results indicate the three most relevant categories are machine learnable. Based on our pilot work, we implemented a labeling interface that we are currently using to collect a large dataset of text that will be used in training and testing machine classifiers.
UR - http://www.scopus.com/inward/record.url?scp=57349131009&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=57349131009&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:57349131009
SN - 9789898111241
T3 - Proceedings of the 1st International Workshop on Metadata Mining for Image Understanding, MMIU 2008 - In Conjunction with VISIGRAPP 2008
SP - 13
EP - 22
BT - Proceedings of the 1st International Workshop on Metadata Mining for Image Understanding, MMIU 2008 - In Conjunction with VISIGRAPP 2008
T2 - 1st International Workshop on Metadata Mining for Image Understanding, MMIU 2008 - In Conjunction with VISIGRAPP 2008
Y2 - 22 January 2008 through 25 January 2008
ER -