TY - JOUR
T1 - Computational linguistics for metadata building (CLiMB)
T2 - Using text mining for the automatic identification, categorization, and disambiguation of subject terms for image metadata
AU - Klavans, Judith L.
AU - Sheffield, Carolyn
AU - Abels, Eileen
AU - Lin, Jimmy
AU - Passonneau, Rebecca
AU - Sidhu, Tandeep
AU - Soergel, Dagobert
N1 - Funding Information:
This project, funded by the Andrew W. Mellon Foundation, was initiated at the Center for Research on Information Access at Columbia University and is currently based at the University of Maryland. Author affiliation: J. L. Klavans, C. Sheffield (*), J. Lin, T. Sidhu, D. Soergel, iSchool, University of Maryland, College Park, MD, USA.
Funding Information:
One such project, T3: Text, Tagging and Trust to Improve Image Access for Museums and Libraries, has just been funded by the Institute of Museum and Library Services (imls.gov).
PY - 2009/3
Y1 - 2009/3
N2 - In this paper, we present a system using computational linguistic techniques to extract metadata for image access. We discuss the implementation, functionality and evaluation of an image catalogers' toolkit, developed in the Computational Linguistics for Metadata Building (CLiMB) research project. We have tested components of the system, including phrase finding for the art and architecture domain, functional semantic labeling using machine learning, and disambiguation of terms in domain-specific text vis a vis a rich thesaurus of subject terms, geographic and artist names. We present specific results on disambiguation techniques and on the nature of the ambiguity problem given the thesaurus, resources, and domain-specific text resource, with a comparison of domain-general resources and text. Our primary user group for evaluation has been the cataloger expert with specific expertise in the fields of painting, sculpture, and vernacular and landscape architecture.
AB - In this paper, we present a system using computational linguistic techniques to extract metadata for image access. We discuss the implementation, functionality and evaluation of an image catalogers' toolkit, developed in the Computational Linguistics for Metadata Building (CLiMB) research project. We have tested components of the system, including phrase finding for the art and architecture domain, functional semantic labeling using machine learning, and disambiguation of terms in domain-specific text vis a vis a rich thesaurus of subject terms, geographic and artist names. We present specific results on disambiguation techniques and on the nature of the ambiguity problem given the thesaurus, resources, and domain-specific text resource, with a comparison of domain-general resources and text. Our primary user group for evaluation has been the cataloger expert with specific expertise in the fields of painting, sculpture, and vernacular and landscape architecture.
UR - http://www.scopus.com/inward/record.url?scp=59849109123&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=59849109123&partnerID=8YFLogxK
U2 - 10.1007/s11042-008-0253-9
DO - 10.1007/s11042-008-0253-9
M3 - Article
AN - SCOPUS:59849109123
SN - 1380-7501
VL - 42
SP - 115
EP - 138
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 1
ER -