TY - JOUR
T1 - Real-time computerized annotation of pictures
AU - Li, Jia
AU - Wang, James Z.
N1 - Funding Information:
The research is supported in part by the US National Science Foundation under Grants 0705210, 0219272, and 0202007. The authors thank Diane Flowers for providing manual evaluation of annotation results, Dhiraj Joshi for designing a Web-based manual evaluation system and for incorporating image data from collaborators and other Web sites, Walter Weiss for developing the initial keyword-based search and for maintaining many of our systems, David M. Pennock of Yahoo! for providing test images, Hongyuan Zha for useful discussions on optimization, and Takeo Kanade for encouragement. The authors would also like to acknowledge the comments and constructive suggestions from reviewers. J. Li developed the D2-clustering and the generalized mixture modeling algorithms. Both authors contributed to the design of the ALIPR system and conducted the experimental studies. An online demonstration of the work is provided at http://alipr.com. More information about the research is available at http://riemann.ist.psu.edu.
PY - 2008/6
Y1 - 2008/6
N2 - Developing effective methods for automated annotation of digital pictures continues to challenge computer scientists. The capability of annotating pictures by computers can lead to breakthroughs in a wide range of applications, including Web image search, online picture-sharing communities, and scientific experiments. In this work, the authors developed new optimization and estimation techniques to address two fundamental problems in machine learning. These new techniques serve as the basis for the Automatic Linguistic Indexing of Pictures - Real Time (ALIPR) system of fully automatic and high-speed annotation for online pictures. In particular, the D2-clustering method, in the same spirit as k-means for vectors, is developed to group objects represented by bags of weighted vectors. Moreover, a generalized mixture modeling technique (kernel smoothing as a special case) for non-vector data is developed using the novel concept of Hypothetical Local Mapping (HLM). ALIPR has been tested on thousands of pictures from an Internet photo-sharing site, unrelated to the source of those pictures used in the training process. Its performance has also been studied at an online demo site where arbitrary users provide pictures of their choice and indicate the correctness of each annotation word. The experimental results show that a single computer processor can suggest annotation terms in real time and with good accuracy.
UR - http://www.scopus.com/inward/record.url?scp=43249117136&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=43249117136&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2007.70847
DO - 10.1109/TPAMI.2007.70847
M3 - Article
C2 - 18421105
AN - SCOPUS:43249117136
SN - 0162-8828
VL - 30
SP - 985
EP - 1002
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 6
ER -