TY - GEN
T1 - Joint image and text representation for aesthetics analysis
AU - Zhou, Ye
AU - Lu, Xin
AU - Zhang, Junping
AU - Wang, James Z.
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/10/1
Y1 - 2016/10/1
N2 - Image aesthetics assessment is essential to multimedia applications such as image retrieval and personalized image search and recommendation. Relying primarily on visual information and manually supplied ratings, previous studies in this area have not adequately utilized higher-level semantic information. We incorporate additional textual phrases from user comments to jointly represent image aesthetics using a multimodal Deep Boltzmann Machine. Given an image, without requiring any associated user comments, the proposed algorithm automatically infers the joint representation and predicts the aesthetics category of the image. We construct the AVA-Comments dataset to systematically evaluate the performance of the proposed algorithm. Experimental results indicate that the proposed joint representation improves the performance of aesthetics assessment on the benchmark AVA dataset compared with using only visual features.
AB - Image aesthetics assessment is essential to multimedia applications such as image retrieval and personalized image search and recommendation. Relying primarily on visual information and manually supplied ratings, previous studies in this area have not adequately utilized higher-level semantic information. We incorporate additional textual phrases from user comments to jointly represent image aesthetics using a multimodal Deep Boltzmann Machine. Given an image, without requiring any associated user comments, the proposed algorithm automatically infers the joint representation and predicts the aesthetics category of the image. We construct the AVA-Comments dataset to systematically evaluate the performance of the proposed algorithm. Experimental results indicate that the proposed joint representation improves the performance of aesthetics assessment on the benchmark AVA dataset compared with using only visual features.
UR - http://www.scopus.com/inward/record.url?scp=84994588706&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994588706&partnerID=8YFLogxK
U2 - 10.1145/2964284.2967223
DO - 10.1145/2964284.2967223
M3 - Conference contribution
AN - SCOPUS:84994588706
T3 - MM 2016 - Proceedings of the 2016 ACM Multimedia Conference
SP - 262
EP - 266
BT - MM 2016 - Proceedings of the 2016 ACM Multimedia Conference
PB - Association for Computing Machinery, Inc
T2 - 24th ACM Multimedia Conference, MM 2016
Y2 - 15 October 2016 through 19 October 2016
ER -