TY - GEN
T1 - Deep multi-patch aggregation network for image style, aesthetics, and quality estimation
AU - Lu, Xin
AU - Lin, Zhe
AU - Shen, Xiaohui
AU - Mech, Radomir
AU - Wang, James
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/2/17
Y1 - 2015/2/17
N2 - This paper investigates problems of image style, aesthetics, and quality estimation, which require fine-grained details from high-resolution images, utilizing deep neural network training approach. Existing deep convolutional neural networks mostly extracted one patch such as a down-sized crop from each image as a training example. However, one patch may not always well represent the entire image, which may cause ambiguity during training. We propose a deep multi-patch aggregation network training approach, which allows us to train models using multiple patches generated from one image. We achieve this by constructing multiple, shared columns in the neural network and feeding multiple patches to each of the columns. More importantly, we propose two novel network layers (statistics and sorting) to support aggregation of those patches. The proposed deep multi-patch aggregation network integrates shared feature learning and aggregation function learning into a unified framework. We demonstrate the effectiveness of the deep multi-patch aggregation network on the three problems, i.e., image style recognition, aesthetic quality categorization, and image quality estimation. Our models trained using the proposed networks significantly outperformed the state of the art in all three applications.
AB - This paper investigates problems of image style, aesthetics, and quality estimation, which require fine-grained details from high-resolution images, utilizing deep neural network training approach. Existing deep convolutional neural networks mostly extracted one patch such as a down-sized crop from each image as a training example. However, one patch may not always well represent the entire image, which may cause ambiguity during training. We propose a deep multi-patch aggregation network training approach, which allows us to train models using multiple patches generated from one image. We achieve this by constructing multiple, shared columns in the neural network and feeding multiple patches to each of the columns. More importantly, we propose two novel network layers (statistics and sorting) to support aggregation of those patches. The proposed deep multi-patch aggregation network integrates shared feature learning and aggregation function learning into a unified framework. We demonstrate the effectiveness of the deep multi-patch aggregation network on the three problems, i.e., image style recognition, aesthetic quality categorization, and image quality estimation. Our models trained using the proposed networks significantly outperformed the state of the art in all three applications.
UR - https://www.scopus.com/pages/publications/84973904851
UR - https://www.scopus.com/pages/publications/84973904851#tab=citedBy
U2 - 10.1109/ICCV.2015.119
DO - 10.1109/ICCV.2015.119
M3 - Conference contribution
AN - SCOPUS:84973904851
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 990
EP - 998
BT - 2015 International Conference on Computer Vision, ICCV 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th IEEE International Conference on Computer Vision, ICCV 2015
Y2 - 11 December 2015 through 18 December 2015
ER -