TY - GEN
T1 - DRN: A Deep Reinforcement Learning Framework for News Recommendation
T2 - 27th International World Wide Web Conference, WWW 2018
AU - Zheng, Guanjie
AU - Zhang, Fuzheng
AU - Zheng, Zihan
AU - Xiang, Yang
AU - Yuan, Nicholas Jing
AU - Xie, Xing
AU - Li, Zhenhui
N1 - Funding Information:
The work was supported in part by NSF awards #1639150, #1544455, #1652525, and #1618448. The views and conclusions contained in this paper are those of the authors and should not be interpreted as representing any funding agencies.
Publisher Copyright:
© 2018 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC BY 4.0 License.
PY - 2018/4/10
Y1 - 2018/4/10
N2 - In this paper, we propose a novel Deep Reinforcement Learning framework for news recommendation. Online personalized news recommendation is a highly challenging problem due to the dynamic nature of news features and user preferences. Although some online recommendation models have been proposed to address the dynamic nature of news recommendation, these methods have three major issues. First, they model only the current reward (e.g., Click-Through Rate). Second, very few studies consider using user feedback other than click/no-click labels (e.g., how frequently a user returns) to help improve recommendation. Third, these methods tend to keep recommending similar news to users, which may cause users to get bored. Therefore, to address the aforementioned challenges, we propose a Deep Q-Learning based recommendation framework, which can model future reward explicitly. We further consider the user return pattern as a supplement to the click/no-click label in order to capture more user feedback information. In addition, an effective exploration strategy is incorporated to find new attractive news for users. Extensive experiments conducted on an offline dataset and in the online production environment of a commercial news recommendation application have shown the superior performance of our methods.
AB - In this paper, we propose a novel Deep Reinforcement Learning framework for news recommendation. Online personalized news recommendation is a highly challenging problem due to the dynamic nature of news features and user preferences. Although some online recommendation models have been proposed to address the dynamic nature of news recommendation, these methods have three major issues. First, they model only the current reward (e.g., Click-Through Rate). Second, very few studies consider using user feedback other than click/no-click labels (e.g., how frequently a user returns) to help improve recommendation. Third, these methods tend to keep recommending similar news to users, which may cause users to get bored. Therefore, to address the aforementioned challenges, we propose a Deep Q-Learning based recommendation framework, which can model future reward explicitly. We further consider the user return pattern as a supplement to the click/no-click label in order to capture more user feedback information. In addition, an effective exploration strategy is incorporated to find new attractive news for users. Extensive experiments conducted on an offline dataset and in the online production environment of a commercial news recommendation application have shown the superior performance of our methods.
UR - http://www.scopus.com/inward/record.url?scp=85085176840&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085176840&partnerID=8YFLogxK
U2 - 10.1145/3178876.3185994
DO - 10.1145/3178876.3185994
M3 - Conference contribution
AN - SCOPUS:85085176840
T3 - The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018
SP - 167
EP - 176
BT - The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018
PB - Association for Computing Machinery, Inc
Y2 - 23 April 2018 through 27 April 2018
ER -
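The abstract describes a Deep Q-Learning recommender that models future reward explicitly, folds a user-return signal into the reward, and uses an exploration strategy. A minimal sketch of that idea follows. All names here (`LinearQRecommender`, the linear Q-function, the specific reward mix) are illustrative assumptions for exposition only, not the authors' actual DRN implementation, which uses a dueling deep Q-network and a Dueling Bandit Gradient Descent exploration strategy.

```python
import random

class LinearQRecommender:
    """Toy Q-learning recommender: a linear Q-function stands in for the
    deep Q-network; recommendation is epsilon-greedy over candidate news."""

    def __init__(self, n_features, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.w = [0.0] * n_features  # weights of the linear Q-function
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor: weights *future* reward
        self.epsilon = epsilon       # exploration probability
        self.rng = random.Random(seed)

    def q(self, features):
        """Q-value of one (state, candidate-news) feature vector."""
        return sum(wi * xi for wi, xi in zip(self.w, features))

    def recommend(self, candidates):
        """Epsilon-greedy choice among candidate feature vectors."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(candidates))
        return max(range(len(candidates)), key=lambda i: self.q(candidates[i]))

    def update(self, features, reward, next_candidates):
        """One TD(0) step. The reward may mix the click label (0/1) with a
        user-return ("activeness") bonus, per the abstract's second point."""
        best_next = max((self.q(c) for c in next_candidates), default=0.0)
        td_error = reward + self.gamma * best_next - self.q(features)
        for j, xj in enumerate(features):
            self.w[j] += self.alpha * td_error * xj
```

With `epsilon=0.0` the agent is purely greedy: after one update that rewards the first candidate's features, `recommend` prefers that candidate. The `gamma` term is what distinguishes this from CTR-only models, since the bootstrapped `best_next` value lets expected future engagement influence the current choice.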