Variance reduction in gradient exploration for online learning to rank

Huazheng Wang, Sonwoo Kim, Eric McCord-Snook, Qingyun Wu, Hongning Wang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

30 Scopus citations

Abstract

Online Learning to Rank (OL2R) algorithms learn from implicit user feedback on the fly. The key to such algorithms is an unbiased estimate of gradients, which is often (trivially) achieved by uniformly sampling from the entire parameter space. Unfortunately, this leads to high-variance in gradient estimation, resulting in high regret during model updates, especially when the dimension of the parameter space is large. In this work, we aim at reducing the variance of gradient estimation in OL2R algorithms. We project the selected updating direction (i.e., the winning direction) into a space spanned by the feature vectors from examined documents under the current query (termed the “document space” for short), after an interleaved test. Our key insight is that the result of an interleaved test is solely governed by a user's relevance evaluation over the examined documents. Hence, the true gradient introduced by this test is only reflected in the constructed document space, and components of the proposed gradient which are orthogonal to the document space can be safely removed, for variance reduction purpose. We prove that this projected gradient is still an unbiased estimation of the true gradient, and show that this lower-variance gradient estimation results in significant regret reduction. Our proposed method is compatible with all existing OL2R algorithms which rank documents using a linear model. Extensive experimental comparisons with several state-of-the-art OL2R algorithms have confirmed the effectiveness of our proposed method in reducing the variance of gradient estimation and improving overall ranking performance.

Original languageEnglish (US)
Title of host publicationSIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
PublisherAssociation for Computing Machinery, Inc
Pages835-844
Number of pages10
ISBN (Electronic)9781450361729
DOIs
StatePublished - Jul 18 2019
Event42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019 - Paris, France
Duration: Jul 21 2019Jul 25 2019

Publication series

NameSIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

Conference

Conference42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019
Country/TerritoryFrance
CityParis
Period7/21/197/25/19

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Applied Mathematics
  • Software

Fingerprint

Dive into the research topics of 'Variance reduction in gradient exploration for online learning to rank'. Together they form a unique fingerprint.

Cite this