TY - GEN

T1 - Learning in infinite-horizon inventory competition with total demand observations

AU - Zeinalzadeh, Ashkan

AU - Alptekinoglu, Aydin

AU - Arslan, Gurdal

PY - 2012

Y1 - 2012

N2 - We consider single-period and infinite-horizon inventory competition between two firms that replenish their inventories as in the well-known newsvendor model. Each customer normally prefers to shop at one firm or the other. A fixed fraction of customers who encounter a stockout at their first-choice firm, however, visit the other firm. This substitution behavior makes the firms' replenishment decisions strategically interdependent. Our main contribution is to introduce a simple learning algorithm to inventory competition. The learning algorithm requires each firm (a) to know its own critical fractile, which the firm can calculate from its own per-unit revenue, ordering cost, and holding cost; and (b) to observe its own total demand realizations. The firms need not know their true demand distributions. They need not have any information about each other beyond the implicit information encoded in their own total demand realizations, which are affected by their competitor's inventory decisions. In fact, the firms need not even be aware that they are engaged in inventory competition. We prove that the inventory decisions generated by the learning algorithm converge, with probability one, to threshold values that constitute an equilibrium in pure Markov strategies for an infinite-horizon discounted-reward inventory competition game.

AB - We consider single-period and infinite-horizon inventory competition between two firms that replenish their inventories as in the well-known newsvendor model. Each customer normally prefers to shop at one firm or the other. A fixed fraction of customers who encounter a stockout at their first-choice firm, however, visit the other firm. This substitution behavior makes the firms' replenishment decisions strategically interdependent. Our main contribution is to introduce a simple learning algorithm to inventory competition. The learning algorithm requires each firm (a) to know its own critical fractile, which the firm can calculate from its own per-unit revenue, ordering cost, and holding cost; and (b) to observe its own total demand realizations. The firms need not know their true demand distributions. They need not have any information about each other beyond the implicit information encoded in their own total demand realizations, which are affected by their competitor's inventory decisions. In fact, the firms need not even be aware that they are engaged in inventory competition. We prove that the inventory decisions generated by the learning algorithm converge, with probability one, to threshold values that constitute an equilibrium in pure Markov strategies for an infinite-horizon discounted-reward inventory competition game.

UR - http://www.scopus.com/inward/record.url?scp=84869433668&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84869433668&partnerID=8YFLogxK

U2 - 10.1109/acc.2012.6315678

DO - 10.1109/acc.2012.6315678

M3 - Conference contribution

AN - SCOPUS:84869433668

SN - 9781457710957

T3 - Proceedings of the American Control Conference

SP - 1382

EP - 1387

BT - 2012 American Control Conference, ACC 2012

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2012 American Control Conference, ACC 2012

Y2 - 27 June 2012 through 29 June 2012

ER -