TY - GEN
T1 - Extracting Learned Discard and Knocking Strategies from a Gin Rummy Bot
AU - Goldstein, Benjamin
AU - Guerra, Jean Pierre Astudillo
AU - Haigh, Emily
AU - Ulloa, Bryan Cruz
AU - Blum, Jeremy
N1 - Publisher Copyright:
Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2021
Y1 - 2021
N2 - Various Gin Rummy strategy guides provide heuristics to help human players improve their gameplay. These heuristics often conflict with one another or contain ambiguities that limit their applicability, especially for discard and end-of-game decisions. This paper describes an approach to analyzing the machine learning capabilities of a Gin Rummy agent to help resolve these conflicts and ambiguities. The game has three main decision points: when to draw from the discard pile, which card to discard from the player's hand, and when to knock. The agent uses a learning approach to estimate the expected utility of each discard, and an analysis of these utility values provides insight into resolving ambiguities in discard tips for human play. The agent's end-of-game, or knocking, strategy was derived using Monte Carlo Counterfactual Regret Minimization (MCCFR), which was applied to estimate Nash equilibrium knocking strategies under different rules of the game. The analysis suggests that conflicts among end-of-game playing tips are due in part to the different rules used in common Gin Rummy variants.
UR - http://www.scopus.com/inward/record.url?scp=85130012127&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85130012127&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85130012127
T3 - 35th AAAI Conference on Artificial Intelligence, AAAI 2021
SP - 15518
EP - 15525
BT - 35th AAAI Conference on Artificial Intelligence, AAAI 2021
PB - Association for the Advancement of Artificial Intelligence
T2 - 35th AAAI Conference on Artificial Intelligence, AAAI 2021
Y2 - 2 February 2021 through 9 February 2021
ER -