TY - JOUR
T1 - Multiarmed Bandit Designs for Phase I Dose-Finding Clinical Trials With Multiple Toxicity Types
AU - Jin, Lan
AU - Pang, Guodong
AU - Alemayehu, Demissie
N1 - Publisher Copyright: © 2021 American Statistical Association.
PY - 2023
Y1 - 2023
AB - The goal of a phase I dose-finding trial is to determine the dose level of a new drug with acceptable toxicity. The optimal dose level is determined by sequentially allocating patients to increasing dose levels while monitoring any safety concerns. In practice, multiple toxicity types may be of interest and with varying degrees of importance of each toxicity type. To address this, scoring systems have been developed and conventional adaptive designs, such as the continual reassessment method (CRM), have accordingly been modified to handle them. In this article, we consider how to model the dose-finding problem under the multiarmed bandit framework, which naturally embeds the tradeoff between exploring the toxicity of dose levels and exploiting the current information to optimize benefit. We then propose a Bayesian multiarmed bandit design, dubbed quasi-likelihood optimistic bandit (QLOB), which has desirable operating characteristics, including allocation of patients to the dose level which has an estimated toxicity score closest to the target level and is relatively less explored. In extensive simulation studies, it is demonstrated that QLOB outperformed toxicity-score-based designs, such as quasi-CRM (QCRM), and general Bayesian optimal interval (gBOIN) in most scenarios considered; and performed much better than the conventional CRM and “3 + 3” designs with respect to dose recommendation and patient allocation. In addition, our design is shown to be robust against misspecification of the relevant hyper-parameter, and to have improved performance as the number of enrolled patients increases.
UR - http://www.scopus.com/inward/record.url?scp=85113980240&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85113980240&partnerID=8YFLogxK
U2 - 10.1080/19466315.2021.1962402
DO - 10.1080/19466315.2021.1962402
M3 - Article
AN - SCOPUS:85113980240
SN - 1946-6315
VL - 15
SP - 164
EP - 177
JO - Statistics in Biopharmaceutical Research
JF - Statistics in Biopharmaceutical Research
IS - 1
ER -