Shot Optimization in Quantum Machine Learning Architectures to Accelerate Training

Koustubh Phalak, Swaroop Ghosh

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Quantum Machine Learning (QML) has recently emerged as a rapidly growing domain at the intersection of Quantum Computing (QC) and Machine Learning (ML). Hybrid quantum-classical models have demonstrated exponential speedups in various machine learning tasks compared to their classical counterparts. On one hand, training QML models on real hardware remains a challenge due to long queue wait times and access costs. On the other hand, simulation-based training does not scale to large QML models because simulation time grows exponentially. Since the measurement operation converts quantum information to classical binary data, the quantum circuit is executed multiple times (each execution is called a shot) to obtain the basis state probabilities or qubit expectation values. A higher number of shots increases both the training time of QML models on real hardware and the access cost. A higher number of shots also increases the simulation-based training time. In this paper, we propose a shot optimization method for QML models that has minimal impact on model performance. We use classification on the MNIST and FMNIST datasets as a test case, using a hybrid quantum-classical QML model. First, we sweep the number of shots for short and full versions of each dataset. We observe that training on the full version provides 5-6% higher testing accuracy than the short version, with up to a 10X higher number of shots for training. Therefore, one can reduce the dataset size to accelerate training. Next, we propose adaptive shot allocation on the short version of the dataset to optimize the number of shots over training epochs and evaluate the impact on classification accuracy. We use (a) a linear function, where the number of shots decreases linearly with epochs, and (b) a step function, where the number of shots decreases in steps with epochs.
On the MNIST dataset, we observe an increase in loss of around 0.01 and a maximum reduction in testing accuracy of ∼4% (1%) for a reduction in shots of up to 100X (10X) with the linear (step) shot function, compared to the conventional constant shot function. On the FMNIST dataset, we observe an increase in loss of 0.05 and a reduction in testing accuracy of ∼5-7% (5-7%) for a similar reduction in shots with the linear (step) shot function. For comparison, we also apply the proposed shot optimization methods to ground state energy estimation of different molecules and observe that the step function gives the best and most stable ground state energy prediction with 1000X fewer shots.
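The linear and step shot functions described in the abstract can be sketched as per-epoch schedules. The abstract does not give the exact parameterization, so everything here is an assumption made for illustration: the function names, the `max_shots` and `min_shots` endpoints, and the equal-length plateaus used for the step function.

```python
def constant_shots(epoch, num_epochs, max_shots):
    """Baseline: the same number of shots at every epoch."""
    return max_shots


def linear_shots(epoch, num_epochs, max_shots, min_shots):
    """Shots fall linearly from max_shots (first epoch) to min_shots (last epoch).

    Assumed form; the endpoints are hypothetical parameters.
    """
    if num_epochs <= 1:
        return max_shots
    frac = epoch / (num_epochs - 1)  # 0.0 at the first epoch, 1.0 at the last
    return round(max_shots - frac * (max_shots - min_shots))


def step_shots(epoch, num_epochs, max_shots, min_shots, num_steps):
    """Shots fall in num_steps equal-length plateaus from max_shots to min_shots.

    Assumed form; the number and spacing of plateaus are hypothetical.
    """
    if num_steps <= 1:
        return max_shots
    level = epoch * num_steps // num_epochs  # plateau index, 0 .. num_steps - 1
    drop_per_level = (max_shots - min_shots) / (num_steps - 1)
    return round(max_shots - level * drop_per_level)


def total_shots(schedule, num_epochs, **kwargs):
    """Sum of per-epoch shots, a rough proxy for the per-circuit training cost."""
    return sum(schedule(e, num_epochs, **kwargs) for e in range(num_epochs))


if __name__ == "__main__":
    epochs = 10
    for e in range(epochs):
        print(
            e,
            constant_shots(e, epochs, 1000),
            linear_shots(e, epochs, 1000, 10),
            step_shots(e, epochs, 1000, 10, 3),
        )
```

In a training loop, the value returned for the current epoch would be passed as the shot count for every circuit execution in that epoch. Comparing `total_shots` across schedules gives a quick estimate of how much measurement budget each one saves relative to the constant baseline.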

Original language: English (US)
Pages (from-to): 41514-41523
Number of pages: 10
Journal: IEEE Access
Volume: 11
DOIs
State: Published - 2023

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • General Materials Science
  • General Engineering
