Synthesizing realistic medical images provides a feasible solution to the shortage of training data in deep learning based medical image recognition systems. However, the quality control of synthetic images for data augmentation purposes is under-investigated, and some of the generated images are not realistic and may contain misleading features that distort data distribution when mixed with real images. Thus, the effectiveness of those synthetic images in medical image recognition systems cannot be guaranteed when they are being added randomly without quality assurance. In this work, we propose a reinforcement learning (RL) based synthetic sample selection method that learns to choose synthetic images containing reliable and informative features. A transformer based controller is trained via proximal policy optimization (PPO) using the validation classification accuracy as the reward. The selected images are mixed with the original training data for improved training of image recognition systems. To validate our method, we take the pathology image recognition as an example and conduct extensive experiments on two histopathology image datasets. In experiments on a cervical dataset and a lymph node dataset, the image classification performance is improved by 8.1 % and 2.3 %, respectively, when utilizing high-quality synthetic images selected by our RL framework. Our proposed synthetic sample selection method is general and has great potential to boost the performance of various medical image recognition systems given limited annotation.

Original languageEnglish (US)
Title of host publicationMedical Image Computing and Computer Assisted Intervention – MICCAI 2020 - 23rd International Conference, Proceedings
EditorsAnne L. Martel, Purang Abolmaesumi, Danail Stoyanov, Diana Mateus, Maria A. Zuluaga, S. Kevin Zhou, Daniel Racoceanu, Leo Joskowicz
PublisherSpringer Science and Business Media Deutschland GmbH
Number of pages11
ISBN (Print)9783030597092
StatePublished - 2020
Event23rd International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2020 - Lima, Peru
Duration: Oct 4 2020Oct 8 2020

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume12261 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349


Conference23rd International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2020

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science(all)


Dive into the research topics of 'Synthetic sample selection via reinforcement learning'. Together they form a unique fingerprint.

Cite this