BATCHMIXUP: Improving Training by Interpolating Hidden States of the Entire Mini-batch

Wenpeng Yin, Huan Wang, Jin Qu, Caiming Xiong

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Scopus citations

Abstract

Usually, we train a neural system on a sequence of mini-batches of labeled instances. Each mini-batch is composed of k samples, and each sample learns a representation vector. MIXUP implicitly generates synthetic samples by linearly interpolating the inputs, and corresponding labels, of random sample pairs in the same mini-batch. This means that MIXUP only generates new points on the edges connecting every two original points in the representation space. We observe that the new points produced by standard MIXUP cover fairly limited regions of the entire space of the mini-batch. In this work, we propose BATCHMIXUP, which improves model learning by interpolating the hidden states of the entire mini-batch. BATCHMIXUP can generate new points scattered throughout the space corresponding to the mini-batch. In experiments, BATCHMIXUP outperforms competitive baselines at improving performance on NLP tasks while using different ratios of the training data.
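The contrast the abstract draws can be sketched numerically. Below, `mixup_pair` is the standard MIXUP recipe (pairwise interpolation with a Beta-sampled coefficient), while `batch_mix` is an illustrative batch-level variant, not necessarily the paper's exact formulation: each synthetic point is a convex combination of all k samples, so it can land anywhere in the convex hull of the mini-batch rather than only on pairwise edges. The function names and the Dirichlet weighting are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup_pair(x, y, alpha=0.2):
    """Standard MIXUP: interpolate random sample pairs in a mini-batch.

    x: (k, d) inputs or hidden states; y: (k, c) one-hot labels.
    Every synthetic point lies on an edge connecting two original points.
    """
    k = x.shape[0]
    lam = rng.beta(alpha, alpha)          # single mixing coefficient
    perm = rng.permutation(k)             # random pairing within the batch
    return lam * x + (1 - lam) * x[perm], lam * y + (1 - lam) * y[perm]

def batch_mix(x, y, alpha=0.2):
    """Batch-level interpolation (illustrative sketch only).

    Each row of w is a point on the k-simplex, so each synthetic sample
    is a convex combination of the ENTIRE mini-batch.
    """
    k = x.shape[0]
    w = rng.dirichlet(np.full(k, alpha), size=k)  # (k, k), rows sum to 1
    return w @ x, w @ y
```

Because each Dirichlet weight vector sums to 1, the mixed labels remain valid probability distributions in both variants; the difference is only how much of the batch's representation space the synthetic points can reach.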

Original language: English (US)
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: ACL-IJCNLP 2021
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Publisher: Association for Computational Linguistics (ACL)
Pages: 4908-4912
Number of pages: 5
ISBN (Electronic): 9781954085541
State: Published - 2021
Event: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 - Virtual, Online
Duration: Aug 1, 2021 – Aug 6, 2021

Publication series

Name: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Conference

Conference: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
City: Virtual, Online
Period: 8/1/21 – 8/6/21

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Linguistics and Language
