Beyond backprop: Online alternating minimization with auxiliary variables

  • Anna Choromanska
  • Benjamin Cowen
  • Sadhana Kumaravel
  • Ronny Luss
  • Mattia Rigotti
  • Irina Rish
  • Brian Kingsbury
  • Paolo DiAchille
  • Viatcheslav Gurev
  • Ravi Tejwani
  • Djallel Bouneffouf

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    6 Scopus citations

    Abstract

    Despite significant recent advances in deep neural networks, training them remains a challenge due to the highly non-convex nature of the objective function. State-of-the-art methods rely on error backpropagation, which suffers from several well-known issues, such as vanishing and exploding gradients, inability to handle non-differentiable nonlinearities and to parallelize weight-updates across layers, and biological implausibility. These limitations continue to motivate exploration of alternative training algorithms, including several recently proposed auxiliary-variable methods which break the complex nested objective function into local subproblems. However, those techniques are mainly offline (batch), which limits their applicability to extremely large datasets, as well as to online, continual or reinforcement learning. The main contribution of our work is a novel online (stochastic/mini-batch) alternating minimization (AM) approach for training deep neural networks, together with the first theoretical convergence guarantees for AM in stochastic settings and promising empirical results on a variety of architectures and datasets.
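
    The idea of splitting a nested objective into local subproblems can be illustrated with a minimal sketch. The code below is not the authors' algorithm: it assumes a two-layer linear network, a quadratic penalty with a hypothetical weight `mu`, and plain gradient steps on the weights, and all function names are illustrative. Each mini-batch first solves for the auxiliary codes `a` with the weights fixed, then updates each layer's weights with the codes fixed.

```python
import numpy as np

def update_codes(W1, W2, x, y, mu):
    """Minimize ||y - W2 a||^2 + mu * ||a - W1 x||^2 over the codes a (closed form)."""
    hidden = W2.shape[1]
    lhs = W2.T @ W2 + mu * np.eye(hidden)
    rhs = W2.T @ y + mu * (W1 @ x)
    return np.linalg.solve(lhs, rhs)

def update_weights(W1, W2, x, a, y, mu, lr):
    """One gradient step on each layer's local least-squares subproblem."""
    n = x.shape[1]
    grad_W2 = -2.0 * (y - W2 @ a) @ a.T / n        # depends only on layer 2
    grad_W1 = -2.0 * mu * (a - W1 @ x) @ x.T / n   # depends only on layer 1
    return W1 - lr * grad_W1, W2 - lr * grad_W2

def online_am(X, Y, hidden=8, mu=1.0, lr=0.01, batch=16, epochs=5, seed=0):
    """Mini-batch alternating minimization; columns of X and Y are samples."""
    rng = np.random.default_rng(seed)
    d_in, n = X.shape
    d_out = Y.shape[0]
    W1 = 0.1 * rng.standard_normal((hidden, d_in))
    W2 = 0.1 * rng.standard_normal((d_out, hidden))
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch):
            idx = order[start:start + batch]
            x, y = X[:, idx], Y[:, idx]
            a = update_codes(W1, W2, x, y, mu)                 # codes, weights fixed
            W1, W2 = update_weights(W1, W2, x, a, y, mu, lr)   # weights, codes fixed
    return W1, W2
```

    Because the two weight gradients involve disjoint variables once the codes are fixed, the per-layer updates in this sketch could run in parallel, and no gradient is propagated through the whole network.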

    Original language: English (US)
    Title of host publication: 36th International Conference on Machine Learning, ICML 2019
    Publisher: International Machine Learning Society (IMLS)
    Pages: 2041-2050
    Number of pages: 10
    ISBN (Electronic): 9781510886988
    State: Published - 2019
    Event: 36th International Conference on Machine Learning, ICML 2019 - Long Beach, United States
    Duration: Jun 9 2019 to Jun 15 2019

    Publication series

    Name: 36th International Conference on Machine Learning, ICML 2019
    Volume: 2019-June

    Conference

    Conference: 36th International Conference on Machine Learning, ICML 2019
    Country/Territory: United States
    City: Long Beach
    Period: 6/9/19 to 6/15/19

    All Science Journal Classification (ASJC) codes

    • Education
    • Computer Science Applications
    • Human-Computer Interaction
