Learning long-term dependencies is not as difficult with NARX networks

Tsungnan Lin, Bill G. Horne, Peter Tiňo, C. Lee Giles

Research output: Contribution to conference › Paper › peer-review


Abstract

It has recently been shown that gradient descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies. In this paper we explore this problem for a class of architectures called NARX networks, which have powerful representational capabilities. Previous work reported that gradient descent learning is more effective in NARX networks than in recurrent networks with "hidden states". We show that although NARX networks do not circumvent the problem of long-term dependencies, they can greatly improve performance on such problems. We present some experimental results that show that NARX networks can often retain information for two to three times as long as conventional recurrent networks.
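For readers unfamiliar with the architecture, a NARX (nonlinear autoregressive with exogenous inputs) network computes its output from delayed inputs and delayed outputs through a feedforward mapping. The following formulation is a standard sketch of the model for illustration and is not quoted from this record:

  y(t) = \Psi\bigl(u(t-1), \ldots, u(t-n_u),\; y(t-1), \ldots, y(t-n_y)\bigr)

where u is the exogenous input, y is the output fed back through tapped delay lines, n_u and n_y are the input and output delay orders, and \Psi is typically realized by a multilayer perceptron.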

Original language: English (US)
Pages: 577-583
Number of pages: 7
State: Published - 1995
Event: 8th International Conference on Neural Information Processing Systems, NIPS 1995 - Denver, United States
Duration: Nov 27, 1995 - Dec 2, 1995

Conference

Conference: 8th International Conference on Neural Information Processing Systems, NIPS 1995
Country/Territory: United States
City: Denver
Period: 11/27/95 - 12/2/95

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Networks and Communications
  • Signal Processing
