It is often difficult to predict the optimal neural network size for a particular application. Constructive or destructive methods that add or subtract neurons, layers, connections, etc. might offer a solution to this problem. We prove that one method, recurrent cascade correlation, due to its topology, has fundamental limitations in representation and thus in its learning capabilities. It cannot represent with monotone (i.e., sigmoid) and hard-threshold activation functions certain finite state automata. We give a “preliminary" approach on how to get around these limitations by devising a simple constructive training method that adds neurons during training while still preserving the powerful fully-recurrent structure. We illustrate this approach by simulations which learn many examples of regular grammars that the recurrent cascade correlation method is unable to learn.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Computer Networks and Communications
- Artificial Intelligence