Recurrent neural networks are a natural model for learning and predicting temporal signals. In addition, simple recurrent networks have been shown to be both theoretically and experimentally capable of learning finite state automata [Cleeremans 89, Giles 92a, Minsky 67, Pollack 91, Siegelmann 92]. However, it is difficult to determine the minimal neural network structure required for a particular automaton, and a large recurrent network, though versatile in theory, proves in practice to be very difficult to train. Constructive or destructive recurrent methods might offer a solution to this problem. We prove that one current method, Recurrent Cascade Correlation, has fundamental limitations in representation and thus in its learning capabilities. We outline a preliminary approach for overcoming these limitations by devising a "simple" constructive training method that adds neurons during training while still preserving the powerful fully recurrent structure. Through simulations we show that such a method can learn many types of regular grammars that the Recurrent Cascade Correlation method is unable to learn.
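The constructive idea summarized above can be illustrated with a small sketch. This is not the authors' algorithm, only an assumed minimal version in which one hidden unit with small random weights is appended to a fully recurrent network whenever training stalls, leaving all previously trained weights intact; the class, names, and sizes are illustrative.

```python
# Hedged sketch: growing a fully recurrent network by one hidden neuron
# while preserving the existing (trained) weights. Illustrative only.
import numpy as np

class FullyRecurrentNet:
    def __init__(self, n_in, n_hidden, rng=None):
        self.rng = rng or np.random.default_rng(0)
        self.n_in, self.n_hidden = n_in, n_hidden
        # W_rec: hidden-to-hidden (fully recurrent); W_in: input-to-hidden
        self.W_rec = self.rng.normal(scale=0.1, size=(n_hidden, n_hidden))
        self.W_in = self.rng.normal(scale=0.1, size=(n_hidden, n_in))

    def step(self, h, x):
        # One state update: every hidden unit feeds back to every hidden unit,
        # so the network remains fully recurrent after growing.
        return np.tanh(self.W_rec @ h + self.W_in @ x)

    def add_neuron(self):
        # Constructive step: append one hidden unit with small random weights
        # in both directions; existing weights are unchanged.
        n = self.n_hidden
        new_row = self.rng.normal(scale=0.1, size=(1, n))
        new_col = self.rng.normal(scale=0.1, size=(n + 1, 1))
        self.W_rec = np.hstack([np.vstack([self.W_rec, new_row]), new_col])
        self.W_in = np.vstack(
            [self.W_in, self.rng.normal(scale=0.1, size=(1, self.n_in))]
        )
        self.n_hidden = n + 1

# Usage: train at the current size; if the error plateaus, grow by one unit.
net = FullyRecurrentNet(n_in=2, n_hidden=3)
h = net.step(np.zeros(net.n_hidden), np.array([1.0, 0.0]))
net.add_neuron()
h = net.step(np.zeros(net.n_hidden), np.array([1.0, 0.0]))
```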