Abstract
Methods for controlling the bias/variance tradeoff typically assume that overfitting, or overtraining, is a global phenomenon. For multi-layer perceptron (MLP) neural networks, global parameters such as the training time (e.g. based on validation tests), network size, or the amount of weight decay are commonly used to control this tradeoff. However, the degree of overfitting can vary significantly across the input space of the model. We show that overselecting the degrees of freedom of an MLP trained with backpropagation can improve the approximation in regions of underfitting without significantly overfitting in other regions, which can be a significant advantage over other models. Furthermore, we show that `better' learning algorithms such as conjugate gradient can in fact lead to worse generalization, because they can be more prone to creating varying degrees of overfitting in different regions of the input space. While experimental results cannot cover all practical situations, ours do help to explain common behavior that does not agree with theoretical expectations. Our results suggest one important reason for the relative success of MLPs; bring into question common beliefs about neural network training regarding training algorithms, overfitting, and optimal network size; suggest alternate guidelines for practical use (in terms of training algorithm and network size selection); and help to direct future work, e.g. on the importance of the MLP/BP training bias, the possibility of worse performance for `better' training algorithms, local `smoothness' criteria, and further investigation of localized overfitting.
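The abstract's central observation, that overfitting can differ by region of the input space, can be illustrated with a minimal sketch (our own toy setup, not the paper's experiments): train a deliberately oversized MLP with plain backpropagation on a 1-D target, then measure error separately in contiguous regions of the input rather than as a single global number.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression target; the MLP is deliberately oversized for it.
X = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(X)

H = 30  # far more hidden units than sin(x) requires
W1 = rng.normal(0.0, 0.5, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.5, (H, 1)); b2 = np.zeros(1)

lr = 0.05
for _ in range(8000):
    # Forward pass: one tanh hidden layer, linear output.
    h = np.tanh(X @ W1 + b1)
    err = (h @ W2 + b2) - y
    # Backpropagation (full-batch gradient descent).
    dW2 = h.T @ err / len(X); db2 = err.mean(0)
    dh = (err @ W2.T) * (1.0 - h**2)
    dW1 = X.T @ dh / len(X); db1 = dh.mean(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

def regional_mse(n_bins=4):
    """MSE computed separately over contiguous regions of the input space.

    X is sorted, so splitting the squared errors into equal chunks gives
    one error estimate per input region instead of one global figure.
    """
    h = np.tanh(X @ W1 + b1)
    sq = ((h @ W2 + b2) - y) ** 2
    return [float(chunk.mean()) for chunk in np.array_split(sq.ravel(), n_bins)]

print(regional_mse())
```

Comparing the per-region errors (rather than their average) is the kind of localized view the paper argues for: a global error or a global stopping criterion can hide the fact that one region is underfit while another is already overfit.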
| Original language | English (US) |
|---|---|
| Title of host publication | Proceedings of the International Joint Conference on Neural Networks |
| Publisher | IEEE |
| Pages | 114-119 |
| Number of pages | 6 |
| Volume | 1 |
| State | Published - 2000 |
| Event | International Joint Conference on Neural Networks (IJCNN'2000) - Como, Italy |
| Duration | Jul 24 2000 → Jul 27 2000 |
Other
| Other | International Joint Conference on Neural Networks (IJCNN'2000) |
|---|---|
| City | Como, Italy |
| Period | 7/24/00 → 7/27/00 |
All Science Journal Classification (ASJC) codes
- Software