TY - GEN
T1 - Empirical comparison of flat-spot elimination techniques in back-propagation networks
AU - Parekh, Rajesh
AU - Balakrishnan, Karthik
AU - Honavar, Vasant
PY - 1993
Y1 - 1993
AB - Back-Propagation (BP) [Rumelhart et al., 1986] is a popular algorithm for training multilayer connectionist learning systems with a nonlinear (sigmoid) activation function. However, BP suffers from excruciatingly slow convergence in many applications, a drawback that has been partly attributed to the flat-spot problem. Flat-spots are regions where the derivative of the sigmoid activation function approaches zero; in these regions the weight changes become negligible despite considerable classification error, and learning slows down dramatically. Several researchers have addressed this problem [Fahlman, 1988; Balakrishnan & Honavar, 1992]. In this paper we present a new way of dealing with flat-spots in the output layer: a Perceptron-like weight-modification strategy that complements BP there. We also report an empirical comparison of the performance of these techniques on data sets that have been used extensively for benchmarking inductive learning algorithms.
UR - http://www.scopus.com/inward/record.url?scp=0027878882&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0027878882&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:0027878882
SN - 1565550072
T3 - Proceedings of SPIE - The International Society for Optical Engineering
SP - 55
EP - 60
BT - Proceedings of SPIE - The International Society for Optical Engineering
A2 - Padgett, Mary Lou
PB - Society of Photo-Optical Instrumentation Engineers
T2 - Proceedings of the 3rd Workshop on Neural Networks: Academic/Industrial/NASA/Defense
Y2 - 10 February 1992 through 12 February 1992
ER -