Neural network classification and prior class probabilities

Steve Lawrence, Ian Burns, Andrew Back, Ah Chung Tsoi, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Chapter

21 Scopus citations

Abstract

A common problem in MLP (multi-layer perceptron) classification is related to the prior probabilities of the individual classes: if the number of training examples varies significantly between classes, the network may find it harder to learn the rarer classes. Such practical experience does not match theoretical results, which show that MLPs approximate Bayesian a posteriori probabilities (independent of the prior class probabilities). Our investigation shows that the difference between the theoretical and practical results lies with the assumptions made in the theory: accurate estimation of Bayesian a posteriori probabilities requires the network to be large enough, training to converge to a global minimum, infinite training data, and the a priori class probabilities of the test set to be correctly represented in the training set. Specifically, the problem can often be traced to the fact that efficient MLP training mechanisms lead to sub-optimal solutions for most practical problems. In this chapter, we demonstrate the problem, discuss possible methods for alleviating it, and introduce new heuristics which are shown to perform well on a sample ECG classification problem. The heuristics may also be used as a simple means of adjusting for unequal misclassification costs.
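
One assumption named in the abstract is that the class priors of the test set are correctly represented in the training set. As an illustration only, the sketch below shows the textbook Bayes-rule correction that is sometimes applied when this assumption fails: each estimated posterior is divided by the training prior, multiplied by the target prior, and the result is renormalized. This is not one of the chapter's heuristics, which the abstract does not specify; the function name `rescale_posteriors` and all numbers are hypothetical.

```python
def rescale_posteriors(posteriors, train_priors, target_priors):
    """Re-weight class posteriors estimated under train_priors
    so that they reflect target_priors instead (Bayes-rule correction)."""
    # Divide out the training prior and multiply in the target prior.
    weighted = [
        p * target / train
        for p, train, target in zip(posteriors, train_priors, target_priors)
    ]
    # Renormalize so the adjusted values sum to 1.
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical two-class example: the network was trained on a 90/10 split
# and outputs exactly the training priors for some input, i.e. the input
# itself carries no evidence for either class.
posteriors = [0.9, 0.1]
train_priors = [0.9, 0.1]
target_priors = [0.5, 0.5]

adjusted = rescale_posteriors(posteriors, train_priors, target_priors)
print(adjusted)
```

Under equal target priors, an output that merely echoes the training priors is mapped to an even split between the classes. The correction assumes the network outputs are good posterior estimates, which is exactly what the abstract says may not hold in practice.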

Original language: English (US)
Title of host publication: Neural Networks
Subtitle of host publication: Tricks of the Trade
Publisher: Springer Verlag
Pages: 295-309
Number of pages: 15
ISBN (Print): 9783642352881
State: Published - 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7700
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
