TY - JOUR
T1 - Exact misclassification probabilities for plug-in normal quadratic discriminant functions. II. The heterogeneous case
AU - McFarland, H. Richard
AU - Richards, Donald St P.
N1 - Funding Information:
Supported in part by a grant to the Institute for Advanced Study from The Bell Fund, and by the National Science Foundation under grant DMS-9703705.
PY - 2002
Y1 - 2002
N2 - We consider the problem of discriminating between two independent multivariate normal populations, Np(μ1, Σ1) and Np(μ2, Σ2), having distinct mean vectors μ1 and μ2 and distinct covariance matrices Σ1 and Σ2. The parameters μ1, μ2, Σ1, and Σ2 are unknown and are estimated by means of independent random training samples from each population. We derive a stochastic representation for the exact distribution of the "plug-in" quadratic discriminant function for classifying a new observation between the two populations. The stochastic representation involves only the classical standard normal, chi-square, and F distributions and is easily implemented for simulation purposes. Using Monte Carlo simulation of the stochastic representation we provide applications to the estimation of misclassification probabilities for the well-known iris data studied by Fisher (Ann. Eugen. 7 (1936), 179-188); a data set on corporate financial ratios provided by Johnson and Wichern (Applied Multivariate Statistical Analysis, 4th ed., Prentice-Hall, Englewood Cliffs, NJ, 1998); and a data set analyzed by Reaven and Miller (Diabetologia 16 (1979), 17-24) in a classification of diabetic status.
AB - We consider the problem of discriminating between two independent multivariate normal populations, Np(μ1, Σ1) and Np(μ2, Σ2), having distinct mean vectors μ1 and μ2 and distinct covariance matrices Σ1 and Σ2. The parameters μ1, μ2, Σ1, and Σ2 are unknown and are estimated by means of independent random training samples from each population. We derive a stochastic representation for the exact distribution of the "plug-in" quadratic discriminant function for classifying a new observation between the two populations. The stochastic representation involves only the classical standard normal, chi-square, and F distributions and is easily implemented for simulation purposes. Using Monte Carlo simulation of the stochastic representation we provide applications to the estimation of misclassification probabilities for the well-known iris data studied by Fisher (Ann. Eugen. 7 (1936), 179-188); a data set on corporate financial ratios provided by Johnson and Wichern (Applied Multivariate Statistical Analysis, 4th ed., Prentice-Hall, Englewood Cliffs, NJ, 1998); and a data set analyzed by Reaven and Miller (Diabetologia 16 (1979), 17-24) in a classification of diabetic status.
UR - http://www.scopus.com/inward/record.url?scp=0036050635&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0036050635&partnerID=8YFLogxK
U2 - 10.1006/jmva.2001.2034
DO - 10.1006/jmva.2001.2034
M3 - Article
AN - SCOPUS:0036050635
SN - 0047-259X
VL - 82
SP - 299
EP - 330
JO - Journal of Multivariate Analysis
JF - Journal of Multivariate Analysis
IS - 2
M1 - 92034
ER -