TY - GEN
T1 - REDUS
T2 - 17th ACM Conference on Information and Knowledge Management, CIKM'08
AU - Zhang, Xiang
AU - Pan, Feng
AU - Wang, Wei
PY - 2008
Y1 - 2008
N2 - Finding latent patterns in high dimensional data is an important research problem with numerous applications. The most well known approaches for high dimensional data analysis are feature selection and dimensionality reduction. Being widely used in many applications, these methods aim to capture global patterns and are typically performed in the full feature space. In many emerging applications, however, scientists are interested in the local latent patterns held by feature subspaces, which may be invisible via any global transformation. In this paper, we investigate the problem of finding strong linear and nonlinear correlations hidden in feature subspaces of high dimensional data. We formalize this problem as identifying reducible subspaces in the full dimensional space. Intuitively, a reducible subspace is a feature subspace whose intrinsic dimensionality is smaller than the number of features. We present an effective algorithm, REDUS, for finding the reducible subspaces. Two key components of our algorithm are finding the overall reducible subspace, and uncovering the individual reducible subspaces from the overall reducible subspace. A broad experimental evaluation demonstrates the effectiveness of our algorithm.
AB - Finding latent patterns in high dimensional data is an important research problem with numerous applications. The most well known approaches for high dimensional data analysis are feature selection and dimensionality reduction. Being widely used in many applications, these methods aim to capture global patterns and are typically performed in the full feature space. In many emerging applications, however, scientists are interested in the local latent patterns held by feature subspaces, which may be invisible via any global transformation. In this paper, we investigate the problem of finding strong linear and nonlinear correlations hidden in feature subspaces of high dimensional data. We formalize this problem as identifying reducible subspaces in the full dimensional space. Intuitively, a reducible subspace is a feature subspace whose intrinsic dimensionality is smaller than the number of features. We present an effective algorithm, REDUS, for finding the reducible subspaces. Two key components of our algorithm are finding the overall reducible subspace, and uncovering the individual reducible subspaces from the overall reducible subspace. A broad experimental evaluation demonstrates the effectiveness of our algorithm.
UR - http://www.scopus.com/inward/record.url?scp=70349254659&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349254659&partnerID=8YFLogxK
U2 - 10.1145/1458082.1458209
DO - 10.1145/1458082.1458209
M3 - Conference contribution
AN - SCOPUS:70349254659
SN - 9781595939913
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 961
EP - 970
BT - Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM'08
Y2 - 26 October 2008 through 30 October 2008
ER -