TY - GEN
T1 - A self-organizing method for predictive modeling with highly-redundant variables
AU - Liu, Gang
AU - Yang, Hui
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/10/7
Y1 - 2015/10/7
N2 - Rapid advancement of sensing and information technology brings big data, which presents a gold mine of the 21st century. However, big data also brings significant challenges for data-driven decision making. In particular, it is not uncommon for a large number of variables (or features) to underlie big data. Complex interdependence structures among variables challenge the traditional framework of predictive modeling. This paper presents a new methodology of self-organizing network for variable clustering and predictive modeling. Specifically, we developed a new approach, namely nonlinear coupling analysis, to measure nonlinear interdependence structures among variables. Further, all the variables are embedded as nodes in a complex network. Nonlinear-coupling forces move these nodes to derive a self-organizing topology of the network. As such, variables are clustered as sub-network communities in the space. Experimental results demonstrated that the proposed methodology not only outperforms traditional variable clustering algorithms such as hierarchical clustering and oblique principal component analysis, but also effectively identifies interdependent structures among variables and further improves the performance of predictive modeling. The proposed new idea of self-organizing network is generally applicable for predictive modeling in many disciplines that involve a large number of highly-redundant variables.
AB - Rapid advancement of sensing and information technology brings big data, which presents a gold mine of the 21st century. However, big data also brings significant challenges for data-driven decision making. In particular, it is not uncommon for a large number of variables (or features) to underlie big data. Complex interdependence structures among variables challenge the traditional framework of predictive modeling. This paper presents a new methodology of self-organizing network for variable clustering and predictive modeling. Specifically, we developed a new approach, namely nonlinear coupling analysis, to measure nonlinear interdependence structures among variables. Further, all the variables are embedded as nodes in a complex network. Nonlinear-coupling forces move these nodes to derive a self-organizing topology of the network. As such, variables are clustered as sub-network communities in the space. Experimental results demonstrated that the proposed methodology not only outperforms traditional variable clustering algorithms such as hierarchical clustering and oblique principal component analysis, but also effectively identifies interdependent structures among variables and further improves the performance of predictive modeling. The proposed new idea of self-organizing network is generally applicable for predictive modeling in many disciplines that involve a large number of highly-redundant variables.
UR - http://www.scopus.com/inward/record.url?scp=84952762169&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84952762169&partnerID=8YFLogxK
U2 - 10.1109/CoASE.2015.7294243
DO - 10.1109/CoASE.2015.7294243
M3 - Conference contribution
AN - SCOPUS:84952762169
T3 - IEEE International Conference on Automation Science and Engineering
SP - 1084
EP - 1089
BT - 2015 IEEE Conference on Automation Science and Engineering
PB - IEEE Computer Society
T2 - 11th IEEE International Conference on Automation Science and Engineering, CASE 2015
Y2 - 24 August 2015 through 28 August 2015
ER -