TY - GEN
T1 - Alphabet size selection for symbolization of dynamic data-driven systems
T2 - 2015 American Control Conference, ACC 2015
AU - Sarkar, Soumalya
AU - Chattopadhyay, P.
AU - Ray, Asok
AU - Phoha, Shashi
AU - Levi, Mark
N1 - Publisher Copyright: © 2015 American Automatic Control Council.
PY - 2015/7/28
Y1 - 2015/7/28
AB - Symbolic time series analysis (STSA) is built upon the concept of symbolic dynamics that deals with discretization of dynamical systems in both space and time. The notion of STSA has led to the development of a pattern recognition tool in the paradigm of dynamic data-driven application systems (DDDAS), where a time series of sensor signals is partitioned to obtain a symbol sequence that, in turn, leads to the construction of probabilistic finite state automata (PFSA). Although modeling of PFSA from symbol sequences has been widely reported, similar efforts have not been expended to investigate how to find an appropriate alphabet size for partitioning of time series so that the symbol sequences can be optimally generated. This paper addresses this critical issue and proposes an information-theoretic procedure of data partitioning to extract low-dimensional features from time series. The key idea lies in optimal partitioning of the time series via maximization of the mutual information between the input state probability vector and pattern classes. The proposed procedure has been validated by two examples. The first example elucidates the underlying concept of data partitioning for parameter identification in a Duffing system with a sinusoidal input excitation. The second example is built upon time series of chemiluminescence data to predict lean blow-out (LBO) phenomena in a laboratory-scale combustor. Classification performance of data partitioning is analyzed in each of the two examples.
UR - http://www.scopus.com/inward/record.url?scp=84940936378&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84940936378&partnerID=8YFLogxK
U2 - 10.1109/ACC.2015.7172150
DO - 10.1109/ACC.2015.7172150
M3 - Conference contribution
AN - SCOPUS:84940936378
T3 - Proceedings of the American Control Conference
SP - 5194
EP - 5199
BT - ACC 2015 - 2015 American Control Conference
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 1 July 2015 through 3 July 2015
ER -