TY - GEN
T1 - Multivariate stream data classification using simple text classifiers
AU - Seo, Sungbo
AU - Kang, Jaewoo
AU - Lee, Dongwon
AU - Ryu, Keun Ho
PY - 2006
Y1 - 2006
AB - We introduce a classification framework for continuous multivariate stream data. The proposed approach works in two steps. In the preprocessing step, it takes as input a sliding window of multivariate stream data and discretizes the data in the window into a string of symbols that characterize the signal changes. In the classification step, it uses a simple text classification algorithm to classify the discretized data in the window. We evaluated both supervised and unsupervised classification algorithms. For supervised classification, we tested the Naïve Bayes model and SVM; for unsupervised classification, we tested Jaccard, TF-IDF, Jaro, and Jaro-Winkler. In our experiments, SVM and TF-IDF outperformed the other classification methods. In particular, we observed that classification accuracy improves when the correlation among attributes is considered along with the n-gram tokens of symbols.
UR - http://www.scopus.com/inward/record.url?scp=33749408767&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33749408767&partnerID=8YFLogxK
U2 - 10.1007/11827405_41
DO - 10.1007/11827405_41
M3 - Conference contribution
AN - SCOPUS:33749408767
SN - 3540378715
SN - 9783540378716
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 420
EP - 429
BT - Database and Expert Systems Applications - 17th International Conference, DEXA 2006, Proceedings
PB - Springer Verlag
T2 - 17th International Conference on Database and Expert Systems Applications, DEXA 2006
Y2 - 4 September 2006 through 8 September 2006
ER -