Multivariate stream data classification using simple text classifiers

Sungbo Seo, Jaewoo Kang, Dongwon Lee, Keun Ho Ryu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Scopus citation

Abstract

We introduce a classification framework for continuous multivariate stream data. The proposed approach works in two steps. In the preprocessing step, it takes a sliding window of multivariate stream data as input and discretizes the data in the window into a string of symbols that characterize the signal changes. In the classification step, it applies a simple text classification algorithm to the discretized data in the window. We evaluated both supervised and unsupervised classification algorithms: for the supervised case, we tested a Naïve Bayes model and SVM; for the unsupervised case, we tested Jaccard, TF-IDF, Jaro, and Jaro-Winkler. In our experiments, SVM and TF-IDF outperformed the other classification methods. In particular, we observed that classification accuracy improves when the correlation of attributes is considered along with the n-gram tokens of symbols.
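
The two steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the discretization scheme, so everything concrete here is an assumption made for illustration: the three-symbol alphabet (U, D, S for up, down, steady), the threshold `eps`, the attribute-prefixed bigram tokens, and the function names `discretize`, `ngram_tokens`, `jaccard`, and `classify` are all hypothetical. Of the similarity measures the abstract lists, only Jaccard is shown, and the attribute-correlation feature is not modeled.

```python
from typing import Dict, List, Sequence, Set

Window = Sequence[Sequence[float]]  # rows = time steps, columns = attributes


def discretize(window: Window, eps: float = 0.1) -> List[str]:
    """Map each attribute of a window to a string of change symbols.

    U = value rose by more than eps, D = fell by more than eps, S = steady.
    The alphabet and eps are illustrative assumptions, not the paper's scheme.
    """
    n_attrs = len(window[0])
    strings = []
    for a in range(n_attrs):
        symbols = []
        for prev, curr in zip(window, window[1:]):
            delta = curr[a] - prev[a]
            symbols.append("U" if delta > eps else "D" if delta < -eps else "S")
        strings.append("".join(symbols))
    return strings


def ngram_tokens(strings: List[str], n: int = 2) -> Set[str]:
    """Collect n-gram tokens, prefixed by attribute index so that the same
    pattern on different attributes remains distinguishable."""
    return {
        f"a{idx}:{s[i:i + n]}"
        for idx, s in enumerate(strings)
        for i in range(len(s) - n + 1)
    }


def jaccard(x: Set[str], y: Set[str]) -> float:
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)


def classify(window: Window, references: Dict[str, Window],
             n: int = 2, eps: float = 0.1) -> str:
    """Return the label of the reference window most similar to `window`."""
    tokens = ngram_tokens(discretize(window, eps), n)
    return max(
        references,
        key=lambda label: jaccard(
            tokens, ngram_tokens(discretize(references[label], eps), n)
        ),
    )


# Toy data: two attributes, four time steps per window.
rising = [[0.0, 5.0], [1.0, 4.0], [2.0, 3.0], [3.0, 2.0]]
flat = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
query = [[10.0, 9.0], [11.5, 8.0], [13.0, 7.5], [14.0, 6.0]]

print(classify(query, {"rising": rising, "flat": flat}))  # prints "rising"
```

Because the symbols encode changes rather than raw values, the query window matches the "rising" reference even though its absolute values differ.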

Original language: English (US)
Title of host publication: Database and Expert Systems Applications - 17th International Conference, DEXA 2006, Proceedings
Publisher: Springer Verlag
Pages: 420-429
Number of pages: 10
ISBN (Print): 3540378715, 9783540378716
State: Published - 2006
Event: 17th International Conference on Database and Expert Systems Applications, DEXA 2006 - Krakow, Poland
Duration: Sep 4 2006 to Sep 8 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4080 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 17th International Conference on Database and Expert Systems Applications, DEXA 2006
Country/Territory: Poland
City: Krakow
Period: 9/4/06 to 9/8/06

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
