Large-Scale Datastreams Surveillance via Pattern-Oriented-Sampling

Haojie Ren, Changliang Zou, Nan Chen, Runze Li

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

Monitoring large-scale datastreams with limited resources has become increasingly important for real-time detection of abnormal activities in many applications. Despite the availability of large datasets, the challenges of designing an efficient change-detection procedure when clustering or spatial patterns exist are not yet well addressed. In this article, a design-adaptive testing procedure is developed for the setting in which only a limited number of streaming observations can be accessed at each time point. We derive an optimal sampling strategy, the pattern-oriented-sampling, with which the proposed test possesses asymptotically and locally best power under alternatives. Then, a sequential change-detection procedure is proposed by integrating this test with the generalized likelihood ratio approach. By dynamically estimating the optimal sampling design, the proposed procedure improves sensitivity in detecting clustered changes compared with existing procedures. Its advantages are demonstrated in numerical simulations and a real data example. Ignoring the neighboring information in spatially structured data tends to diminish the effectiveness of traditional detection procedures. Supplementary materials for this article are available online.

Original language: English (US)
Pages (from-to): 794-808
Number of pages: 15
Journal: Journal of the American Statistical Association
Volume: 117
Issue number: 538
State: Published - 2022

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
