Fast-RCM: Fast Tree-Based Unsupervised Rare-Class Mining

Haiqin Weng, Shouling Ji, Changchang Liu, Ting Wang, Qinming He, Jianhai Chen

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


Rare classes are usually hidden in an imbalanced dataset in which the majority of the data examples come from major classes. Rare-class mining (RCM) aims at extracting all the data examples belonging to rare classes. Most existing approaches to RCM require a certain amount of labeled data examples as input. However, they are ineffective in practice, since requesting label information from domain experts is time consuming and labor intensive. Thus, we investigate the unsupervised RCM problem, which to the best of our knowledge is the first such attempt. To this end, we propose an efficient algorithm called Fast-RCM for unsupervised RCM, which has approximately linear time complexity with respect to data size and data dimensionality. Given an unlabeled dataset, Fast-RCM mines the rare classes by first building a rare tree for the input dataset and then extracting data examples of the rare classes based on this rare tree. Compared with existing approaches, which have quadratic or even cubic time complexity, Fast-RCM is much faster and can be extended to large-scale datasets. The experimental evaluation on both synthetic and real-world datasets demonstrates that our algorithm can effectively and efficiently extract the rare classes from an unlabeled dataset in the unsupervised setting, and is approximately five times faster than the state-of-the-art methods.

Original language: English (US)
Pages (from-to): 5198-5211
Number of pages: 14
Journal: IEEE Transactions on Cybernetics
Issue number: 10
State: Published - Oct 1 2021

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Information Systems
  • Human-Computer Interaction
  • Computer Science Applications
  • Electrical and Electronic Engineering

