Abstract
We propose a data mining-constraint satisfaction optimization problem (DM-CSOP) in which the goal is to maximize the number of correct classifications at the lowest possible information acquisition cost. We show that the problem can be formulated as a set of binary-variable knapsack optimization problems that are solved sequentially. We propose a heuristic hybrid procedure that combines simulated annealing with a gradient-descent artificial neural network (ANN) to solve the DM-CSOP. Using a real-world heart disease data set, we show that the proposed hybrid procedure provides a low-cost, high-quality solution when compared to a traditional ANN classification approach. The massive proliferation of very large databases in organizations makes it necessary to design cost-effective and efficient data mining systems. This paper proposes a data mining constraint satisfaction optimization problem that provides a high-quality, cost-effective solution for a binary classification problem.
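The trade-off described in the abstract can be illustrated with a small sketch. It is not the paper's procedure: it assumes a single knapsack-style budget constraint instead of the sequence of knapsack problems, and it swaps the gradient-descent ANN for a simple nearest-centroid rule so that the example stays self-contained. The names `correct_count`, `anneal`, `costs`, and `budget` are hypothetical, and the training data is assumed to contain both classes.

```python
import math
import random


def correct_count(rows, labels, mask):
    """Count correct predictions of a nearest-centroid rule (a stand-in
    for the ANN) that only sees the attributes selected by `mask`."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0  # no attribute acquired, so nothing can be classified
    centroids = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(r[j] for r in members) / len(members) for j in idx]
    correct = 0
    for r, y in zip(rows, labels):
        dist = {
            cls: sum((r[j] - m) ** 2 for j, m in zip(idx, centroids[cls]))
            for cls in (0, 1)
        }
        pred = 0 if dist[0] <= dist[1] else 1
        correct += int(pred == y)
    return correct


def total_cost(costs, mask):
    """Acquisition cost of the attributes selected by the binary mask."""
    return sum(c for c, bit in zip(costs, mask) if bit)


def anneal(rows, labels, costs, budget, steps=2000, t0=5.0, cooling=0.995, seed=0):
    """Simulated annealing over binary attribute-selection vectors.

    Maximizes the number of correct classifications subject to
    total_cost <= budget; ties are broken in favor of the cheaper mask.
    """
    rng = random.Random(seed)
    n = len(costs)
    current = [0] * n
    cur_score = correct_count(rows, labels, current)
    best, best_score, best_cost = current[:], cur_score, 0.0
    t = t0
    for _ in range(steps):
        cand = current[:]
        cand[rng.randrange(n)] ^= 1  # flip one attribute in or out
        cand_cost = total_cost(costs, cand)
        if cand_cost <= budget:  # infeasible candidates are rejected outright
            score = correct_count(rows, labels, cand)
            delta = score - cur_score
            # Accept improvements always, worse moves with Boltzmann probability.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, cur_score = cand, score
                if score > best_score or (score == best_score and cand_cost < best_cost):
                    best, best_score, best_cost = cand[:], score, cand_cost
        t *= cooling  # geometric cooling schedule
    return best, best_score
```

Calling `anneal(rows, labels, costs, budget)` returns the best feasible attribute mask found and its number of correct classifications. Rejecting over-budget candidates is only one way to handle the constraint; a penalty term in the objective would be an equally plausible choice.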
| Original language | English (US) |
|---|---|
| Pages (from-to) | 3124-3135 |
| Number of pages | 12 |
| Journal | Computers and Operations Research |
| Volume | 33 |
| Issue number | 11 |
| State | Published - Nov 2006 |
All Science Journal Classification (ASJC) codes
- General Computer Science
- Modeling and Simulation
- Management Science and Operations Research
Title
A data mining-constraint satisfaction optimization problem for cost effective classification