Privacy preserving k-means clustering with chaotic distortion

Jie Li, Yong Xu, Chao Hsien Chu, Yunfeng Wang

Research output: Contribution to journal › Conference article › peer-review

Abstract

Randomized data distortion is a popular method for masking data to preserve privacy. However, its appropriateness has been questioned because it can potentially disclose the original data. In this paper, a chaotic system, with its characteristic sensitivity to initial conditions and unpredictability, is advocated for distorting original data that contain sensitive information in privacy-preserving k-means clustering. A chaotic distortion procedure is proposed, and three performance metrics specific to k-means clustering are developed. We use a large-scale experiment (with 4 real-world data sets and 40 corresponding reproduced data sets) to evaluate its performance. Our study shows that the proposed approach is effective: it not only protects individual privacy but also maintains the original information about the cluster centers.
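The general idea of chaotic distortion can be illustrated with a minimal Python sketch. The abstract does not say which chaotic system, distortion procedure, or metrics the paper uses, so everything here is an assumption made for illustration: the logistic map as the chaotic source, additive zero-centred noise as the distortion, a plain Lloyd-style k-means, and the helper names `logistic_map_sequence`, `chaotic_distort`, `kmeans`, and `center_shift`. It is not the authors' method.

```python
import math
import random


def logistic_map_sequence(x0, n, r=4.0, burn_in=100):
    """Return n values in [0, 1] from the logistic map x -> r*x*(1-x).

    ASSUMPTION: the logistic map stands in for the (unspecified) chaos system.
    x0 acts as a secret seed; burn_in discards the initial transient.
    """
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    seq = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        seq.append(x)
    return seq


def chaotic_distort(data, x0=0.3141, scale=1.0):
    """Add chaotic noise in [-scale/2, scale/2] to every value (hypothetical scheme)."""
    n_rows, n_cols = len(data), len(data[0])
    noise = logistic_map_sequence(x0, n_rows * n_cols)
    return [
        [data[i][j] + scale * (noise[i * n_cols + j] - 0.5) for j in range(n_cols)]
        for i in range(n_rows)
    ]


def kmeans(points, k, iters=50):
    """Plain Lloyd-style k-means with deterministic, evenly spaced initial centers."""
    n = len(points)
    centers = [list(points[i * n // k]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep it if the group is empty.
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[c]
            for c, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def center_shift(centers_a, centers_b):
    """Largest Euclidean distance between centers paired by index (illustrative metric)."""
    return max(math.dist(a, b) for a, b in zip(centers_a, centers_b))


if __name__ == "__main__":
    rng = random.Random(0)
    # Two synthetic, well-separated groups of records (not data from the paper).
    data = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
    data += [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(100)]
    masked = chaotic_distort(data)
    print("center shift:", center_shift(kmeans(data, 2), kmeans(masked, 2)))
```

In this sketch each individual record is perturbed, while the noise is centred on zero so that cluster means move little. The seed `x0` plays the role of the initial condition: a tiny change to it yields a very different noise sequence after the burn-in period.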

Original language: English (US)
Pages (from-to): 61-67
Number of pages: 7
Journal: Proceedings of the International Conference on Electronic Business (ICEB)
State: Published - 2007
Event: 7th International Conference on Electronic Business, ICEB 2007 - Taipei, Taiwan, Province of China
Duration: Dec 2 2007 → Dec 6 2007

All Science Journal Classification (ASJC) codes

  • Business, Management and Accounting(all)
  • Computer Science(all)
