Privacy preserving clustering

Somesh Jha, Luis Kruger, Patrick McDaniel

Research output: Chapter in Book/Report/Conference proceedingConference contribution

127 Scopus citations

Abstract

The freedom and transparency of information flow on the Internet has heightened concerns of privacy. Given a set of data items, clustering algorithms group similar items together. Clustering has many applications, such as customer-behavior analysis, targeted marketing, forensics, and bioinformatics. In this paper, we present the design and analysis of a privacy-preserving k-means clustering algorithm, where only the cluster means at the various steps of the algorithm are revealed to the participating parties. The crucial step in our privacy-preserving k-means is privacy-preserving computation of cluster means. We present two protocols (one based on oblivious polynomial evaluation and the second based on homomorphic encryption) for privacy-preserving computation of cluster means. We have a JAVA implementation of our algorithm. Using our implementation, we have performed a thorough evaluation of our privacy-preserving clustering algorithm on three data sets. Our evaluation demonstrates that privacy-preserving clustering is feasible, i.e., our homomorphic-encryption based algorithm finished clustering a large data set in approximately 66 seconds.

Original languageEnglish (US)
Title of host publicationLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages397-417
Number of pages21
DOIs
StatePublished - 2005
Event10th European Symposium on Research in Computer Security, ESORICS 2005 - Milan, Italy
Duration: Sep 12 2005Sep 14 2005

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume3679 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other10th European Symposium on Research in Computer Security, ESORICS 2005
Country/TerritoryItaly
CityMilan
Period9/12/059/14/05

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Privacy preserving clustering'. Together they form a unique fingerprint.

Cite this