
Deep Clustering for Mixed-type Data with Frequency Encoding and Doubly Weighted Cross Entropy Loss

  • Deogho Choi
  • Daniel Chae
  • Woo Yeon Kim
  • Jihong Kim
  • Janghoon Yang
  • Jitae Shin

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Clustering is an unsupervised learning task that groups data into distinct classes according to the similarity between samples. Most previous research has focused on improving K-prototypes or on learning suitable numerical representations of categorical features with an autoencoder. In this research, we show that applying frequency encoding to categorical features can be sufficiently effective. Furthermore, we propose a doubly weighted cross entropy loss (DW-CE loss) to find optimal cluster centroids by training a fully connected layer. Experiments on two mixed-type datasets from the UCI repository, credit approval and heart disease, show that the proposed clustering with frequency encoding and DW-CE loss performs better than existing state-of-the-art methods in most cases.
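
The abstract does not spell out how frequency encoding is computed. As a minimal sketch, assuming the common formulation in which each category is replaced by its relative frequency within the column, the idea might look like the following. The function name `frequency_encode` and the sample column are hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def frequency_encode(values):
    """Replace each category with its relative frequency in `values`.

    Assumption: relative frequency (count / total) is used; the paper
    may use raw counts or another normalization.
    """
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


# Hypothetical categorical column, for illustration only.
column = ["a", "b", "a", "c", "a", "b"]
encoded = frequency_encode(column)
```

Under this assumption every categorical feature becomes a single numeric column, so it can be concatenated with the numerical features before clustering. Categories that happen to share the same frequency become indistinguishable, which is a known limitation of this encoding.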

Original language: English (US)
Title of host publication: ITC-CSCC 2022 - 37th International Technical Conference on Circuits/Systems, Computers and Communications
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 141-144
Number of pages: 4
ISBN (Electronic): 9781665485593
State: Published - 2022
Event: 37th International Technical Conference on Circuits/Systems, Computers and Communications, ITC-CSCC 2022 - Phuket, Thailand
Duration: Jul 5 2022 to Jul 8 2022

Publication series

Name: ITC-CSCC 2022 - 37th International Technical Conference on Circuits/Systems, Computers and Communications

Conference

Conference: 37th International Technical Conference on Circuits/Systems, Computers and Communications, ITC-CSCC 2022
Country/Territory: Thailand
City: Phuket
Period: 7/5/22 to 7/8/22

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Electrical and Electronic Engineering
  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
