A performance analysis of dimensionality reduction algorithms in machine learning models for cancer prediction

Research output: Contribution to journal › Article › peer-review

Abstract

Developments in technology facilitate the use of machine learning methods in medical fields. In cancer research, combining machine learning tools with gene expression data has proven effective at identifying cancer patients. However, processing such high-dimensional and complex data remains a challenge. This paper analyzes the impact of different dimensionality reduction techniques on machine learning models used for cancer prediction. Dimensionality reduction techniques such as principal component analysis (PCA), kernel PCA, and an autoencoder were used to reduce the dimensionality of the RNA sequencing data. Two machine learning classifiers, a neural network and a support vector machine, were trained and tested on the original, dimensionally reduced, and cancer-relevant data. Classifier performance was assessed with metrics such as accuracy, precision, recall, F-measure, the receiver operating characteristic curve, and the area under the curve. The results showed that dimensionality reduction improves the performance of the classifiers. Additionally, the autoencoder performed better than PCA and kernel PCA. These findings indicate the potential of dimensionality reduction to improve the results of machine learning classification models on high-dimensional data.
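The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The function names `binary_metrics` and `roc_auc` are hypothetical, and the toy labels and scores below are invented for illustration only; they are not the paper's data or results, and this is not the authors' implementation.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F-measure for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive sample scores higher than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (invented values): 1 = cancer, 0 = normal.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]   # classifier confidence
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # threshold at 0.5

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall and F-measure depend on the chosen decision threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is one reason the two kinds of metric are commonly reported together.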

Original language: English (US)
Article number: 100125
Journal: Healthcare Analytics
Volume: 3
State: Published - Nov 2023

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

All Science Journal Classification (ASJC) codes

  • Analytical Chemistry
  • Health Informatics
