Abstract
Deep learning based on U-Net has become the standard approach for medical image segmentation. Since its introduction, many improved U-Net architectures have been proposed to overcome the challenging nature of biomedical image datasets. MultiResUNet was developed to address the variation in scale in medical images. Despite its high accuracy, it is still based on a Convolutional Neural Network (CNN), which struggles to capture long-range dependencies. Vision Transformers (ViTs), by contrast, have achieved better accuracy owing to their ability to capture global context. TransUNet, one of the first Transformer-CNN hybrid models for medical image segmentation, combined the two and achieved improved segmentation accuracy. Despite these developments, medical image segmentation remains a challenging task. To this end, we propose an enhanced network architecture that fuses a Transformer with MultiResUNet, improving segmentation accuracy through both the network architecture and the model training procedure. The proposed method comprises a deeper network with two Transformers. Furthermore, deep supervision using the decoder's side-output channel is introduced. Three datasets from different modalities were used to evaluate the proposed architecture: one CT scan dataset and two ultrasound datasets. For the CT scan dataset, training was conducted using the Sharpness-Aware Minimization (SAM) optimizer owing to its robustness to noisy labels. The experimental results demonstrate that the proposed model consistently improved prediction performance across the different datasets.
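The Sharpness-Aware Minimization (SAM) step mentioned above can be sketched in a few lines: SAM first ascends to the locally worst-case weights within a small L2 ball of radius ρ, then applies the descent update using the gradient computed at that perturbed point, which biases training toward flat minima. The toy quadratic loss, learning rate, and ρ below are illustrative assumptions for demonstration only; the paper applies SAM to a deep segmentation network, not to this toy problem.

```python
import numpy as np

def loss(w):
    # Toy quadratic loss used only to illustrate the update rule.
    return float(np.sum(w ** 2))

def grad(w):
    return 2.0 * w

def sam_step(w, lr=0.1, rho=0.05):
    g = grad(w)
    # Ascent step: perturb weights toward the worst-case direction
    # within an L2 ball of radius rho.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    g_sharp = grad(w + eps)   # gradient evaluated at the perturbed weights
    return w - lr * g_sharp   # descent step uses the "sharpened" gradient

w = np.array([3.0, -2.0])
for _ in range(50):
    w = sam_step(w)
```

In a deep-learning framework the same two-pass structure applies: one forward/backward pass to compute the perturbation, a second at the perturbed weights to obtain the update gradient, making each SAM step roughly twice the cost of a plain SGD step.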
| Original language | English (US) |
|---|---|
| Article number | 108056 |
| Journal | Biomedical Signal Processing and Control |
| Volume | 110 |
| DOIs | |
| State | Published - Dec 2025 |
All Science Journal Classification (ASJC) codes
- Signal Processing
- Biomedical Engineering
- Health Informatics
Fingerprint
Dive into the research topics of 'Hybrid MultiResUNet with transformers for medical image segmentation'. Together they form a unique fingerprint.