TY - GEN
T1 - MTMLD-AWSR
T2 - 2025 IEEE Cloud Summit, Cloud-Summit 2025
AU - Nguyen, Phuong Thao
AU - Tran, Hai Anh
AU - Nguyen, Huy Hieu
AU - Hoang, Nam Thang
AU - Mandal, Tulika
AU - Annareddy, Ruthvik
AU - Choudhary, Prithvi
AU - Tran, Truong X.
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - In practical Edge-Cloud applications, deep learning models must learn from new data and labels continuously generated by IoT devices. As a result, a model tends to struggle to retain previous knowledge while absorbing new knowledge, especially information about newly introduced labels in a classification task; this setting is known as Class Incremental Learning (CIL). Among approaches to CIL, Knowledge Distillation (KD) has emerged as a promising method and has achieved notable success. However, KD alone may not effectively capture long-term dependencies and diverse feature representations, leading to suboptimal performance. This paper introduces Multi-Teacher Multi-Level Distillation (MTMLD) for CIL, using multiple teachers to improve knowledge transfer at several levels: feature, logit, and attention. To ensure efficiency and scalability, we additionally propose two novel modules: (1) a Student Reusability Module, which allows the student model to serve as a teacher in subsequent tasks, reducing redundancy, and (2) a Teacher Management Module, which caps the number of teachers to avoid unnecessary computational overhead. Extensive experiments on multiple datasets show that the proposed framework significantly outperforms existing methods in both accuracy and reduction of forgetting, with improvements of up to 2.15% over state-of-the-art algorithms on the CIFAR-100 dataset.
UR - https://www.scopus.com/pages/publications/105015434887
U2 - 10.1109/Cloud-Summit64795.2025.00022
DO - 10.1109/Cloud-Summit64795.2025.00022
M3 - Conference contribution
AN - SCOPUS:105015434887
T3 - Proceedings - 2025 IEEE Cloud Summit, Cloud-Summit 2025
SP - 95
EP - 100
BT - Proceedings - 2025 IEEE Cloud Summit, Cloud-Summit 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 26 June 2025 through 27 June 2025
ER -
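
The abstract above sketches the core mechanism: distilling from several teachers at the feature, logit, and attention levels at once. The paper's own implementation is not reproduced in this record; as a rough, hypothetical illustration of what such a multi-teacher, multi-level loss can look like, the following PyTorch sketch combines an L2 feature term, a normalized attention-map term, and a temperature-softened KL logit term per teacher. All names, level choices, and weightings (w_feat, w_att, w_logit, T) are assumptions made for illustration, not the authors' MTMLD code.

import torch
import torch.nn.functional as F

def multi_teacher_multilevel_loss(student, teachers, T=2.0,
                                  w_feat=1.0, w_att=1.0, w_logit=1.0):
    # student / each teacher: dict with 'features', 'attention', 'logits'
    # (batched tensors of matching shapes). Hypothetical interface.
    loss = student['logits'].new_zeros(())
    for t in teachers:
        # Feature level: match intermediate representations with L2.
        loss = loss + w_feat * F.mse_loss(student['features'], t['features'])
        # Attention level: compare L2-normalized, flattened attention maps.
        s_att = F.normalize(student['attention'].flatten(1), dim=1)
        t_att = F.normalize(t['attention'].flatten(1), dim=1)
        loss = loss + w_att * F.mse_loss(s_att, t_att)
        # Logit level: classic temperature-softened distillation,
        # scaled by T^2 to keep gradient magnitudes comparable.
        loss = loss + w_logit * (T * T) * F.kl_div(
            F.log_softmax(student['logits'] / T, dim=1),
            F.softmax(t['logits'] / T, dim=1),
            reduction='batchmean')
    return loss / max(len(teachers), 1)

Averaging over teachers keeps the loss scale independent of how many teachers are retained, which is one simple way a component like the paper's Teacher Management Module could bound the effect of a growing teacher pool.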