TY - JOUR
T1 - ICARE
T2 - cross-domain text classification with incremental class-aware representation and distillation learning
AU - Duong, Son
AU - Tran, Truong X.
AU - Tran, Hai Anh
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
PY - 2025
Y1 - 2025
N2 - The rise of powerful, fine-tuned large language models (LLMs) necessitates robust natural language inference (NLI) systems. Such systems are ideally trained across a variety of tasks spanning diverse concepts, such as sentiment analysis, news classification, and coreference resolution. However, this continual, multi-task approach can lead to catastrophic forgetting, where previously learned tasks are lost, especially in domains lacking specific labels. Rehearsing labeled past examples can further exacerbate the issue by causing overfitting, which hinders the system’s ability to classify new categories. To tackle these challenges, this research proposes ICARE, a novel framework with two key components: an instance selection module and a knowledge distillation method. The former, based on representation learning, carefully selects the training examples presented to the NLI system. The latter transfers knowledge between tasks at both the feature and logit levels, leveraging pruning techniques and optimized objective functions to preserve essential feature-level and decision-making (logit-level) knowledge, thereby mitigating catastrophic forgetting while adapting to new patterns. Extensive evaluations on benchmark text classification datasets demonstrate ICARE’s effectiveness: the framework achieves a significant improvement in target accuracy and a reduction in the “forgetting metric”. These results confirm ICARE’s ability to preserve and leverage learned knowledge across diverse tasks, ultimately enhancing the adaptability and performance of NLI systems. The source code is available at https://github.com/CongSon01/ICARE_NLP.git.
AB - The rise of powerful, fine-tuned large language models (LLMs) necessitates robust natural language inference (NLI) systems. Such systems are ideally trained across a variety of tasks spanning diverse concepts, such as sentiment analysis, news classification, and coreference resolution. However, this continual, multi-task approach can lead to catastrophic forgetting, where previously learned tasks are lost, especially in domains lacking specific labels. Rehearsing labeled past examples can further exacerbate the issue by causing overfitting, which hinders the system’s ability to classify new categories. To tackle these challenges, this research proposes ICARE, a novel framework with two key components: an instance selection module and a knowledge distillation method. The former, based on representation learning, carefully selects the training examples presented to the NLI system. The latter transfers knowledge between tasks at both the feature and logit levels, leveraging pruning techniques and optimized objective functions to preserve essential feature-level and decision-making (logit-level) knowledge, thereby mitigating catastrophic forgetting while adapting to new patterns. Extensive evaluations on benchmark text classification datasets demonstrate ICARE’s effectiveness: the framework achieves a significant improvement in target accuracy and a reduction in the “forgetting metric”. These results confirm ICARE’s ability to preserve and leverage learned knowledge across diverse tasks, ultimately enhancing the adaptability and performance of NLI systems. The source code is available at https://github.com/CongSon01/ICARE_NLP.git.
UR - http://www.scopus.com/inward/record.url?scp=105000438552&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105000438552&partnerID=8YFLogxK
U2 - 10.1007/s13042-025-02572-6
DO - 10.1007/s13042-025-02572-6
M3 - Article
AN - SCOPUS:105000438552
SN - 1868-8071
JO - International Journal of Machine Learning and Cybernetics
JF - International Journal of Machine Learning and Cybernetics
M1 - 110156
ER -