TY - JOUR
T1 - Empirical evaluation of multi-task learning in deep neural networks for natural language processing
AU - Li, Jianquan
AU - Liu, Xiaokang
AU - Yin, Wenpeng
AU - Yang, Min
AU - Ma, Liqun
AU - Jin, Yaohong
N1 - Publisher Copyright:
© 2020, Springer-Verlag London Ltd., part of Springer Nature.
PY - 2021/5
Y1 - 2021/5
N2 - Multi-task learning (MTL) aims to boost the overall performance of each individual task by leveraging useful information contained in multiple related tasks. It has shown great success in natural language processing (NLP). Currently, a number of MTL architectures and learning mechanisms have been proposed for various NLP tasks, including linguistic hierarchies, orthogonality constraints, adversarial learning, gating mechanisms, and label embedding. However, there has been no systematic, in-depth exploration and comparison of these MTL architectures and learning mechanisms to explain their strong performance. In this paper, we conduct a thorough examination of five typical MTL methods with deep learning architectures on a broad range of representative NLP tasks. Our primary goal is to understand the merits and demerits of existing MTL methods in NLP tasks and thereby devise new hybrid architectures that combine their strengths. Following the empirical evaluation, we offer our insights and conclusions regarding the MTL methods considered.
AB - Multi-task learning (MTL) aims to boost the overall performance of each individual task by leveraging useful information contained in multiple related tasks. It has shown great success in natural language processing (NLP). Currently, a number of MTL architectures and learning mechanisms have been proposed for various NLP tasks, including linguistic hierarchies, orthogonality constraints, adversarial learning, gating mechanisms, and label embedding. However, there has been no systematic, in-depth exploration and comparison of these MTL architectures and learning mechanisms to explain their strong performance. In this paper, we conduct a thorough examination of five typical MTL methods with deep learning architectures on a broad range of representative NLP tasks. Our primary goal is to understand the merits and demerits of existing MTL methods in NLP tasks and thereby devise new hybrid architectures that combine their strengths. Following the empirical evaluation, we offer our insights and conclusions regarding the MTL methods considered.
UR - http://www.scopus.com/inward/record.url?scp=85089081846&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85089081846&partnerID=8YFLogxK
U2 - 10.1007/s00521-020-05268-w
DO - 10.1007/s00521-020-05268-w
M3 - Article
AN - SCOPUS:85089081846
SN - 0941-0643
VL - 33
SP - 4417
EP - 4428
JO - Neural Computing and Applications
JF - Neural Computing and Applications
IS - 9
ER -