TY - GEN
T1 - AdaDHP
T2 - 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
AU - Liu, Han
AU - Li, Changya
AU - Zhang, Xiaotong
AU - Zhang, Feng
AU - Ma, Fenglong
AU - Wang, Wei
AU - Yu, Hong
N1 - Publisher Copyright:
© 2025 Association for Computational Linguistics.
PY - 2025
Y1 - 2025
N2 - With the continuously expanding parameters, efficiently adapting large language models to downstream tasks is crucial in resource-limited conditions. Many parameter-efficient finetuning methods have emerged to address this challenge. However, they lack flexibility, like LoRA requires manually selecting trainable parameters and rank size, (IA)3 can only scale the activations along columns, yielding inferior results due to less precise fine-tuning. To address these issues, we propose a novel method named AdaDHP with fewer parameters and finer granularity, which can adaptively select important parameters for each task. Specifically, we introduce two trainable vectors for each parameter and fine-tune the parameters through Hadamard product along both rows and columns. This significantly reduces the number of trainable parameters, with our parameter count capped at the lower limit of LoRA. Moreover, we design an adaptive parameter selection strategy to select important parameters for downstream tasks dynamically. This allows our method to flexibly remove unimportant parameters for downstream tasks. Finally, we demonstrate the superiority of our method on the T5-base model across 17 NLU tasks and on complex mathematical tasks with the Llama series models.
AB - With the continuously expanding parameters, efficiently adapting large language models to downstream tasks is crucial in resource-limited conditions. Many parameter-efficient finetuning methods have emerged to address this challenge. However, they lack flexibility, like LoRA requires manually selecting trainable parameters and rank size, (IA)3 can only scale the activations along columns, yielding inferior results due to less precise fine-tuning. To address these issues, we propose a novel method named AdaDHP with fewer parameters and finer granularity, which can adaptively select important parameters for each task. Specifically, we introduce two trainable vectors for each parameter and fine-tune the parameters through Hadamard product along both rows and columns. This significantly reduces the number of trainable parameters, with our parameter count capped at the lower limit of LoRA. Moreover, we design an adaptive parameter selection strategy to select important parameters for downstream tasks dynamically. This allows our method to flexibly remove unimportant parameters for downstream tasks. Finally, we demonstrate the superiority of our method on the T5-base model across 17 NLU tasks and on complex mathematical tasks with the Llama series models.
UR - https://www.scopus.com/pages/publications/105021018001
UR - https://www.scopus.com/pages/publications/105021018001#tab=citedBy
U2 - 10.18653/v1/2025.acl-long.467
DO - 10.18653/v1/2025.acl-long.467
M3 - Conference contribution
AN - SCOPUS:105021018001
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 9492
EP - 9504
BT - Long Papers
A2 - Che, Wanxiang
A2 - Nabende, Joyce
A2 - Shutova, Ekaterina
A2 - Pilehvar, Mohammad Taher
PB - Association for Computational Linguistics (ACL)
Y2 - 27 July 2025 through 1 August 2025
ER -