TY - GEN
T1 - Enhancing LLM Capabilities Beyond Scaling Up
AU - Yin, Wenpeng
AU - Chen, Muhao
AU - Zhang, Rui
AU - Zhou, Ben
AU - Wang, Fei
AU - Roth, Dan
N1 - Publisher Copyright:
©2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
N2 - General-purpose large language models (LLMs) are progressively expanding in both scale and access to non-public training data. This has led to notable progress in a variety of AI problems. Nevertheless, two questions remain: i) Is scaling up the sole avenue for extending the capabilities of LLMs? ii) Instead of developing general-purpose LLMs, how can LLMs be endowed with specific knowledge? This tutorial targets researchers and practitioners interested in extending the capabilities of LLMs beyond scaling up. To this end, we will discuss several lines of research that pursue this direction, including: (i) optimizing input prompts to fully exploit LLM potential, (ii) enabling LLMs to self-improve responses through various feedback signals, (iii) updating or editing the internal knowledge of LLMs when necessary, (iv) leveraging incidental structural supervision from target tasks, and (v) defending against potential attacks and threats from malicious users. Finally, we will conclude the tutorial by outlining directions for further investigation.
AB - General-purpose large language models (LLMs) are progressively expanding in both scale and access to non-public training data. This has led to notable progress in a variety of AI problems. Nevertheless, two questions remain: i) Is scaling up the sole avenue for extending the capabilities of LLMs? ii) Instead of developing general-purpose LLMs, how can LLMs be endowed with specific knowledge? This tutorial targets researchers and practitioners interested in extending the capabilities of LLMs beyond scaling up. To this end, we will discuss several lines of research that pursue this direction, including: (i) optimizing input prompts to fully exploit LLM potential, (ii) enabling LLMs to self-improve responses through various feedback signals, (iii) updating or editing the internal knowledge of LLMs when necessary, (iv) leveraging incidental structural supervision from target tasks, and (v) defending against potential attacks and threats from malicious users. Finally, we will conclude the tutorial by outlining directions for further investigation.
UR - http://www.scopus.com/inward/record.url?scp=85217535763&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85217535763&partnerID=8YFLogxK
U2 - 10.18653/v1/2024.emnlp-tutorials.1
DO - 10.18653/v1/2024.emnlp-tutorials.1
M3 - Conference contribution
AN - SCOPUS:85217535763
T3 - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Tutorial Abstracts
SP - 1
EP - 10
BT - EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Tutorial Abstracts
A2 - Li, Junyi Jessy
A2 - Liu, Fei
PB - Association for Computational Linguistics (ACL)
T2 - 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024
Y2 - 12 November 2024 through 16 November 2024
ER -