Enhancing LLM Capabilities Beyond Scaling Up

Wenpeng Yin, Muhao Chen, Rui Zhang, Ben Zhou, Fei Wang, Dan Roth

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

General-purpose large language models (LLMs) are progressively expanding both in scale and in access to non-public training data, which has led to notable progress on a variety of AI problems. Nevertheless, two questions arise: (i) Is scaling up the sole avenue for extending the capabilities of LLMs? (ii) Instead of developing general-purpose LLMs, how can we endow LLMs with specific knowledge? This tutorial targets researchers and practitioners interested in extending the capabilities of LLMs beyond scaling up. To this end, we will discuss several lines of research in that direction, including: (i) optimizing input prompts to fully exploit the potential of LLMs, (ii) enabling LLMs to self-improve their responses through various feedback signals, (iii) updating or editing the internal knowledge of LLMs when necessary, (iv) leveraging incidental structural supervision from target tasks, and (v) defending against potential attacks and threats from malicious users. Finally, we will conclude the tutorial by outlining directions for further investigation.
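Topic (ii) above refers to the draft–feedback–revise pattern. As a minimal sketch of that loop, the toy code below uses stub `draft`, `feedback`, and `revise` functions in place of a real LLM and feedback signal; all function names here are illustrative assumptions, not part of the tutorial's materials.

```python
# Toy sketch of the self-improvement loop from topic (ii): an LLM drafts
# a response, a feedback signal scores it, and the draft is revised until
# the score is acceptable. The three helpers are stand-in stubs.

def draft(prompt: str) -> str:
    """Stub 'LLM': returns a deliberately rough first answer."""
    return prompt.strip().lower()

def feedback(response: str) -> float:
    """Stub feedback signal: rewards responses that end with a period."""
    return 1.0 if response.endswith(".") else 0.0

def revise(response: str) -> str:
    """Stub revision step: applies the simplest possible fix."""
    return response if response.endswith(".") else response + "."

def self_improve(prompt: str, max_rounds: int = 3) -> str:
    """Iterate draft -> score -> revise until the feedback is satisfied."""
    response = draft(prompt)
    for _ in range(max_rounds):
        if feedback(response) >= 1.0:  # good enough, stop early
            break
        response = revise(response)
    return response

print(self_improve("Explain LLMs"))  # -> explain llms.
```

In practice the feedback signal could be a learned reward model, a tool (compiler, retriever), or the LLM critiquing itself; the loop structure stays the same.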

Original language: English (US)
Title of host publication: EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Tutorial Abstracts
Editors: Junyi Jessy Li, Fei Liu
Publisher: Association for Computational Linguistics (ACL)
Pages: 1-10
Number of pages: 10
ISBN (Electronic): 9798891761698
DOIs
State: Published - 2024
Event: 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024 - Hybrid, Miami, United States
Duration: Nov 12, 2024 to Nov 16, 2024

Publication series

Name: EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Tutorial Abstracts

Conference

Conference: 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024
Country/Territory: United States
City: Hybrid, Miami
Period: 11/12/24 to 11/16/24

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Information Systems
  • Linguistics and Language
  • Computational Theory and Mathematics

