A Survey on Small Language Models in the Era of Large Language Models: Architecture, Capabilities, and Trustworthiness

  • Fali Wang
  • , Minhua Lin
  • , Yao Ma
  • , Hui Liu
  • , Qi He
  • , Xianfeng Tang
  • , Jiliang Tang
  • , Jian Pei
  • , Suhang Wang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

2 Scopus citations

Abstract

Large language models (LLMs) based on Transformer architecture are powerful but face challenges with deployment, inference latency, and costly fine-tuning. These limitations highlight the emerging potential of small language models (SLMs), which can either replace LLMs through innovative architectures and technologies, or assist them as efficient proxy or reward models. Emerging architectures such as Mamba and xLSTM address the quadratic scaling of inference with window length in Transformers by enabling linear scaling. To maximize SLM performance, test-time compute scaling strategies reduce the performance gap with LLMs by allocating extra compute budget during test time. Beyond standalone usage, SLMs could also assist in LLMs via weak-to-strong learning, proxy tuning, and guarding, fostering secure and efficient LLM deployment. Lastly, the trustworthiness of SLMs remains a critical yet underexplored research area. However, there is a lack of tutorials on cutting-edge SLM technologies, prompting us to conduct one.

Original languageEnglish (US)
Title of host publicationKDD 2025 - Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PublisherAssociation for Computing Machinery
Pages6173-6183
Number of pages11
ISBN (Electronic)9798400714542
DOIs
StatePublished - Aug 3 2025
Event31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025 - Toronto, Canada
Duration: Aug 3 2025Aug 7 2025

Publication series

NameProceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume2
ISSN (Print)2154-817X

Conference

Conference31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025
Country/TerritoryCanada
CityToronto
Period8/3/258/7/25

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems

Fingerprint

Dive into the research topics of 'A Survey on Small Language Models in the Era of Large Language Models: Architecture, Capabilities, and Trustworthiness'. Together they form a unique fingerprint.

Cite this