PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Deep Learning Recommendation Models (DLRMs) have become increasingly popular and prevalent in today's datacenters, consuming most of the AI inference cycles. The performance of DLRMs is heavily influenced by available bandwidth due to their large vector sizes in embedding tables and concurrent accesses. To achieve substantial improvements over existing solutions, novel approaches towards DLRM optimization are needed, especially in the context of emerging interconnect technologies like CXL. This study explores CXL-enabled systems, implementing a process-in-fabric-switch (PIFS) solution to accelerate DLRMs while optimizing their memory and bandwidth scalability. We present an in-depth characterization of industry-scale DLRM workloads running on CXL-ready systems, identifying the predominant bottlenecks in existing CXL systems. We therefore propose PIFS-Rec, a PIFS-based scheme that implements near-data processing through downstream ports of the fabric switch. PIFS-Rec achieves a latency that is 3.89x lower than Pond, an industry-standard CXL-based system, and also outperforms BEACON, a state-of-the-art scheme, by 2.03x.

Original language: English (US)
Title of host publication: Proceedings - 2024 57th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2024
Publisher: IEEE Computer Society
Pages: 612-626
Number of pages: 15
ISBN (Electronic): 9798350350579
DOIs
State: Published - 2024
Event: 57th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2024 - Austin, United States
Duration: Nov 2 2024 to Nov 6 2024

Publication series

Name: Proceedings of the Annual International Symposium on Microarchitecture, MICRO
ISSN (Print): 1072-4451

Conference

Conference: 57th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2024
Country/Territory: United States
City: Austin
Period: 11/2/24 to 11/6/24

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
