Enhancing Parallelism in Commercial PIM DRAM with LUT-Based Design

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Commercial Processing-in-Memory (PIM) devices, such as UPMEM PIM-DIMM, Samsung HBM-PIM, and SK Hynix AiM, integrate computation within memory modules to alleviate data transfer bottlenecks. However, area density constraints often prevent compute units from being deployed uniformly across all dies, leaving some dies without compute capability. To take advantage of these dies, we propose a Multi-Mode PIM design that stores precomputed results in look-up tables (LUTs) on the non-compute dies. Our approach achieves high parallelism for low-precision computations using otherwise idle dies, while maintaining minimal area overhead. In our evaluation, the Multi-Mode PIM achieves up to 19.3× speedup over HBM-PIM and 26.1× over BLIMP for low-precision workloads, and up to 2.12× and 3.98×, respectively, for mixed-precision workloads.
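The general idea of replacing low-precision arithmetic with stored precomputed results can be illustrated with a minimal Python sketch. This is not the paper's Multi-Mode PIM design: the 4-bit unsigned operand width, the table layout, and the names `build_mul_lut`, `lut_mul`, and `lut_dot` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from itertools import product

BITS = 4  # assumed operand width; the abstract only says "low-precision"


def build_mul_lut(bits=BITS):
    """Precompute a*b for every pair of unsigned `bits`-wide operands.

    Entries are stored row-major, so the product of (a, b) sits at
    index a * 2**bits + b.
    """
    size = 1 << bits
    return [a * b for a, b in product(range(size), repeat=2)]


def lut_mul(lut, a, b, bits=BITS):
    """Multiply by a single table read instead of an arithmetic unit."""
    return lut[(a << bits) | b]


def lut_dot(lut, xs, ws, bits=BITS):
    """Dot product whose multiplications are all table lookups.

    Each lookup is independent, which is the kind of work that could
    in principle be spread across many memory arrays in parallel.
    """
    return sum(lut_mul(lut, x, w, bits) for x, w in zip(xs, ws))


lut = build_mul_lut()
result = lut_dot(lut, [1, 2, 3], [4, 5, 6])
```

With 4-bit operands the table holds 2^8 = 256 entries. The table size grows exponentially with operand width, which is one reason a lookup-based approach is usually considered only for low-precision operands.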

Original language: English (US)
Title of host publication: GLSVLSI 2025 - Proceedings of the Great Lakes Symposium on VLSI 2025
Publisher: Association for Computing Machinery
Pages: 399-400
Number of pages: 2
ISBN (Electronic): 9798400714962
DOIs
State: Published - Jun 29 2025
Event: 35th Edition of the Great Lakes Symposium on VLSI 2025, GLSVLSI 2025 - New Orleans, United States
Duration: Jun 30 2025 to Jul 2 2025

Publication series

Name: Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI

Conference

Conference: 35th Edition of the Great Lakes Symposium on VLSI 2025, GLSVLSI 2025
Country/Territory: United States
City: New Orleans
Period: 6/30/25 to 7/2/25

All Science Journal Classification (ASJC) codes

  • General Engineering
