TY - JOUR
T1 - CiM3D
T2 - Comparator-in-memory designs using monolithic 3-D technology for accelerating data-intensive applications
AU - Ramanathan, Akshay Krishna
AU - Rangachar, Srivatsa Srinivasa
AU - Govindarajan, Hariram Thirucherai
AU - Hung, Je Min
AU - Lee, Chun Ying
AU - Xue, Cheng Xin
AU - Huang, Sheng Po
AU - Hsueh, Fu Kuo
AU - Shen, Chang Hong
AU - Shieh, Jia Min
AU - Yeh, Wen Kuan
AU - Ho, Mon Shu
AU - Sampson, Jack
AU - Chang, Meng Fan
AU - Narayanan, Vijaykrishnan
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2021/6
Y1 - 2021/6
N2 - The compare operation is widely used in many applications, from fundamental sorting to primitive operations in the database and AI systems. We present SRAM-based 3-D-CAM circuit designs using a monolithic 3-D (M3D) integration process for realizing beyond-Boolean in-memory compare operation without any area overheads. We also fabricated a processing-in-memory (PiM) macro with the same 3-D-CAM circuit using M3D for performing massively parallel compare operations used in the database, machine learning, and scientific applications. We show various system designs with the 3-D-CAM supporting operations, such as data filtering, sorting, and sparse matrix-matrix multiplication (SpGEMM). Our systems exhibit up to 272×, 200×, and 226× speedups and 151×, 37×, and 156× energy savings compared to systems using near memory compute for the data filtering, sorting, and SpGEMM applications, respectively.
AB - The compare operation is widely used in many applications, from fundamental sorting to primitive operations in the database and AI systems. We present SRAM-based 3-D-CAM circuit designs using a monolithic 3-D (M3D) integration process for realizing beyond-Boolean in-memory compare operation without any area overheads. We also fabricated a processing-in-memory (PiM) macro with the same 3-D-CAM circuit using M3D for performing massively parallel compare operations used in the database, machine learning, and scientific applications. We show various system designs with the 3-D-CAM supporting operations, such as data filtering, sorting, and sparse matrix-matrix multiplication (SpGEMM). Our systems exhibit up to 272×, 200×, and 226× speedups and 151×, 37×, and 156× energy savings compared to systems using near memory compute for the data filtering, sorting, and SpGEMM applications, respectively.
UR - http://www.scopus.com/inward/record.url?scp=85111001793&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85111001793&partnerID=8YFLogxK
U2 - 10.1109/JXCDC.2021.3087745
DO - 10.1109/JXCDC.2021.3087745
M3 - Article
AN - SCOPUS:85111001793
SN - 2329-9231
VL - 7
SP - 79
EP - 87
JO - IEEE Journal on Exploratory Solid-State Computational Devices and Circuits
JF - IEEE Journal on Exploratory Solid-State Computational Devices and Circuits
IS - 1
ER -