TY - JOUR
T1 - FAST
T2 - A Fully-Concurrent Access SRAM Topology for High Row-Wise Parallelism Applications Based on Dynamic Shift Operations
AU - Chen, Yiming
AU - Fu, Yushen
AU - Lee, Mingyen
AU - George, Sumitha
AU - Liu, Yongpan
AU - Narayanan, Vijaykrishnan
AU - Yang, Huazhong
AU - Li, Xueqing
N1 - Publisher Copyright:
© 2004-2012 IEEE.
PY - 2023/4/1
Y1 - 2023/4/1
N2 - This brief proposes a fully-concurrent access SRAM topology to handle high-concurrency operations on multiple rows in an SRAM array. Such high-concurrency operations are widely seen in both conventional and emerging applications where high parallelism is preferred, e.g., the table update in a database and the parallel feature update in graph computing. The proposed shift-based parallel access and compute architecture is enabled by integrating the shifter function into each SRAM cell, and by creating a datapath that exploits the high parallelism of shift operations in multiple rows. As an example, a 128-row, 16-column shiftable SRAM is designed in 65 nm CMOS. Post-layout SPICE simulations show average improvements of 5.5x in energy efficiency and 27.2x in speed over a conventional digital near-memory computing scheme. In addition, the design has been fabricated, and the measurement results show support for clock frequencies of up to 800 MHz at 1.0 V and 1.2 GHz at 1.2 V.
AB - This brief proposes a fully-concurrent access SRAM topology to handle high-concurrency operations on multiple rows in an SRAM array. Such high-concurrency operations are widely seen in both conventional and emerging applications where high parallelism is preferred, e.g., the table update in a database and the parallel feature update in graph computing. The proposed shift-based parallel access and compute architecture is enabled by integrating the shifter function into each SRAM cell, and by creating a datapath that exploits the high parallelism of shift operations in multiple rows. As an example, a 128-row, 16-column shiftable SRAM is designed in 65 nm CMOS. Post-layout SPICE simulations show average improvements of 5.5x in energy efficiency and 27.2x in speed over a conventional digital near-memory computing scheme. In addition, the design has been fabricated, and the measurement results show support for clock frequencies of up to 800 MHz at 1.0 V and 1.2 GHz at 1.2 V.
UR - https://www.scopus.com/pages/publications/85146240921
UR - https://www.scopus.com/inward/citedby.url?scp=85146240921&partnerID=8YFLogxK
U2 - 10.1109/TCSII.2022.3231589
DO - 10.1109/TCSII.2022.3231589
M3 - Article
AN - SCOPUS:85146240921
SN - 1549-7747
VL - 70
SP - 1605
EP - 1609
JO - IEEE Transactions on Circuits and Systems II: Express Briefs
JF - IEEE Transactions on Circuits and Systems II: Express Briefs
IS - 4
ER -