TY - GEN
T1 - Fast and Accurate DNN Performance Estimation across Diverse Hardware Platforms
AU - Kakrannaya, Vishwas Vasudeva
AU - Rai, Siddhartha Balakrishna
AU - Sivasubramaniam, Anand
AU - Zhu, Timothy
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Performance modeling is an important tool for many purposes such as designing hardware accelerators, improving scheduling, optimizing system parameters, procuring new hardware, etc. This paper provides a new methodology for constructing performance models for Deep Neural Networks (DNNs), a popular machine learning workload. Prior works require running DNNs on existing hardware, which may not be available, or simulating the computation on futuristic hardware, which is slow and not scalable. We instead take an analytical approach based on analyzing the raw operations within DNN algorithms, which allows us to estimate performance across any hardware, even hardware that is in the process of being designed. Evaluations show our approach is fast and gives a good first-order approximation (±10-15% accuracy) across many DNNs and hardware platforms including GPUs, CPUs, and a futuristic Processing In Memory (PIM) accelerator called BLIMP.
AB - Performance modeling is an important tool for many purposes such as designing hardware accelerators, improving scheduling, optimizing system parameters, procuring new hardware, etc. This paper provides a new methodology for constructing performance models for Deep Neural Networks (DNNs), a popular machine learning workload. Prior works require running DNNs on existing hardware, which may not be available, or simulating the computation on futuristic hardware, which is slow and not scalable. We instead take an analytical approach based on analyzing the raw operations within DNN algorithms, which allows us to estimate performance across any hardware, even hardware that is in the process of being designed. Evaluations show our approach is fast and gives a good first-order approximation (±10-15% accuracy) across many DNNs and hardware platforms including GPUs, CPUs, and a futuristic Processing In Memory (PIM) accelerator called BLIMP.
UR - http://www.scopus.com/inward/record.url?scp=85215106794&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85215106794&partnerID=8YFLogxK
U2 - 10.1109/MASCOTS64422.2024.10786578
DO - 10.1109/MASCOTS64422.2024.10786578
M3 - Conference contribution
AN - SCOPUS:85215106794
T3 - Proceedings - IEEE Computer Society's Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, MASCOTS
BT - Proceedings - 2024 IEEE 32nd International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2024
PB - IEEE Computer Society
T2 - 32nd IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2024
Y2 - 21 October 2024 through 23 October 2024
ER -