TY - GEN
T1 - FAAStloop
T2 - 15th Annual ACM Symposium on Cloud Computing, SoCC 2024
AU - Mohanty, Shruti
AU - Bhasi, Vivek M.
AU - Son, Myungjun
AU - Kandemir, Mahmut Taylan
AU - Das, Chita
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024/11/20
Y1 - 2024/11/20
N2 - Serverless Computing has garnered significant interest for executing High-Performance Computing (HPC) applications in recent years, attracting attention for its elastic scalability, reduced entry barriers, and pay-per-use pricing model. Specifically, highly parallel HPC apps can be divided and offloaded to multiple Serverless Functions (SFs) that execute their respective tasks concurrently, after which their results are stored/aggregated. While state-of-the-art user-side serverless frameworks have attempted to fine-tune task division amongst the SFs to optimize for performance and/or cost, they have either used static task division parameters or have only focused on minimizing the number of SFs through task packing. However, these methods treat the HPC code as a black box and usually require significant manual intervention to find the optimal task division. Since a significant portion of HPC applications have a loop structure, in this work, we try to answer the following two questions: (i) Can modifying the loop structure in the HPC code, originally optimized for monolithic (non-serverless) frameworks, enhance performance and reduce costs in a serverless architecture? and (ii) Can we develop a framework that allows for an efficient transition of monolithic code to serverless, with minimum user input? To this end, we propose a novel framework, FAAStloop, which intelligently employs loop-based optimizations (as well as task packing) in SF containers to optimally execute HPC apps across SFs. FAAStloop chooses the relevant optimization parameters using statistical models (constructed via app profiling) that are able to predict the relevant performance/cost metrics as a function of our choice of parameters. Our extensive experimental evaluation of FAAStloop on the AWS Lambda platform reveals that our framework outperforms state-of-the-art works by up to 3.3× and 2.1×, in terms of end-to-end execution latency and cost, respectively.
UR - https://www.scopus.com/pages/publications/85215507435
UR - https://www.scopus.com/pages/publications/85215507435#tab=citedBy
U2 - 10.1145/3698038.3698560
DO - 10.1145/3698038.3698560
M3 - Conference contribution
AN - SCOPUS:85215507435
T3 - SoCC 2024 - Proceedings of the 2024 ACM Symposium on Cloud Computing
SP - 943
EP - 960
BT - SoCC 2024 - Proceedings of the 2024 ACM Symposium on Cloud Computing
PB - Association for Computing Machinery, Inc
Y2 - 20 November 2024 through 22 November 2024
ER -