TY - JOUR
T1 - Specializing Neural Networks for Cryptographic Code Completion Applications
AU - Xiao, Ya
AU - Song, Wenjia
AU - Qi, Jingyuan
AU - Viswanath, Bimal
AU - McDaniel, Patrick
AU - Yao, Danfeng
N1 - Publisher Copyright: © 1976-2012 IEEE.
PY - 2023/6/1
Y1 - 2023/6/1
N2 - Similarities between natural languages and programming languages have prompted researchers to apply neural network models to software problems, such as code generation and repair. However, program-specific characteristics pose unique prediction challenges that require the design of new and specialized neural network solutions. In this work, we identify new prediction challenges in application programming interface (API) completion tasks and find that existing solutions are unable to capture complex program dependencies in program semantics and structures. We design a new neural network model, Multi-HyLSTM, to overcome the newly identified challenges and comprehend complex dependencies between API calls. Our neural network is empowered with a specialized dataflow analysis that extracts multiple global API dependence paths for neural network predictions. We evaluate Multi-HyLSTM on 64,478 Android apps and predict 774,460 Java cryptographic API calls, which are usually challenging for developers to use correctly. Multi-HyLSTM achieves an excellent top-1 API completion accuracy of 98.99%. Moreover, we show the effectiveness of our design choices through an ablation study and have released our dataset.
AB - Similarities between natural languages and programming languages have prompted researchers to apply neural network models to software problems, such as code generation and repair. However, program-specific characteristics pose unique prediction challenges that require the design of new and specialized neural network solutions. In this work, we identify new prediction challenges in application programming interface (API) completion tasks and find that existing solutions are unable to capture complex program dependencies in program semantics and structures. We design a new neural network model, Multi-HyLSTM, to overcome the newly identified challenges and comprehend complex dependencies between API calls. Our neural network is empowered with a specialized dataflow analysis that extracts multiple global API dependence paths for neural network predictions. We evaluate Multi-HyLSTM on 64,478 Android apps and predict 774,460 Java cryptographic API calls, which are usually challenging for developers to use correctly. Multi-HyLSTM achieves an excellent top-1 API completion accuracy of 98.99%. Moreover, we show the effectiveness of our design choices through an ablation study and have released our dataset.
UR - http://www.scopus.com/inward/record.url?scp=85153334153&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85153334153&partnerID=8YFLogxK
U2 - 10.1109/TSE.2023.3265362
DO - 10.1109/TSE.2023.3265362
M3 - Article
AN - SCOPUS:85153334153
SN - 0098-5589
VL - 49
SP - 3524
EP - 3535
JO - IEEE Transactions on Software Engineering
JF - IEEE Transactions on Software Engineering
IS - 6
ER -