TY - GEN
T1 - MalwareTotal
T2 - 46th ACM/IEEE International Conference on Software Engineering, ICSE 2024
AU - He, Shuai
AU - Fu, Cai
AU - Hu, Hong
AU - Chen, Jiahe
AU - Lv, Jianqiang
AU - Jiang, Shuai
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024
Y1 - 2024
AB - Recent methods have demonstrated that machine learning (ML) based static malware detection models are vulnerable to adversarial attacks. However, the generated malware often fails to generalize to production-level anti-malware software (AMS), as such software usually combines multiple detection methods. This calls for universal solutions to the problem of malware variant generation. In this work, we demonstrate how the proposed method, MalwareTotal, allows malware variants to continue evading ML-based, signature-based, and hybrid anti-malware software. Given a malicious binary, we develop sequential bypass tactics that enable malicious behavior to be concealed within multi-faceted manipulations. Through 12 experiments on real-world malware, we demonstrate that an attacker can consistently bypass detection (98.67% and 100% attack success rate against ML-based methods EMBER and MalConv, respectively; 95.33%, 92.63%, and 98.52% attack success rate against production-level anti-malware software ClamAV, AMS A, and AMS B, respectively) without modifying the malware functionality. We further demonstrate that our approach outperforms state-of-the-art adversarial malware generation techniques both in attack success rate and query consumption (the number of queries to the target model). Moreover, the samples generated by our method have demonstrated transferability in the real-world integrated malware detector, VirusTotal. In addition, we show that common mitigations such as adversarial training on known attacks cannot effectively defend against the proposed attack. Finally, we investigate the value of the generated adversarial examples as a means of hardening victim models through an adversarial training procedure, and demonstrate that the accuracy of the retrained model against generated adversarial examples increases by 88.51 percentage points.
UR - http://www.scopus.com/inward/record.url?scp=85196802469&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85196802469&partnerID=8YFLogxK
U2 - 10.1145/3597503.3639141
DO - 10.1145/3597503.3639141
M3 - Conference contribution
AN - SCOPUS:85196802469
T3 - Proceedings - International Conference on Software Engineering
SP - 2123
EP - 2134
BT - Proceedings - 2024 ACM/IEEE 46th International Conference on Software Engineering, ICSE 2024
PB - IEEE Computer Society
Y2 - 14 April 2024 through 20 April 2024
ER -