TY - JOUR
T1 - Enhancing Malware Classification via Self-Similarity Techniques
AU - Zhong, Fangtian
AU - Hu, Qin
AU - Jiang, Yili
AU - Huang, Jiaqi
AU - Zhang, Cheng
AU - Wu, Dinghao
N1 - Publisher Copyright: © 2005-2012 IEEE.
PY - 2024
Y1 - 2024
AB - Despite continuous advancements in defense mechanisms, attackers often find ways to circumvent security measures. Windows operating systems, in particular, are vulnerable due to fewer restrictions on downloading software from unknown sources, facilitating the spread of malware. To address this challenge, researchers have focused on developing techniques to identify Windows malware, crucial for mitigating potential damage. Traditional approaches typically categorize threats into broad classes such as trojans or adware, often failing to capture the full spectrum of malicious behaviors exhibited by diverse malware variants. In response, we propose a novel approach to malware categorization that incorporates both the general malware family and subfamily for each sample. Our method leverages self-similarity techniques to extract local semantics and similarities within the blocks of malware binaries while preserving correlations between these blocks. We utilize a VGG11 model to capture these features, enabling accurate classification. Central to our approach is the conversion of malware binaries into self-similarity descriptors, facilitating space savings while capturing essential semantics within blocks. By focusing on local self-similarities and their geometric layouts across malware, our method effectively identifies repetitive patterns indicative of malware behavior. Our proof-of-concept implementation demonstrates the effectiveness of our framework, achieving an impressive average precision of 98.2% on a newly gathered dataset with over 25,000 samples. Moreover, our method offers significant space savings, outperforming recent research efforts by a factor of over 96. These results underscore the efficacy of incorporating self-similarities and correlations within blocks for robust malware classification, making our approach a promising solution for real-world malware detection and prevention.
UR - https://www.scopus.com/pages/publications/85199540695
U2 - 10.1109/TIFS.2024.3433372
DO - 10.1109/TIFS.2024.3433372
M3 - Article
AN - SCOPUS:85199540695
SN - 1556-6013
VL - 19
SP - 7232
EP - 7244
JO - IEEE Transactions on Information Forensics and Security
JF - IEEE Transactions on Information Forensics and Security
ER -