TY - GEN
T1 - Leveraging LLM to Detect and Correct Vulnerabilities in Code
AU - Zia, Ilmaan
AU - Basit Ur Rahim, Muhammad Abdul
AU - Raj, Ankit
AU - Hussain, Shahid
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - Artificial intelligence has significantly affected various domains recently, notably code-level application security. Vulnerable code snippets can easily propagate across different software platforms, making early detection and correction crucial. Existing machine learning applications often fail to provide comprehensive and accurate results, necessitating time-consuming manual inspections by developers. This paper explores using a fine-tuned large language model (LLM) to detect vulnerabilities in source code. Leveraging the Mistral LLM and LangGraph, the model was trained on a custom dataset to improve performance in identifying and addressing code vulnerabilities. The approach involves fine-tuning models to classify code as vulnerable or non-vulnerable, identifying the Common Weakness Enumeration (CWE) for vulnerable code, and generating secure replacements. The study compares the performance of fine-tuned models with standard LLMs and other detection tools, highlighting the limitations of existing methods. The results demonstrate improved accuracy, reduced false positive and negative rates, and maintained code privacy and security within the local system. This advancement underscores the potential of fine-tuned LLMs and iterative frameworks like LangGraph to improve code security in modern software development.
AB - Artificial intelligence has significantly affected various domains recently, notably code-level application security. Vulnerable code snippets can easily propagate across different software platforms, making early detection and correction crucial. Existing machine learning applications often fail to provide comprehensive and accurate results, necessitating time-consuming manual inspections by developers. This paper explores using a fine-tuned large language model (LLM) to detect vulnerabilities in source code. Leveraging the Mistral LLM and LangGraph, the model was trained on a custom dataset to improve performance in identifying and addressing code vulnerabilities. The approach involves fine-tuning models to classify code as vulnerable or non-vulnerable, identifying the Common Weakness Enumeration (CWE) for vulnerable code, and generating secure replacements. The study compares the performance of fine-tuned models with standard LLMs and other detection tools, highlighting the limitations of existing methods. The results demonstrate improved accuracy, reduced false positive and negative rates, and maintained code privacy and security within the local system. This advancement underscores the potential of fine-tuned LLMs and iterative frameworks like LangGraph to improve code security in modern software development.
UR - https://www.scopus.com/pages/publications/105013619908
U2 - 10.1007/978-3-031-95127-5_24
DO - 10.1007/978-3-031-95127-5_24
M3 - Conference contribution
AN - SCOPUS:105013619908
SN - 9783031951268
T3 - Communications in Computer and Information Science
SP - 324
EP - 339
BT - Computational Science and Computational Intelligence - 11th International Conference, CSCI 2024, Proceedings
A2 - Arabnia, Hamid R.
A2 - Deligiannidis, Leonidas
A2 - Shenavarmasouleh, Farzan
A2 - Amirian, Soheyla
A2 - Ghareh Mohammadi, Farid
PB - Springer Science and Business Media Deutschland GmbH
T2 - 11th International Conference on Computational Science and Computational Intelligence, CSCI 2024
Y2 - 11 December 2024 through 13 December 2024
ER -