TY - GEN
T1 - POSTER
T2 - 19th ACM Asia Conference on Computer and Communications Security, AsiaCCS 2024
AU - Jiang, Fengqing
AU - Xu, Zhangchen
AU - Niu, Luyao
AU - Wang, Boxin
AU - Jia, Jinyuan
AU - Li, Bo
AU - Poovendran, Radha
N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s).
PY - 2024/7/1
Y1 - 2024/7/1
N2 - Compared with the traditional usage of large language models (LLMs), where users directly send queries to an LLM, LLM-integrated applications serve as middleware that refines users’ queries with domain-specific knowledge to better inform LLMs and enhance the responses. However, LLM-integrated applications also introduce new attack surfaces. This work considers a setup where the user and the LLM interact via an application in the middle. We focus on interactions that begin with a user’s query and end with the LLM-integrated application returning a response, powered by LLMs at the service backend. We identify potential high-risk vulnerabilities in this setting that can originate from a malicious application developer or from an outsider threat initiator who can control database access and manipulate or poison high-risk data served to the user. Successful exploits of the identified vulnerabilities result in users receiving responses tailored to the intent of the threat initiator. We assess such threats against LLM-integrated applications empowered by GPT-3.5 and GPT-4. Our experiments show that the threats can effectively bypass OpenAI’s restrictions and moderation policies, exposing users to risks of bias, toxic content, privacy leakage, and disinformation. We develop a lightweight, threat-agnostic defense that mitigates both insider and outsider threats. Our evaluations demonstrate the efficacy of our defense.
AB - Compared with the traditional usage of large language models (LLMs), where users directly send queries to an LLM, LLM-integrated applications serve as middleware that refines users’ queries with domain-specific knowledge to better inform LLMs and enhance the responses. However, LLM-integrated applications also introduce new attack surfaces. This work considers a setup where the user and the LLM interact via an application in the middle. We focus on interactions that begin with a user’s query and end with the LLM-integrated application returning a response, powered by LLMs at the service backend. We identify potential high-risk vulnerabilities in this setting that can originate from a malicious application developer or from an outsider threat initiator who can control database access and manipulate or poison high-risk data served to the user. Successful exploits of the identified vulnerabilities result in users receiving responses tailored to the intent of the threat initiator. We assess such threats against LLM-integrated applications empowered by GPT-3.5 and GPT-4. Our experiments show that the threats can effectively bypass OpenAI’s restrictions and moderation policies, exposing users to risks of bias, toxic content, privacy leakage, and disinformation. We develop a lightweight, threat-agnostic defense that mitigates both insider and outsider threats. Our evaluations demonstrate the efficacy of our defense.
UR - http://www.scopus.com/inward/record.url?scp=85199274156&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85199274156&partnerID=8YFLogxK
U2 - 10.1145/3634737.3659433
DO - 10.1145/3634737.3659433
M3 - Conference contribution
AN - SCOPUS:85199274156
T3 - ACM AsiaCCS 2024 - Proceedings of the 19th ACM Asia Conference on Computer and Communications Security
SP - 1949
EP - 1951
BT - ACM AsiaCCS 2024 - Proceedings of the 19th ACM Asia Conference on Computer and Communications Security
PB - Association for Computing Machinery, Inc
Y2 - 1 July 2024 through 5 July 2024
ER -