Friday, December 6, 2024 11:30am
About this Event
Brown Lab, Newark
http://cis.udel.eduSecurity Risks of LLM Systems
ABSTRACT
Large language models (LLMs) have achieved remarkable success due to their exceptional generative capabilities. Despite their success, they also face security and safety challenges when deployed in many real-world applications.
In this talk, I will discuss potential attacks to LLM systems such as prompt injection attacks and knowledge corruption attacks, where an attacker can inject malicious instructions/texts to induce an LLM to generate attacker-desired outputs. For instance, in retrieval-augmented generation (RAG) systems, knowledge databases can provide up-to-date and domain-specific knowledge that enhances the output of an LLM. However, knowledge databases also introduce a new attack surface. We show an attacker can inject malicious texts into the knowledge database of a RAG system to make an LLM generate attacker-chosen answers to attacker-chosen questions. Our results show a few texts can make the attack successful for a knowledge database with millions of texts. Finally, I will discuss attribution-based defenses against attacks. For instance, in our ongoing work, we develop a watermarking method to embed a watermark into LLM-generated texts, which can detect watermarked texts and attribute users who generated texts.
BIOGRAPHY
Dr. Jinyuan Jia is an assistant professor at the College of Information Sciences and Technology at Penn State. Before joining Penn State, he was a postdoc at the University of Illinois Urbana-Champaign. He received a Ph.D. in Electrical and Computer Engineering from Duke University in 2022. His current research interests include the security and safety of LLM systems. He has received several awards such as CCS Distinguished Paper Award, DeepMind Best Extended Abstract Award, and NDSS Distinguished Paper Award Honorable Mention.
User Activity
No recent activity