AI Cyber Security Research Group
Our research group focuses on AI Cyber Security, with an emphasis on Adversarial Machine Learning. We study the vulnerabilities of Artificial Intelligence systems and develop defense mechanisms to protect them from various forms of attack.
In the current landscape, AI Security extends beyond Data Privacy; it encompasses Model Security and the development of Trustworthy AI. Our research is built upon four key pillars:
1. Adversarial Attacks
Adversarial attacks feed a model subtly modified inputs, often imperceptible to the human eye, that are designed to cause it to make incorrect decisions.
Evasion Attacks: Manipulating systems like Image Recognition—for example, placing small stickers on a stop sign to trick an autonomous vehicle's AI into misidentifying it as a speed limit sign.
Poisoning Attacks: Secretly injecting "poisoned data" during the training phase to create a "backdoor" within the model.
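As a toy illustration of an evasion attack, the sketch below applies the fast gradient sign method (FGSM) to a hand-rolled logistic-regression model. The weights and input are hypothetical, chosen only to show a clean prediction flipping under a small, bounded perturbation; real attacks target deep networks via automatic differentiation.

```python
# FGSM-style evasion sketch on a toy logistic-regression "model".
# All weights and inputs below are hypothetical, for illustration only.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, eps):
    """Shift x by eps in the direction that increases the model's loss."""
    # Gradient of the binary cross-entropy loss with respect to the input x.
    grad = (sigmoid(w @ x) - y) * w
    return x + eps * np.sign(grad)

w = np.array([1.0, -2.0, 0.5])   # toy model weights
x = np.array([0.2, -0.1, 0.4])   # clean input, true label y = 1
y = 1.0

x_adv = fgsm_perturb(x, y, w, eps=0.3)
clean_score = sigmoid(w @ x)      # model confidence on the clean input
adv_score = sigmoid(w @ x_adv)    # confidence collapses after the attack
```

Note that each input coordinate moves by at most eps, which is exactly the "small sticker" intuition: the change is bounded, yet the decision flips.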
2. Model Robustness & Defense
Fortifying AI systems to make them more resilient against threats.
Adversarial Training: Training models on adversarially perturbed examples so that they learn to recognize and withstand such attacks.
Certified Robustness: Using mathematical proofs to guarantee that a model behaves correctly under a precisely defined set of perturbations.
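The adversarial-training idea can be sketched in a few lines of numpy: at each step, the current model is attacked with FGSM, and the resulting adversarial example is trained on alongside the clean one. The linear model and synthetic data are assumptions for illustration; a linear classifier cannot be made fully robust, so this demonstrates the mechanics rather than a strong defense.

```python
# Toy adversarial-training loop (numpy sketch, not a production defense).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_w(x, y, w):
    return (sigmoid(w @ x) - y) * x           # dL/dw for logistic loss

def fgsm(x, y, w, eps):
    return x + eps * np.sign((sigmoid(w @ x) - y) * w)

def accuracy(X, y, w):
    return float(((X @ w > 0).astype(float) == y).mean())

# Synthetic, linearly separable data: label is 1 when x0 + x1 > 0.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)
lr, eps = 0.1, 0.1
for _ in range(100):
    for x_i, y_i in zip(X, y):
        x_adv = fgsm(x_i, y_i, w, eps)    # attack the current model
        w -= lr * grad_w(x_i, y_i, w)     # learn the clean example
        w -= lr * grad_w(x_adv, y_i, w)   # and its adversarial version

clean_acc = accuracy(X, y, w)
X_adv = np.array([fgsm(x_i, y_i, w, eps) for x_i, y_i in zip(X, y)])
adv_acc = accuracy(X_adv, y, w)           # robustness under the same attack
```

Points that lie within eps of the decision boundary can still be flipped, which is why robust accuracy stays below clean accuracy even after adversarial training.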
3. Privacy-Preserving AI
Ensuring the confidentiality of the data used during the training process.
Differential Privacy: Adding "noise" to datasets to prevent the re-identification of individuals within the data.
Federated Learning: Training models across decentralized devices or servers without the need to exchange or centralize raw data.
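A minimal sketch of the differential-privacy idea, using the Laplace mechanism on a counting query (the dataset and query below are hypothetical): adding or removing one person changes a count by at most 1, so Laplace noise with scale 1/eps yields eps-differential privacy for that query.

```python
# Laplace-mechanism sketch for a counting query.
# The dataset, threshold, and privacy budget are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(42)

def private_count(values, threshold, eps):
    """Release a noisy count of values above threshold.

    A counting query has sensitivity 1 (one person's record changes it
    by at most 1), so Laplace noise with scale 1/eps gives
    eps-differential privacy.
    """
    true_count = sum(v > threshold for v in values)
    noise = rng.laplace(loc=0.0, scale=1.0 / eps)
    return true_count + noise

ages = [23, 35, 47, 52, 61, 29, 44]
noisy = private_count(ages, threshold=40, eps=1.0)  # true count is 4
```

Smaller eps means more noise and stronger privacy; the released value is deliberately not the exact count, which is what blocks re-identification.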
4. LLM & Generative AI Security
Addressing new challenges in the era of ChatGPT and Large Language Models.
Prompt Injection: Crafting prompts to trick an LLM into executing unintended actions, such as bypassing safety filters or leaking sensitive corporate information.
Data Leakage: Implementing safeguards to prevent models from memorizing and exposing sensitive information (e.g., citizen ID numbers) when answering queries.
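One simple (and intentionally naive) safeguard against data leakage is an output filter that redacts ID-like strings before an answer reaches the user. The 13-digit pattern below is an assumption for illustration; production systems use dedicated PII detectors rather than a single regex.

```python
# Naive output filter that redacts citizen-ID-like numbers from an LLM
# answer before it is returned to the user.
# The 13-digit pattern is a hypothetical ID format, for illustration only.
import re

ID_PATTERN = re.compile(r"\b\d{13}\b")

def redact(answer: str) -> str:
    """Replace any 13-digit run with a placeholder."""
    return ID_PATTERN.sub("[REDACTED]", answer)

print(redact("The customer's ID is 1234567890123."))
# -> The customer's ID is [REDACTED].
```

Filters like this complement, rather than replace, training-time defenses such as deduplication and differential privacy, since a model that never memorizes the ID has nothing to leak.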