CyberArk on FuzzyAI: exposing AI vulnerabilities

April 2, 2025

CyberArk, a world leader in identity security, unveiled FuzzyAI, an innovative open-source framework that proved capable of jailbreaking all major AI models tested. FuzzyAI helps companies identify and fix vulnerabilities in AI models – such as bypassing guardrails and generating malicious output – hosted in the cloud and in-house. AI models are transforming every industry with innovative applications dedicated to customer interactions, internal process improvement and automation. Their internal use also brings with it new security challenges, for which most companies are still unprepared.

FuzzyAI helps solve some of these by offering a systematic approach to test AI models against various malicious inputs, uncovering potential weaknesses in security systems and thus making AI development and deployment safer. At the heart of FuzzyAI is a powerful fuzzer – a tool that reveals software flaws and vulnerabilities – capable of exposing those found using over ten different attack techniques, from bypassing ethical filters to exposing hidden system prompts. We spoke about it with Eran Shimony, Principal Security Researcher at CyberArk and Shai Dvash, Senior Software Engineer at CyberArk.

What is FuzzyAI, and what is its main purpose?

FuzzyAI is a cutting-edge open-source framework developed by CyberArk to help companies identify and address security vulnerabilities in their AI models. Its main purpose is to test and improve the security of AI models, both cloud-based and on-premises, by proactively identifying weaknesses that could be exploited. It helps solve some of these challenges by offering organizations a systematic approach to testing AI models against various adversarial inputs, uncovering potential weak points in their security systems and making AI development and deployment safer.

How does FuzzyAI help companies protect their AI models?

FuzzyAI uses a fuzzing technique, systematically testing AI models against various malicious inputs. This helps uncover potential vulnerabilities in the models’ security systems, allowing companies to strengthen their defences and develop more secure AI implementations. At the heart of FuzzyAI is a powerful fuzzer – a tool that reveals software defects and vulnerabilities – capable of exposing vulnerabilities found via more than ten distinct attack techniques, from bypassing ethical filters to exposing hidden system prompts. Currently, we can jailbreak every tested LLM, including the latest versions from OpenAI, Anthropic, Google, and Meta – GPT-3, Claude Sonnet 3.7, Flash 2, and Llama 3.3. We have also incorporated an evaluation platform that determines if a given response is harmful. This allows us to challenge the models while securing them by assessing the risk of their output.

An interview with CyberArk, the company that launched FuzzyAI to expose AI vulnerabilities

What are the key features of FuzzyAI?

FuzzyAI’s key features are several. Extensive fuzzing capabilities: It uses various attack techniques to probe AI models for weaknesses. Extensible framework: Users can add their own attack methodologies to tailor tests for specific vulnerabilities. Community collaboration: An open-source approach fosters a growing ecosystem for continuous improvement of attack techniques and defense mechanisms.

What is meant by “jailbreaking” an AI model?

“Jailbreaking”, an AI model, refers to bypassing its built-in safety controls and ethical guidelines. This allows malicious actors to manipulate the model into generating harmful outputs, revealing sensitive information, or performing actions it wasn’t intended to do. The press release states that FuzzyAI has demonstrated the ability to jailbreak all major AI models tested.

What types of vulnerabilities can FuzzyAI identify?

FuzzyAI can identify various vulnerabilities, including. Guardrail bypass: Circumventing safety measures designed to restrict the AI’s actions. Information leakage: Tricking the model into revealing sensitive data. Prompt injection: Manipulating the input prompts to control the AI’s behaviour. Generation of dangerous outputs: Forcing the AI to produce harmful or inappropriate content.

Where can FuzzyAI be found and utilized?

FuzzyAI is an open-source software and can be found on the CyberArk Labs GitHub page.

What is CyberArk’s role in the field of cybersecurity?

CyberArk positions itself as a leader in identity security. This encompasses a broad approach to protecting all types of identities – human and machine – across various environments (cloud, on-premises, hybrid). Centred on intelligent privilege controls, CyberArk provides the most comprehensive security offering for any identity – human or machine – across business applications, distributed workforces, hybrid cloud environments and throughout the DevOps lifecycle. The world’s leading organizations trust CyberArk to help secure their most critical assets.

What are the main areas of focus for CyberArk besides AI?

Beyond AI, CyberArk’s focus is on comprehensive identity security. This includes core areas like privileged access management (PAM) and identity and access management (IAM), which secure access for all users, especially privileged accounts. Cloud security is another major focus, addressing the unique challenges of protecting identities in cloud and hybrid environments. We also prioritize endpoint security, managing privileges on workstations and servers to minimize the attack surface. Furthermore, we’re incorporating Identity Threat Detection and Response (ITDR) to proactively address identity-based threats. Our approach is grounded in Zero Trust principles, verifying every access request. Finally, enabling Managed Service Providers (MSPs) and managing machine identities, particularly through our Venafi acquisition, are key areas of investment and innovation for CyberArk.

How does CyberArk contribute to AI protection beyond FuzzyAI?

We believe that identity is the new perimeter, and we’re focused on protecting all identities – both human and machine – across a wide range of environments, from on-premises data centres to complex cloud deployments and everything in between. Based on intelligent privilege controls, ensuring that access to sensitive data and systems is granted based on the principle of least privilege, minimizing the potential for damage from cyberattacks.

Regarding AI, while FuzzyAI is a dedicated tool for testing AI model security, our broader commitment to AI protection stems from our overall identity-centric approach. We believe that securing the identities and access of users and machines interacting with AI systems is paramount. This includes protecting the sensitive data used to train AI models, managing access to the models themselves, monitoring AI-related activity for anomalies, and securing the entire infrastructure supporting AI. By applying our identity security framework to the AI domain, we aim to provide a holistic and robust defence against the evolving threat landscape.