My parents told me I could be anything when I grew up. It turns out that with the help of AI, I can be anyone, including our CEO.
Looking forward to an awesome vacation – thanks, Noam!
We recently tested something simple: could we trick AI email assistants into thinking we were someone else? Turns out you can… just by changing your display name.
If you’re a CISO, this is where your eye starts twitching. You’ve been fighting the email security battle for years – training employees to hover over links, implementing SPF, DKIM, and DMARC, running tests, and conducting simulations. And, sure, you’ve had to pull some people aside for the stern, one-on-one conversations, but honestly? You’ve made real progress. Your team knows to verify sender addresses; they’re vigilant and report most issues.
But now, AI connectors – like those provided by OpenAI and Anthropic – are integrating with our favorite apps and email providers, acting as digital employees that promise to boost productivity. Amazing! However, these new digital employees are not necessarily trained in security protocols, don’t know the full context, and, in their well-intended focus on making the employee’s life easier, can create misunderstandings that could lead organizations into a rabbit hole of trouble.
When you receive an email, you’re probably used to seeing this at the top:
Message-ID: <[email protected]>
Date: Fri, 21 Nov 2025 17:45:28 -0500
From: John Doe <[email protected]>
To: [email protected]
While this is not the full header, it is what you are exposed to. When using a connector, the LLM sees the same thing you do. That isn’t a bad thing in itself; however, a display name like John Doe is an arbitrary value. If years of input injection have taught us anything about user input, we probably shouldn’t trust it. And years of phishing training have taught users to spot something that doesn’t pass the sniff test. Case in point:
Message-ID: <[email protected]>
Date: Fri, 21 Nov 2025 17:45:28 -0500
From: YourCEO ([email protected]) <[email protected]>
To: [email protected]
Traditionally, the industry has relied on email authentication protocols such as SPF, DKIM, and DMARC. The AI knows about these protocols, but it does not apply them in practice when invoked through a connector. So, when it sees a display name formatted the way it expects, the LLM ignores the actual origin domain during summarization.
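To make the gap concrete, here is a minimal sketch (Python, standard library only) of the check a connector could run but currently does not: compare the claim in the display name against the domain of the actual sending address. The message, domains, and helper name below are hypothetical; a production check would also consult the Authentication-Results header (SPF, DKIM, and DMARC alignment) rather than rely on string matching alone.

# Hypothetical check, not any vendor's API: flag a From header whose display
# name implies the trusted domain while the address resolves elsewhere.
from email import message_from_string
from email.utils import parseaddr

RAW = """\
From: YourCEO (yourcompany.com) <attacker@evil-domain.example>
To: victim@yourcompany.com
Subject: Urgent wire update

Please update the wire details before EOD.
"""

TRUSTED_DOMAIN = "yourcompany.com"  # assumption: the organization's real domain

def looks_spoofed(raw_message: str, trusted_domain: str) -> bool:
    msg = message_from_string(raw_message)
    display_name, address = parseaddr(msg.get("From", ""))
    sender_domain = address.rsplit("@", 1)[-1].lower()
    # The display name claims the trusted domain, but the address does not match it.
    claims_internal = trusted_domain in display_name.lower()
    return claims_internal and sender_domain != trusted_domain

print(looks_spoofed(RAW, TRUSTED_DOMAIN))  # True: the display name lies about the origin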
This is an issue with LLM reasoning and with default behavior being applied across broad use cases. We tested this across multiple models and connectors and found the same result every time: LLMs take emails at face value and do not validate any of the headers.
In a few cases the LLM did notice the domain in the display name, but even that can be bypassed by instructing the LLM to ignore the headers during summarization:
What the email actually looked like (attacker domains redacted)
This attack represents a fundamental failure in the trust model, turning a trusted (and highly anticipated) productivity accelerator into an unwitting accomplice for phishing. The user no longer looks at email headers, sender names, and addresses – they’re trusting the AI’s “clean” summary. The AI becomes a sanitizer, accepting “dirty” input (spoofed display names), stripping suspicious metadata (the actual sender), and presenting “trusted” output.
As our CISO put it, this is AI-assisted impersonation, and it makes years of email security awareness training obsolete for this workflow.
If, in our current state, AI is being used to read your emails and tell you what’s there so that you, the user, can take action, the future is much more autonomous. Very soon, AI agents will not only read the email but also take action on it.
Imagine a scenario where a finance manager uses an agent to create a task list based on specific emails – say, from their CEO. An attacker sends spoofed emails from the “CEO” requesting that an account’s wire information be updated. This task gets added to the list and assigned for completion. In a world where agentic workflows are key productivity boosters, this action could realistically be completed by an agent that updates a database on the user’s behalf.
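A hypothetical guardrail for that last step, sketched under assumed names (the keyword list, allowlist, and function below are illustrative, not any product’s API): the high-risk action is gated on the authenticated sending address, never on the display name the LLM was shown.

from email.utils import parseaddr

HIGH_RISK_KEYWORDS = ("wire", "bank account", "routing number", "payment details")
APPROVED_SENDERS = {"ceo@yourcompany.com"}  # assumption: verified internal addresses

def should_autocreate_task(from_header: str, subject: str, body: str) -> bool:
    _, address = parseaddr(from_header)
    is_high_risk = any(k in (subject + " " + body).lower() for k in HIGH_RISK_KEYWORDS)
    if not is_high_risk:
        return True  # low-risk requests can flow straight to the task list
    # High-risk requests require the real address, not the display name, to match.
    return address.lower() in APPROVED_SENDERS

# Spoofed display name, attacker address: the task is held for human review.
print(should_autocreate_task(
    "YourCEO <attacker@evil.example>",
    "Update wire information",
    "Please change the account's wire details to the new bank below.",
))  # False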
This is just one example. There’s no need for phishing links or obvious red flags when the AI is unintentionally scrubbing out indicators of fraud.
The security industry has invested significant resources in anti-phishing efforts: training, gateways, detection, reporting. It’s necessary. Phishing remains the top attack method, and with AI impersonation and deepfakes making it easier than ever, incidence rates have risen 49% since 2022.
But AI-mediated attacks are different.
As AI assistants become more common at work, security training needs to level up. It’s no longer about spotting the fake; it’s about understanding the signals and interpreting them contextually so that a fake is exposed for what it is.
Security solutions do not have the luxury of lagging behind AI-enabled attackers. Tools relying on traditional methods, such as signatures or reputation checks, are obsolete. Organizations must employ modern solutions to address these modern problems.
Between you and me, I’m not a CISO (nor do I want to be), but we had a chance to ask our CISO, Guy Stern, what he would do:
GenAI means that “traditional” awareness training must focus on business-process integrity, such as always verifying high-stakes requests through a separate, trusted channel.
Any tool still relying on “old-school” signatures or simple reputation is obsolete. This means investing in AI-native email security (ICES) and EDR/XDR platforms that can identify anomalies rather than just matching signatures.
Buy for the core AI model; build for the specific use case. A CISO’s “build” effort in this area should focus on the integrations and automation playbooks that are unique to each environment.
No. They are a generic baseline, not a complete solution. “Good enough” for everyone means they aren’t great for high-value assets, an organization’s unique compliance needs, or the executive risk profile.
Yes, a dedicated control layer is becoming a strategic necessity.
For AI phishing, you could have a “choke point” – or what the industry is now calling an “AI Firewall” – between users and LLMs to enforce policies. This does take time to build: understanding the types of use cases, establishing baselines around default, expected, and anomalous behavior, and configuring logs for quality and incident response. This layer acts as a zone where an organization can validate the sender’s identity before the LLM sees the prompt, or block the AI from processing high-risk requests based on rules.
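As a rough illustration of that choke point, under hypothetical names and policies, the function below sits between the connector and the model: it drops an email whose display name impersonates an internal sender, and otherwise prepends a machine-added provenance banner so the summary carries the real domain.

from email import message_from_string
from email.utils import parseaddr

INTERNAL_DOMAIN = "yourcompany.com"  # assumption: the organization's real domain

def gate_email_for_llm(raw_message: str) -> str | None:
    msg = message_from_string(raw_message)
    display_name, address = parseaddr(msg.get("From", ""))
    domain = address.rsplit("@", 1)[-1].lower()
    if INTERNAL_DOMAIN in display_name.lower() and domain != INTERNAL_DOMAIN:
        return None  # policy decision: block impersonation attempts outright
    banner = f"[VERIFIED SENDER DOMAIN: {domain}]"
    return banner + "\n" + raw_message  # annotate so provenance survives summarization

# Only what this returns is allowed to reach the model; None means the request is blocked.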
This email vulnerability is a window into a broader category of risk: AI systems inheriting trust relationships they’re not equipped to validate. And the attack surface is only expanding.
We’re connecting powerful AI systems to critical infrastructure faster than we’re building the necessary security context. The email sender impersonation we discovered isn’t an isolated issue. It’s a pattern. LLMs optimize for convenience at the expense of security validation.
The industry has made this mistake before: prioritizing functionality over security, then scrambling to patch the security gaps later. With AI, the stakes are higher because the systems are more autonomous and the attack surface is less visible. As we continue to lean into integrations and connectors from our AI providers of choice, we must treat them with equal parts optimism and skepticism.
As someone’s Uncle Ben used to say, “With great AI comes great responsibility.”