Generative AI is transforming how we interact online, opening new possibilities for innovation and communication. However, with these advancements comes an urgent need to safeguard digital environments.
At ActiveFence, we’ve been at the forefront of online safety, combatting risks associated with user-generated content. As AI-generated content grows exponentially, so do the challenges.
Today, ActiveFence offers the most advanced AI Content Safety solutions, designed specifically for applications powered by large language models (LLMs). We’re helping bolster the safety of AI-generated content using the NVIDIA NeMo Guardrails platform – making AI safety solutions built on open-source technologies accessible to businesses of all sizes, from tech giants and enterprises to startups.
NVIDIA has introduced three new NIM microservices for safeguarding AI applications. These NeMo Guardrails NIM microservices use advanced datasets and modeling techniques to improve the safety and reliability of enterprise generative AI applications. They’re designed to build user trust in AI-driven tools, like AI agents, chatbots and other LLM-enabled systems.
Central to the orchestration of these NIM microservices is the NVIDIA NeMo Guardrails platform, built to support developers in integrating AI guardrails in LLM applications. NeMo Guardrails provides a scalable framework to integrate multiple small, specialized rails, helping developers deploy AI systems without compromising on performance or safety.
We’ve also integrated NeMo Guardrails with ActiveFence’s proprietary LLM models through our API, ActiveScore. This integration adds robust content moderation to AI systems, helping to prevent harmful, hateful, or inappropriate content in conversational AI.
We are excited to deliver an AI safety solution integrated directly with NeMo Guardrails. This brings our expertise to a broader audience, helping organizations worldwide safely adopt generative AI. ActiveFence offers the most mature and comprehensive AI content safety solutions, which helps ensure that organizations can implement generative AI with greater safety and precision.
Our adoption of NVIDIA NeMo Guardrails introduces a multi-level safety approach to protect LLM-enabled applications.
ActiveFence already works with seven foundation model organizations and top AI players like Cohere and Stability AI. This new work extends our reach, delivering safety solutions to the global developer community. Whether you’re building an AI agent or chatbot, or deploying enterprise-scale AI tools, our solution ensures safe AI interactions that are aligned with your brand’s values.
At ActiveFence, our commitment to safety goes beyond technology. With years of experience and acquisitions like Spectrum Labs and Rewire, we’ve developed a vast intelligence network to support AI content safety. Our solutions are designed to scale and address the challenges of moderating interactions in LLM-enabled environments.
With ActiveFence’s safety solutions and NVIDIA NeMo Guardrails, creating secure, user-friendly AI systems has never been easier. If you’re exploring how generative AI can improve your user interactions, we’re here to help ensure your integration is safe, scalable, and effective.
Let’s shape the future of generative AI together. Reach out to learn more about how ActiveFence can help secure your AI-powered solutions.
The following is an activation guide for integrating ActiveFence’s ActiveScore API with your chatbot using the NeMo Guardrails library. The library now supports the API out of the box, and the underlying implementation details can be found in the NeMo Guardrails library source. Here’s how to get started.
Assuming you already have the following configuration structure in your project, as described in NeMo Guardrails documentation:
```
.
├── config
│   ├── actions.py
│   ├── config.py
│   ├── config.yml
│   ├── rails.co
│   ├── ...
```
To enable ActiveScore moderation for the user input, add the following to your config.yml file:
```yaml
rails:
  input:
    flows:
      - activefence moderation
```
The activefence moderation flow uses a risk score threshold of 0.85 to decide whether the user input should be allowed. If the score exceeds this threshold, the input is considered a violation. You also need to set the ACTIVEFENCE_API_KEY environment variable.
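For example, you can set the environment variable in your shell before starting the application (the key value below is a placeholder, not a real credential):

```shell
# Placeholder value -- substitute your own ActiveFence API key.
export ACTIVEFENCE_API_KEY="your-api-key"
```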
You may also use activefence moderation detailed, which provides individual scores per violation category, by adding:
```yaml
rails:
  input:
    flows:
      - activefence moderation detailed
```
To customize the thresholds, override the default flows in your config. For example, to change the threshold for ActiveFence moderation, add the following flow to your rails.co file:
```colang
define subflow activefence moderation
  """Guardrail based on the maximum risk score."""
  $result = execute call activefence api
  if $result.max_risk_score > 0.9  # change the threshold here
    bot inform cannot answer
    stop
```
In the example above, we override the “activefence moderation” flow so that the bot refuses to respond whenever the maximum risk score exceeds 0.9.
ActiveFence’s ActiveScore API provides flexibility to control specific violations individually. For example, to moderate hate speech:
```colang
define flow activefence moderation detailed
  $result = execute call activefence api
  if $result.violations.get("abusive_or_harmful.hate_speech", 0) > 0.8
    bot inform cannot engage in abusive or harmful behavior
    stop

define bot inform cannot engage in abusive or harmful behavior
  "I will not engage in any abusive or harmful behavior."
```
This ensures the bot refuses to engage whenever the hate speech risk score exceeds 0.8.
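The condition in that flow can be mirrored in plain Python for clarity. This is an illustrative sketch of the per-category check, not part of the library; the helper name and sample scores are hypothetical:

```python
def should_block(violations: dict, category: str, threshold: float) -> bool:
    # Mirrors the Colang condition: block when the category's risk score
    # exceeds the threshold; a missing category defaults to 0 (allowed).
    return violations.get(category, 0) > threshold

# Sample scores shaped like the ActiveScore violations dictionary.
scores = {"abusive_or_harmful.hate_speech": 0.92}
print(should_block(scores, "abusive_or_harmful.hate_speech", 0.8))  # True
```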
To ensure that the generated output from the LLM follows moderation policies, we will have to override the system action.
The default action runs only on the user input text. By adding the following to your actions.py file, we change it to run on any text:
```python
import os

import aiohttp

from nemoguardrails.actions import action
from nemoguardrails.utils import new_uuid


@action(name="call activefence api", is_system_action=True)
async def call_activefence_api(text: str):
    api_key = os.environ.get("ACTIVEFENCE_API_KEY")
    if api_key is None:
        raise ValueError("ACTIVEFENCE_API_KEY environment variable not set.")

    url = "https://apis.activefence.com/sync/v3/content/text"
    headers = {"af-api-key": api_key, "af-source": "nemo-guardrails"}
    data = {
        "text": text,
        "content_id": "ng-" + new_uuid(),
    }

    async with aiohttp.ClientSession() as session:
        async with session.post(
            url=url,
            headers=headers,
            json=data,
        ) as response:
            if response.status != 200:
                raise ValueError(
                    f"ActiveFence call failed with status code {response.status}.\n"
                    f"Details: {await response.text()}"
                )
            response_json = await response.json()

    violations = response_json["violations"]
    violations_dict = {}
    max_risk_score = 0.0
    for violation in violations:
        if violation["risk_score"] > max_risk_score:
            max_risk_score = violation["risk_score"]
        violations_dict[violation["violation_type"]] = violation["risk_score"]

    return {"max_risk_score": max_risk_score, "violations": violations_dict}
```
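The final loop of that action reduces the API’s list of violations to the structure the rails consume. The same reduction can be shown in isolation as a small, testable sketch; the function name and sample response below are illustrative, not real API output:

```python
def summarize_violations(violations: list) -> dict:
    """Collapse per-violation scores into the dict the rails read."""
    violations_dict = {}
    max_risk_score = 0.0
    for violation in violations:
        # Track the highest risk score seen across all violation types.
        if violation["risk_score"] > max_risk_score:
            max_risk_score = violation["risk_score"]
        # Record every category's score for the detailed flow.
        violations_dict[violation["violation_type"]] = violation["risk_score"]
    return {"max_risk_score": max_risk_score, "violations": violations_dict}

# Hypothetical sample shaped like an ActiveScore response's violations list.
sample = [
    {"violation_type": "abusive_or_harmful.hate_speech", "risk_score": 0.92},
    {"violation_type": "adult_content", "risk_score": 0.10},
]
summary = summarize_violations(sample)
print(summary["max_risk_score"])  # 0.92
```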
You don’t have to read and understand this long method to use it; the essence of the change is in the method’s arguments. To use it, replace the existing action call in your rails like this:
```colang
$result = execute call activefence api(text=$user_message)
```
Or, to moderate the LLM output:
```colang
$result = execute call activefence api(text=$bot_message)
```
Lastly, to activate it, add this to your config.yml file:
```yaml
rails:
  output:
    flows:
      - activefence moderation
```
With that output rail activated, the LLM-generated response is checked for safety by the API before it reaches the user.
Generative AI is transforming industries, but its growth brings complex safety challenges. ActiveFence is addressing these risks by combining AI content safety expertise with NeMo Guardrails, an open-source framework designed to orchestrate industry-leading safeguards for LLM-enabled applications.
By using ActiveFence’s robust API and risk assessment tools, developers can seamlessly add multi-layered safeguards to their AI systems, ensuring they follow platform policies and build user trust.
Whether you’re building a chatbot or deploying enterprise-scale solutions, ActiveFence helps ensure safety at every stage, making AI interactions safer for everyone.