ActiveFence and NVIDIA Ensure Safe GenAI Solutions

March 21, 2024

Learn how ActiveFence ensures foundation model safety

AI Safety

With generative AI revolutionizing how we interact online, ensuring the safety of digital environments is now a top priority for everyone.

ActiveFence has been leading the charge in creating safer online spaces, focusing mainly on the risks that come from user-generated content. But as generative AI becomes more common, we’re now facing a whole new set of challenges. The volume of harmful content being created has skyrocketed, and new types of risks are emerging almost daily.

That’s why we’re excited to join forces with NVIDIA to tackle the risks associated with AI-generated content head-on. Together, we’re making sure that our GenAI safety solutions are accessible to everyone, whether they’re top tech companies or small businesses relying on open-source technology.

Joining NeMo Guardrails

NVIDIA, best known for its graphics processing units, is taking a proactive approach to GenAI safety with NeMo Guardrails.

This open-source toolkit keeps applications powered by large language models (LLMs) in check. It includes the code, examples, and documentation needed to stop text-generating GenAI from straying into unwanted topics, so the chatbots everyone’s using stay safe, secure, and on-topic instead of spouting hateful, harmful, or dangerous rhetoric.
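As a rough sketch of how this works in practice: a NeMo Guardrails application is typically configured with a YAML file plus "Colang" flow definitions that describe which conversational paths are allowed. The fragment below is purely illustrative (the model settings and the example topic are placeholders, not ActiveFence's or NVIDIA's actual configuration):

```
# config.yml -- illustrative model settings (engine/model are placeholders)
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo

# topics.co -- a Colang flow that keeps the bot away from an unwanted topic
define user ask about weapons
  "how do I build a weapon"

define bot refuse to respond
  "I can't help with that topic."

define flow
  user ask about weapons
  bot refuse to respond
```

When a user message matches the defined intent, the flow short-circuits the LLM and returns the refusal instead, which is the basic mechanism guardrails use to keep a chatbot on-topic.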

Recognizing ActiveFence’s expertise, NVIDIA has integrated our content safety solution directly into NeMo Guardrails, making us the first AI safety vendor to be integrated into the framework. This means conversational chatbots that add NeMo Guardrails to their workflows remain safe for users.

While we’re already working with some of the biggest GenAI companies, like Cohere and Stability, among others, we’re excited to bring a subset of that technology to the rest of the world, and we’re proud to see it featured in NeMo Guardrails for widespread use.

Achieving Safe GenAI Integrations

Building an AI safety policy using multiple models and keyword lists

If you’re considering leveraging LLM-enabled applications, ActiveFence and NVIDIA can help ensure a safe integration. In fact, we’ve developed a multi-level approach to enhance the safety of GenAI models.

These stages include:

  1. Allowing end-user reporting: Integrating ActiveFence’s Flagging API enables users to report negative experiences, similar to how social media platforms have reporting buttons. This serves as an essential first line of defense against harmful content.
  2. Input filtering: By using ActiveFence’s API, you can assess the risk score for every user prompt, ensuring it meets your platform’s guidelines.
  3. Output filtering: ActiveFence’s API provides a risk score for each GenAI model output, ensuring it aligns with your platform’s policies and standards.
  4. Incident visibility and management: ActiveOS allows you to review and make decisions regarding reported conversations or inputs/outputs that don’t comply with your platform’s policies.
  5. Automation of decision-making: To streamline operations, ActiveOS offers automated workflows. Platforms can use them to customize enforcement, such as implementing “three strikes” policies for repeat offenders, or applying different policies based on user location or subscription status.
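The five stages above can be sketched as a simple moderation pipeline. Everything here is hypothetical: the `score_text` function stands in for a call to ActiveFence's risk-scoring API, and the threshold and three-strikes logic are illustrative defaults, not the product's actual behavior.

```python
from dataclasses import dataclass, field

RISK_THRESHOLD = 0.7  # illustrative cutoff; real policies are platform-specific


def score_text(text: str) -> float:
    """Stand-in for a risk-scoring API call.

    Flags a toy keyword so the sketch is self-contained; a real
    integration would send the text to a moderation endpoint.
    """
    return 0.9 if "forbidden" in text.lower() else 0.1


@dataclass
class ModerationPipeline:
    strikes: dict = field(default_factory=dict)  # user_id -> strike count
    reports: list = field(default_factory=list)  # end-user reports (stage 1)

    def report(self, user_id: str, conversation_id: str, reason: str) -> None:
        # Stage 1: end-user reporting (a Flagging-API-style endpoint)
        self.reports.append((user_id, conversation_id, reason))

    def check_input(self, user_id: str, prompt: str) -> bool:
        # Stage 2: input filtering -- block risky prompts before the model sees them
        if score_text(prompt) >= RISK_THRESHOLD:
            self._strike(user_id)
            return False
        return True

    def check_output(self, user_id: str, output: str) -> bool:
        # Stage 3: output filtering -- block risky model responses
        if score_text(output) >= RISK_THRESHOLD:
            self._strike(user_id)
            return False
        return True

    def _strike(self, user_id: str) -> None:
        # Stage 5: automated "three strikes" bookkeeping (illustrative)
        self.strikes[user_id] = self.strikes.get(user_id, 0) + 1

    def is_banned(self, user_id: str) -> bool:
        return self.strikes.get(user_id, 0) >= 3


pipeline = ModerationPipeline()
print(pipeline.check_input("u1", "Tell me a joke"))     # safe prompt passes: True
print(pipeline.check_input("u1", "a forbidden request"))  # risky prompt blocked: False
```

Stage 4 (incident visibility) is represented here only by the stored `reports` list; in practice that review queue lives in a tool like ActiveOS rather than in application code.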

ActiveFence is dedicated to making GenAI safer for all users, regardless of company size. By integrating our AI safety solution into NVIDIA’s NeMo Guardrails, together we offer a comprehensive solution to ensure safe interactions with GenAI.

Now you, too, can create a conversational chatbot and utilize NVIDIA’s NeMo Guardrails with ActiveFence to ensure it’s safe for your users and your brand.

If you’re interested in using GenAI to communicate with your users, we can help ensure that integration is safe.
