ActiveOS Updates: Username Detection, Enhanced Media Coverage, and More

By Emma Datny
May 16, 2024


Watch our on-demand demo series - Demo Tuesdays

Our product team is continually delivering new features and enhancements for ActiveOS and ActiveScore, including new AI models, enhanced media format coverage, and more.

Here are the new releases for this month:

New drug solicitation model: Identify and take action against illegal drug activities by automatically detecting violative communication, including the euphemisms, slang, emoji, and code words that often elude standard detection methods.
New violative usernames models: Stop violative users from the first login by detecting usernames that reference hate speech, adult content, profanity, and drugs.
Enhanced, self-serve multilingual keyword management: Enable more accurate multilingual detection to ensure nothing is missed in undefined or mismatched languages.
Greater audio + video coverage with real-time transcription: Catch potential violations that might be missed in audio- or video-only analyses by analyzing the transcripts of these files in real time.

Check out more details about each feature below:

Automatically identify and tackle illegal drug activities

Detecting and stopping communication related to drug solicitation is becoming more challenging as bad actors constantly find new ways to evade detection. On top of that, the prevalence of illegal content on a platform poses legal risks.

To tackle this issue, our team has developed ActiveScore’s new drug solicitation contextual AI model. It adds another signal to our other ActiveScore models for detecting illegal drug activities, which cover violative usernames, keywords, images, and text. The model has been trained on intelligence collected by our in-house domain experts. By analyzing the context of conversations and understanding the use of euphemisms, slang, emoji, and code words, it can automatically detect content that often slips past standard detection methods.

You can access the model, like any of our other ActiveScore models, with one API integration. Each analyzed item will return a risk score from 0 to 100, indicating the likelihood of it containing drug solicitation. The results will also include associated indicators and descriptions.

The drug solicitation model can also be combined with our other models, such as drug images, violative usernames, and keywords, using the ActiveOS policy management tool. This allows you to customize your coverage based on relevant policies, addressing illegal drug activities from multiple angles.

Our base models improve continuously thanks to feedback loops from moderator decisions. By retraining on your unique policies, real-world drift, and up-to-date findings from our intelligence team, we are able to increase model accuracy over time.

For benchmark information, please see below:

Engaging in illegal activities such as drug solicitation can have serious legal ramifications. Current laws, like the US Senate’s Combating Cartels on Social Media Act of 2023 and the UK’s Online Safety Act 2023, are already in effect and enforced. Non-compliance with these laws can result in fines.

To ensure compliance and mitigate the risks, you can use ActiveOS codeless workflows to enforce against illegal drug activities at scale. For example, you can build a workflow that automatically removes any item with a violative risk score over 70 and promptly sends it to the relevant authorities. Items with a lower risk score can be routed to a high-priority queue for moderator review.
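
The routing rule in this example can be sketched in a few lines of Python. This is only an illustration of the threshold logic, not how ActiveOS workflows are built (those are codeless). The function name `route_item`, the action labels, and the `REMOVE_THRESHOLD` constant are hypothetical; the value 70 comes from the example above.

```python
# Hypothetical sketch of the example workflow's threshold logic.
# None of these names come from ActiveOS; they are for illustration only.
REMOVE_THRESHOLD = 70  # taken from the example; tune to your own policy


def route_item(risk_score: int) -> list:
    """Return the illustrative actions for an item with a 0-100 risk score."""
    if not 0 <= risk_score <= 100:
        raise ValueError("risk_score must be between 0 and 100")
    if risk_score > REMOVE_THRESHOLD:
        # Strictly above the threshold: remove and escalate.
        return ["remove", "report_to_authorities"]
    # Threshold or below: leave the decision to a human moderator.
    return ["send_to_high_priority_queue"]
```

Note that a score of exactly 70 goes to the review queue, because the example removes only items scoring over 70. In practice you would probably also set a lower bound below which items are not queued at all.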

Stop violative users from the first login with username detection

Usernames are a representation of a user’s identity or brand. While most users select non-offensive names, a troubling minority deliberately choose offensive, abusive, or toxic usernames. They may employ evasive terminology to convey hate speech, illegal content, or profane language.

To address this, we developed AI models specifically designed to detect violative usernames. They are trained to overcome the unusual structure of usernames, which frequently combine L33Tspeak, misspellings, letters, symbols, numbers, and unique phrases that may seem benign out of context. With these models, clients can weed out users with violative usernames, which often indicate involvement in illegal activities on the platform. This helps maintain a safe environment by stopping violative users at first touch.

The violative username models cover the following violations:

Drugs
Profanity
Adult Content
Hate Speech

As with all our ActiveScore models, we continually improve their accuracy through feedback loops, ensuring that they remain effective and up to date.

Here you can see our general benchmarks for more information:

On average, a small percentage of users, just 2-3%, are responsible for creating 40-50% of toxic content. With ActiveOS user-level views, you can easily see any flagged activity and take bulk actions for greater impact with fewer clicks.

Enable more accurate multilingual detection with keywords

Our keywords tool now offers more flexibility and can detect variations in language usage. This includes a new option to search keywords in all languages, so you won’t miss relevant content in undefined or mismatched languages.

The new “all languages” option searches keywords across every language rather than a single specified one, so you can catch variations that are otherwise challenging to search for. For example, content that uses evasive tactics to promote terrorism or drugs may not always appear in the language you expect. By enabling “all languages,” you can catch those hard-to-find keywords and take appropriate action.

This feature is particularly useful in the following scenarios:

Undefined Languages: Even if a language is undetected, the keyword detectors can still identify relevant keywords.
Mismatched Languages: If the language listed for a keyword does not match the language of the item it appears in, the tool can still detect the keyword when “all languages” is selected.
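
To make the two scenarios concrete, here is a minimal Python sketch of the decision involved. It rests on an assumption about how language scoping behaves (a keyword tied to a specific language is only checked against items detected as that language), and the names `ALL_LANGUAGES` and `keyword_applies` are hypothetical, not part of the keywords tool.

```python
from typing import Optional

ALL_LANGUAGES = "all"  # hypothetical marker for the "all languages" option


def keyword_applies(keyword_language: str, item_language: Optional[str]) -> bool:
    """Decide whether a keyword should be checked against an item.

    item_language is None when the item's language could not be detected.
    """
    if keyword_language == ALL_LANGUAGES:
        # Covers both undefined and mismatched item languages.
        return True
    # Assumed behavior: a language-specific keyword needs an exact match.
    return item_language is not None and keyword_language == item_language


print(keyword_applies("es", None))           # undefined language, specific keyword
print(keyword_applies(ALL_LANGUAGES, None))  # undefined language, all languages
print(keyword_applies("es", "en"))           # mismatched language, specific keyword
print(keyword_applies(ALL_LANGUAGES, "en"))  # mismatched language, all languages
```

Under these assumptions, the language-specific keyword is skipped in both scenarios, while the “all languages” keyword is checked in both.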

Below, you can see a sample of the detection changes made for exact, fuzzy, and partial matches, switching between a specified language and all languages:

Greater audio + video coverage with real-time transcription

By extracting and analyzing transcripts from audio or video files, you can now catch potential violations that might be missed in audio- or video-only analyses. This approach offers greater accuracy and speed in detecting violations. Moreover, it recognizes the context and subtleties of language use, further enhancing accuracy.

This works the same as our current APIs: you send the content, we transcribe it for analysis, and we return a combined score. The combined risk score is the maximum of the individual scores received. For example, if the transcribed text yields a risk score of 80 but the video frames only show a risk of 50, we report a combined score of 80.
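
The max-score rule can be written as a one-line function. The sketch below is illustrative only: `combined_risk_score` and the component names `transcript` and `video_frames` are hypothetical, not fields of the actual API response.

```python
def combined_risk_score(component_scores: dict) -> int:
    """Combine per-component risk scores (0-100) by taking the maximum."""
    if not component_scores:
        raise ValueError("at least one component score is required")
    return max(component_scores.values())


# The example from the text: transcript scores 80, video frames score 50.
scores = {"transcript": 80, "video_frames": 50}
print(combined_risk_score(scores))  # 80
```

Taking the maximum means a single high-risk signal is never diluted by lower scores from the other components, which an average would do.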

Stay tuned, as we are continuing to work on many more exciting features and enhancements for ActiveOS.

If you’re interested in learning more, we invite you to our ongoing demo series, Demo Tuesdays. It’s a great opportunity to see the product in action, meet our team, and ask any questions you may have! You can also schedule a 1-1 demo session with us.

Thanks,

The ActiveFence Team
