Content Moderation 101

July 5, 2022

What is Content Moderation?

Content moderation refers to the actions online platforms take to ensure that the content they host meets specified guidelines, and to determine whether that content should remain online. Guided by platform policy, content moderation draws on a wide range of tools and services, including human moderators, automated tools, on-platform and off-platform intelligence, and deployment actions.

Learn how ActiveFence’s Trust & Safety Platform orchestrates all content moderation operations from one single platform.

The challenges of content moderation 

The billions of communications that take place on platforms contain many risks. Some are easily moderated, while many others present complex challenges to content moderation.
Here are some of the main challenges:

Employee risk

Working in content moderation necessarily involves exposure to harmful content of all types. Prolonged exposure to harmful content has been associated with the development of PTSD, anxiety, and depression. To protect human moderators, technology platforms must apply automated mechanisms that reduce the load on them, and take measures to support employee wellbeing and improve resilience.

To learn about the risks and solutions to these problems, read about Building resilience for Trust & Safety teams.

Recall v. Precision

Content moderation involves scanning high volumes of content in all languages and formats on an ongoing basis, ensuring that potentially harmful content is labeled as such and not overlooked. The share of harmful content that is actually caught is known as recall. Alongside high volume and fast turnaround, moderation mechanisms must also maintain high precision, meaning that content labeled as harmful really is harmful and false positives are kept to a minimum. Content moderation must therefore strike a delicate balance: making nuanced decisions at high volume, without falsely labeling innocuous content as harmful or letting harmful content pass as innocuous.
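
The two measures can be made concrete with a short calculation. The sketch below is illustrative only: the function name and the sample content IDs are hypothetical, and it assumes the truly harmful items are known, which in practice requires a human-labeled sample.

```python
def precision_recall(flagged, harmful):
    """Return (precision, recall) for two sets of content IDs.

    flagged: IDs the moderation system labeled as harmful.
    harmful: IDs that are actually harmful (assumed known from a labeled sample).
    """
    true_positives = len(flagged & harmful)
    # Precision: of everything flagged, how much was really harmful?
    precision = true_positives / len(flagged) if flagged else 0.0
    # Recall: of everything really harmful, how much was flagged?
    recall = true_positives / len(harmful) if harmful else 0.0
    return precision, recall


# Hypothetical sample: five items are harmful; the system flags four,
# three of them correctly ("x" is a false positive, "d" and "e" are missed).
flagged = {"a", "b", "c", "x"}
harmful = {"a", "b", "c", "d", "e"}
precision, recall = precision_recall(flagged, harmful)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.75 recall=0.60
```

Flagging more aggressively would catch "d" and "e" and raise recall, but would typically add more false positives like "x" and lower precision.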

Language barriers and global differences

Online communication occurs in multiple languages and across the globe, presenting a multitude of challenges. For starters, many content moderation mechanisms only support English. With only 60% of online content in English, this leaves a huge gap in moderated content. Such a gap can create discrimination toward minority groups who speak marginalized languages. Aside from the challenge of languages, platforms that operate worldwide must grapple with countries that define harm differently. For example, countries may have different regulations regarding cultural slurs or how alcohol consumption is spoken about. This presents issues not only on the level of content moderation but on compliance with national laws regarding online content. 

Public Relations

Today, online platforms are under greater scrutiny for their content moderation activities. Facing backlash on both ends, platforms are questioned for moderating either too much or not enough. Decisions to take down content or leave it online may be reported in the media, particularly when they involve high-profile users, as may cases of harmful content that went undetected.

Operational Complexities

Finding the right balance between vendors, in-house technology, and human moderators can be complex. Workflows, processes, and prioritization must be determined.

As the volume of content grows, so does the moderation load. Platforms must handle the growing workload as efficiently as possible. This involves automating as much as possible, establishing seamless workflows that allow for global coverage, implementing effective prioritization models, outsourcing to vendors, and defining clear operational metrics.

Learn more about Measuring Trust & Safety.

Choosing the right tool stack

Content moderation encompasses many elements, each of which looks different across online platforms of different sizes and orientations. However, to ensure the safety of users, a comprehensive, end-to-end solution is needed to moderate effectively. Evaluating which tools would be most effective for your platform, while weighing other factors such as available resources, can be daunting.

The Philosophy: Building Safety from the Start

Building safety from the start is where content moderation begins. From development to product deployment, online platforms must consider user safety each step of the way. Safety by design, a principle that prioritizes building a product with protection at its center, ensures that all product decisions put users first. Product features such as user flagging and age verification gates are small details that make significant differences. Ultimately, content moderation can be more effective with safety by design principles.

Discover the core principles of Safety by Design.

The Foundation of Content Moderation: Policy

Platforms’ policy, or community guidelines, provides the framework for content moderation. These are the basic rules of engagement for users, defining what is and isn’t allowed on a platform and determining the focus and breadth of content moderation. 

The baseline of content moderation, platform policy must be comprehensive. When creating policies, platforms should consider the following:

Global Legislation

Regulators worldwide place requirements on platforms surrounding malicious content. These regulations are becoming more prevalent, with new legislation cropping up around the globe. Fines and even imprisonment are among the penalties for non-compliance. 

Learn more about the regulations of over 60 countries on online hate speech, online terrorist content, and disinformation, some of which are outlined in the recently implemented TERREG legislation and the UK’s upcoming Online Safety Bill.

Abuse Areas

Many forms of abuse take place online and require specific policies to prevent them. While each platform’s policy team will determine the abuse areas it places focus on, some core abuses include:

  • Violence
  • Terror and extremism
  • Human exploitation
  • Child abuse
  • Disinformation and misinformation
  • Hate speech
  • Profanity
  • Sale of illegal goods
  • Copyright infringement
  • Cybercrime
  • Spam
  • Fraud and scams

You can find our overview of policy development in our blog The Trust & Safety Policy Review and an exploration of the role Trust & Safety teams play in policy building in our comprehensive guide, The Trust & Safety Industry: A Primer.

The People: Digital First Responders

The people behind content moderation are content moderators, also known as digital first responders. Responsible for reviewing user-generated content submitted to platforms, digital first responders ensure that what is on-platform follows policy and keeps users safe.

Platform size, audience, and orientation are among the factors that determine the shape of a company’s content moderation team. At larger companies, content moderators fall within Trust & Safety teams, whereas at smaller companies, content moderation may fall under IT, support, legal, or even marketing. For example, a large social media platform may require a robust team, whereas a platform with few users may need only one person to respond to user inquiries. A more robust team may include separate functions such as intelligence, abuse-area and language expertise, policy, and operations.

Examples of content moderation roles include:

  • Web intelligence and open-source analysts gather and analyze data from across the web to identify harms on and off platforms.
  • Subject matter experts such as disinformation researchers or child safety experts provide the extra, necessary layer of intelligence to understand complex networks of malicious activity.
  • Policy roles, including general counsel and content policy managers, ensure that platform policy is effective, compliant with global legislation, and properly enforced.
  • R&D positions such as data scientists or computer engineers work to ensure the safety of the platform itself, and, at larger companies, may develop content moderation tools.

Read our guide to building a Trust & Safety team to learn more. 

AI and human moderation

How to implement content moderation

The right harmful content detection tools must be selected for effective content moderation. To precisely and efficiently moderate vast volumes of content, content moderation teams must use a blend of both automated and manual (human) detection. 

Automated content moderation relies on artificial intelligence and related technologies, such as natural language processing (NLP), machine learning, and digital hash technology. These allow for scale and speed: harmful text, images, videos, and audio can be identified automatically. This form of content moderation has a high recall rate, but with a high volume of flagged content comes a higher risk of false positives. To increase precision, the human element is needed for contextual, nuanced detection. For example, words and phrases can be hateful in one context and friendly in another. In such cases, the who, what, where, and why are essential to understanding whether content is hateful. Educational content is another example where understanding nuance is necessary: in some cases, depictions of violence may be within policy bounds if they are used for educational purposes.
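
One way to picture the blend of automated and human detection is a routing step that sends only uncertain cases to moderators. The sketch below is a simplified illustration, not a description of any particular product. The thresholds, the function names, and the idea of a single classifier score between 0 and 1 are all assumptions, and the exact-match SHA-256 lookup stands in for the hash technology mentioned above (production systems often use perceptual hashes that tolerate small changes to a file).

```python
import hashlib

# Hypothetical thresholds; real values would be tuned per platform and policy.
AUTO_ACTION_THRESHOLD = 0.95
HUMAN_REVIEW_THRESHOLD = 0.50


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for a database of hashes of previously confirmed violating content.
KNOWN_VIOLATING_HASHES = {sha256_hex(b"previously confirmed violating file")}


def route(content: bytes, model_score: float) -> str:
    """Decide how a single piece of content is handled.

    model_score is an assumed classifier output in [0, 1], where higher
    means the model considers the content more likely to violate policy.
    """
    # Exact match against known violating content: no review needed.
    if sha256_hex(content) in KNOWN_VIOLATING_HASHES:
        return "auto_action"
    # Very confident model decisions are handled automatically.
    if model_score >= AUTO_ACTION_THRESHOLD:
        return "auto_action"
    # Uncertain cases go to a human for contextual, nuanced review.
    if model_score >= HUMAN_REVIEW_THRESHOLD:
        return "human_review"
    return "allow"


print(route(b"previously confirmed violating file", 0.0))  # auto_action
print(route(b"ambiguous post", 0.70))                      # human_review
print(route(b"harmless post", 0.10))                       # allow
```

Lowering the human-review threshold sends more content to moderators, which tends to improve recall at the cost of a heavier review load.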

Learn more about content moderation tools.

Content moderation actions: Enforcement 

A crucial part of content moderation is enforcing policy when a user violates platform rules. Enforcement actions vary, from simply labeling potentially harmful content or warning users to completely removing content or an account.

Access The Guide to Policy Enforcement for a full menu of enforcement actions teams can implement to ensure a fairer and safer platform.

Choosing the right approach to content moderation

Content moderation looks different for every platform, making choosing a suitable solution challenging. Once platforms have established their policy, they can begin building their content moderation strategy. The following considerations should be taken into account when determining the right approach:

  • Platform audience: age, geographic location, interests
  • Content volume
  • Content formats on the platform, such as text, audio, video, and images
  • Whether the platform is for one-to-one communications or public communications
  • Prioritization of prohibited content: based on laws, user interests, and general risks, which threats should be handled first?
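
The last consideration, deciding which threats are handled first, can be sketched as a priority queue for the review workload. This is a minimal illustration under stated assumptions: the class name, the item IDs, and the priority ranks are hypothetical, and each platform would set its own ordering based on laws, user interests, and general risks.

```python
import heapq
import itertools

# Hypothetical priority ranks (lower number = reviewed first).
ABUSE_PRIORITY = {
    "child_abuse": 0,
    "terror_and_extremism": 1,
    "fraud_and_scams": 2,
    "spam": 3,
}
DEFAULT_PRIORITY = 9  # Abuse areas without an explicit rank are reviewed last.


class ReviewQueue:
    """Review queue that returns the highest-priority flagged item first."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so items with equal priority come out in arrival order.
        self._counter = itertools.count()

    def add(self, item_id, abuse_area):
        rank = ABUSE_PRIORITY.get(abuse_area, DEFAULT_PRIORITY)
        heapq.heappush(self._heap, (rank, next(self._counter), item_id))

    def next_item(self):
        """Return the next item ID to review, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = ReviewQueue()
queue.add("post-1", "spam")
queue.add("post-2", "child_abuse")
queue.add("post-3", "spam")
print(queue.next_item())  # post-2
print(queue.next_item())  # post-1
print(queue.next_item())  # post-3
```

Keeping the ranks in a single table means the ordering can be revised when policy or legal requirements change, without touching the queue logic.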

For most platforms, a combination of human and automated moderation provides the right balance of scale and accuracy. ActiveFence’s proactive approach to content moderation uses automated detection alongside a team of subject-matter and linguistic experts, providing a high volume of highly precise findings. With ActiveFence, teams can scale their content moderation efforts precisely and efficiently. 

Want to learn more about content moderation? Check out our on-demand webinar “New Approaches to Dealing With Content Moderation Challenges.”