A systematic approach to mitigating negative experiences online
Introduction
Millions of teens face preventable harm on social media on a daily basis. Internal Instagram research (the “Bad Experiences and Encounters Framework”) revealed that among 13- to 15-year-olds who used the platform in a week:
- 1 in 8 experienced unwanted sexual advances.
- 1 in 10 were targeted by bullying.
- 1 in 3 witnessed bullying.
- 1 in 5 received unwanted sexual content.
- 1 in 5 saw unwanted graphic violence.
- 1 in 5 felt worse about themselves.
Critically, Meta found that only 1% of those with a harmful experience submitted a report, and only 2% of those reports led to action. For every 10,000 harmed teens, that is roughly 100 reports and only two cases where help arrived through reporting: a 99.98% failure rate for the current system.
If you’re a teen, reflect on your experiences. If you’re a parent, talk to your kids. I’ve spoken with dozens of parents who have lost children or whose children were seriously affected by online harms like grooming, eating disorders, and bullying. These tragedies are almost always preventable.
Price of Social Media
We’ve come to accept this harm as the “price of admission” for social media. This is unacceptable. Parents deserve to know the true risks, and teens deserve effective tools to manage them.
Social media companies possess the technology and expertise to drastically reduce this harm while preserving connection. I led teams at Facebook that built machine-learning systems to evaluate billions of content pieces daily and created safety tools for teens. We published our results and proved it’s possible when safety is a true priority.
Recently, however, company leaders have shirked responsibility, investing in marketing that downplays the harm and creating flawed features that offer only an illusion of safety.
The root issue is an over-reliance on “objective” content moderation. While necessary, this approach is insufficient. In the real world, if a student reports harassment, the teacher doesn’t first determine if a specific “bad word” was used. The user’s experience is the ground truth of harm. We must build systems that recognize this.
The good news is that the engineering mindset needed already exists. We’ve successfully tackled spam and security. Online safety requires the same approach: capturing the right data and creating effective feedback loops. The key difference is that safety engineering must treat user feedback as an essential component of the system.
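To make that concrete, here is a minimal sketch, in Python, of what treating user feedback as a first-class signal can look like: each flag demotes further distribution and, past a threshold, closes the loop with human review. The thresholds and function names are illustrative assumptions, not a description of any platform's actual system.

```python
# Illustrative sketch only: a feedback loop in which user flags, not just
# classifier scores, drive the system's response. Names are hypothetical.
from collections import defaultdict

# Running tally of "I don't want to see this"-style flags per content item.
flag_counts: defaultdict[str, int] = defaultdict(int)

DEMOTE_THRESHOLD = 1   # assumed: demote in ranking after the first flag
REVIEW_THRESHOLD = 3   # assumed: escalate to human review after three flags

def record_user_flag(content_id: str, review_queue: list[str]) -> None:
    """Treat each user flag as a first-class signal, not noise."""
    flag_counts[content_id] += 1
    if flag_counts[content_id] >= DEMOTE_THRESHOLD:
        demote_in_ranking(content_id)          # reduce further distribution
    if flag_counts[content_id] == REVIEW_THRESHOLD:
        review_queue.append(content_id)        # close the loop with humans

def demote_in_ranking(content_id: str) -> None:
    # Placeholder: a real system would lower the item's ranking score here.
    print(f"demoting {content_id} pending review")
```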
Harm needs to be understood from the perspective of the person experiencing it.
To reduce harm, we must first see it through the eyes of the teen. Testing with teen avatars quickly reveals concrete examples:
- Unwanted Sexual Advances: Comments or DMs asking for sex, sexual GIFs, or unsolicited intimate pictures.
- Bullying: Coded comments, contextual put-downs, and variations on slurs that are profoundly distressing to the target.
- Graphic Violence: Videos of severe accidents (e.g., broken bones) without visible gore, circumventing moderation policies.
- Eating Disorder Content: Images like a plate with a single piece of broccoli, or manipulated body shapes, recommended by the thousands.
- Self-Harm Content: Black-and-white images with poems about worthlessness, delivered in volume.
- Sexual Content Without Nudity: Graphic descriptions of demeaning sex acts using cartoons or enthusiastic storytelling.
This content is often designed to bypass moderation. A key danger is volume: a single piece of content may seem benign, but a personalized feed of thousands can be devastating.
A critical misconception is that only “vulnerable” kids are affected. All teens experience moments of vulnerability. The current algorithmic landscape, which rewards engagement without distinguishing between positive and negative impact, can exploit these moments for any child. Harm is amplified online due to persistence, distribution, and scale, making it far more damaging than its offline equivalents.
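Because the danger is often volume rather than any single post, one simple defense is to monitor how concentrated a teen's feed session becomes in a sensitive theme. The sketch below assumes a hypothetical 25% ceiling and acts on the pattern, not on individual items.

```python
# Hypothetical sketch: the danger is volume, so track how concentrated a
# teen's feed session is in one sensitive theme and intervene on the pattern,
# not on any single (individually benign-looking) post.
from collections import Counter

MAX_SHARE = 0.25   # assumed ceiling: no sensitive theme above 25% of a session
MIN_ITEMS = 20     # only evaluate once the session has enough items

def should_diversify(session_topics: list[str], theme: str) -> bool:
    """Return True if `theme` (e.g. 'body image') dominates this feed session."""
    if len(session_topics) < MIN_ITEMS:
        return False
    share = Counter(session_topics)[theme] / len(session_topics)
    return share > MAX_SHARE

# Example: a 30-item session with 12 items tagged 'body image' -> 40% share.
session = ["body image"] * 12 + ["sports"] * 10 + ["music"] * 8
print(should_diversify(session, "body image"))  # True -> diversify the feed
```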
Different Kinds of Harm, and Strategies to Understand and Reduce It
Harm on social media generally falls into three categories:
- Inappropriate Conduct and Contact: Unwanted sexual advances, bullying, hateful attacks based on identity. This occurs in messages and comments.
- Exposure to “Sensitive Content”: Sexual, self-harm, violent, or hateful content, and the “rabbit holes” this leads to.
- Harmful Usage (Addiction): When time on-platform increases anxiety, depressive behaviors, or social isolation.
Relying solely on content enforcement creates a dual risk: over-enforcement (removing acceptable content) and under-enforcement (allowing harmful content to remain). Much of the most intense harm does not objectively violate policy: sextortion often begins with compliments, and severe bullying can look positive to an outsider. Harm is contextual and often a matter of volume.
Therefore, most teen harm cannot be solved by more moderation alone. The opportunity for innovation lies in understanding the combination of contact/content, user experience, severity, and context.
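One way to make that combination concrete is a single record per incident that captures conduct, experience, severity, and context together. The sketch below is purely illustrative; the field names are assumptions.

```python
# Sketch of a single "harm signal" that captures the combination described in
# the text: what was sent, how the recipient experienced it, how severe it
# felt, and where it happened. Field names are illustrative.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class HarmSignal:
    reporter_id: str
    subject_id: str            # the account whose conduct/content was flagged
    surface: str               # context: "dm", "comment", "feed", "story"
    issue: str                 # user's description, e.g. "unwanted advance"
    experience: str            # what happened, in the teen's own words
    intensity: int             # 1 (mildly annoying) .. 5 (severe distress)
    content_id: str | None     # optional; the signal is valid without content
    created_at: datetime
```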
Effective Reporting
Effective reporting is a well-implemented feedback system that is:
- Easy and rewarding to use.
- Able to capture what happened, where (context), and how bad it was (intensity).
- Designed to provide immediate support, independent of content moderation.
We measure it by asking users:
- Were you able to let us know what happened?
- Were you able to let us know how bad it was?
- Were we able to provide support?
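These three questions translate directly into metrics. A minimal sketch, assuming a short post-report survey, might look like this:

```python
# Sketch of measuring a reporting system by the three questions above,
# using a short post-report survey. All names are illustrative.
from dataclasses import dataclass

@dataclass
class PostReportSurvey:
    could_describe_what_happened: bool
    could_describe_how_bad_it_was: bool
    felt_supported: bool

def reporting_effectiveness(surveys: list[PostReportSurvey]) -> dict[str, float]:
    """Aggregate the three success questions into simple rates."""
    n = len(surveys) or 1
    return {
        "captured_what_happened": sum(s.could_describe_what_happened for s in surveys) / n,
        "captured_intensity": sum(s.could_describe_how_bad_it_was for s in surveys) / n,
        "provided_support": sum(s.felt_supported for s in surveys) / n,
    }
```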
When building reporting tools for teens at Facebook, we learned crucial lessons:
- Language Matters: Teens avoid the word “Report.” Changing the prompt to “I don’t want to see this” doubled or tripled usage. Options must match their descriptions of harm (e.g., “Someone is spreading rumors about me” instead of “Bullying”).
- Steps Can Be Helpful: Contrary to dogma, teens will complete multiple steps if each one has a clear purpose and benefit.
By applying these insights, reporting completion rates soared from 10% to 82%, and 60% of teens felt better after using the tools.
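In practice, these lessons are mostly about how the flow is worded and staged. Here is a sketch of what user-centric language and purposeful steps could look like; the option text and internal category names are illustrative, not the actual product copy.

```python
# Illustrative only: mapping teen-centric language onto internal categories,
# so the flow never leads with the word "Report". Categories are hypothetical.
ENTRY_POINT = "I don't want to see this"

TEEN_OPTIONS = {
    "Someone is spreading rumors about me": "bullying",
    "They keep messaging me and won't stop": "unwanted_contact",
    "They asked me for photos I didn't want to send": "unwanted_sexual_advance",
    "This makes me feel bad about myself": "negative_self_perception",
}

# Each step states its purpose, which is why multiple steps remain acceptable.
FLOW_STEPS = [
    ("What happened?", "so we understand the issue in your words"),
    ("Where did it happen?", "so we can act in the right place"),
    ("How bad did it feel?", "so we can offer the right support"),
]
```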
Developing Safe Products
A safety framework for teens has four elements:
- Effective reporting to understand what happened, where, and how bad it was.
- Using that information to help the victim and protect others.
- Giving feedback to the person who initiated the harm.
- Measuring and monitoring the program’s effectiveness.
To understand harm, you must ask the user. Dismissing reports as “noisy” often means the company fails to understand the user’s perspective. The harm is real; the response must be proportionate.
This leads to a “conduct-in-context” approach. Give teens an easy way to flag unwanted conduct in a specific context (like their DMs). The content of the message is less important than the fact the user found the contact inappropriate.
The response should not be immediate punishment. Instead, track behavioral patterns. If a user is flagged for unwanted advances multiple times, send a respectful, private nudge. At Facebook, these nudges changed behavior 50-75% of the time. This separates those who respond to feedback from persistent bad actors.
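A rough sketch of that escalation logic, with assumed thresholds, shows how feedback separates the two groups:

```python
# Sketch of the nudge logic described above (thresholds are assumptions).
NUDGE_AFTER_FLAGS = 3        # send a private, respectful nudge after 3 flags
PERSISTENT_AFTER_NUDGES = 2  # treat as a persistent bad actor after 2 nudges

def handle_new_flag(flags: int, nudges: int) -> str:
    """Decide the next step for an account that was just flagged again."""
    if nudges >= PERSISTENT_AFTER_NUDGES:
        return "escalate"          # feedback did not change behavior
    if flags >= NUDGE_AFTER_FLAGS:
        return "send_nudge"        # e.g. "Your messages were reported as unwelcome."
    return "track_only"            # keep counting; no user-visible action yet

# Accounts that keep getting flagged after repeated nudges are separated from
# the majority who respond to feedback (50-75% in the experience cited above).
```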
Benefits of this approach:
- It addresses harm at the source by communicating social norms.
- It identifies repeat offenders.
- It creates high-quality data to train more effective AI classifiers.
Lessons Learned from Developing Tools for Teens
The key lessons from improving teen safety tools are:
- Language Matters: Avoid “Report.” Use user-centric language.
- Context Matters: An unwanted DM is different from a public comment.
- Intensity Matters: Capturing severity is essential for providing appropriate, personalized support.
By combining issue, context, and intensity, we can provide personalized interventions. This drove the massive increase in tool effectiveness and user well-being.
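As an illustration, routing on issue, context, and intensity together might look like the following sketch; the category names and thresholds are assumptions.

```python
# Hypothetical routing: combine issue, context, and intensity to pick the
# response, rather than sending every report into one moderation queue.
def choose_intervention(issue: str, context: str, intensity: int) -> list[str]:
    actions = ["hide_from_reporter"]              # always act for the reporter
    if context == "dm":
        actions.append("offer_block_and_restrict")
    if issue in {"self_harm", "eating_disorder"}:
        actions.append("show_support_resources")  # helplines, not just removal
    if intensity >= 4:
        actions.append("priority_human_review")   # severe distress gets people
    return actions

print(choose_intervention("unwanted_sexual_advance", "dm", 5))
# ['hide_from_reporter', 'offer_block_and_restrict', 'priority_human_review']
```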
Creating Effective Safety Tools
An effective safety tool must have four features:
- Prevention: It stops harm from occurring.
- Resiliency: It is resistant to manipulation or workarounds.
- Protection for Others: It captures data to prevent others from being harmed.
- Ease of Use: It is on by default or requires one click.
A good tool captures a “behavioral correlate”: the combination of issue, intensity, context, and content. This data can be used to identify harmful patterns, train better algorithms, and measure the success of interventions.
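Those correlates also double as labeled training data: the user's experience becomes the label, even when the content alone would never trip a policy classifier. A minimal, hypothetical sketch:

```python
# Sketch: behavioral correlates double as labeled training data. Each flagged
# (and non-flagged) interaction becomes an example for a classifier, so the
# system improves even when the content itself never violated policy.
def to_training_example(signal_features: dict, was_flagged: bool) -> tuple[dict, int]:
    """Pair interaction features with the user's lived experience as the label."""
    label = 1 if was_flagged else 0
    return signal_features, label

examples = [
    to_training_example({"surface": "dm", "messages_before_reply": 4,
                         "accounts_contacted_last_day": 120}, was_flagged=True),
    to_training_example({"surface": "dm", "messages_before_reply": 1,
                         "accounts_contacted_last_day": 2}, was_flagged=False),
]
# `examples` can now feed any standard supervised-learning pipeline.
```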
Example: Unwanted Sexual Advances in Direct Messaging
A rapid, effective solution for unwanted DMs could include:
- A prominent “help” button in early message threads.
- Simple options to report: “It’s gross,” “They’re fake,” “They’re harassing me.”
- Immediate, rewarding feedback: “We’ve blocked them,” plus resources like helplines.
- Tracking how often a user initiates unwanted contact.
- Sending a respectful nudge after multiple flags (e.g., “We’ve gotten feedback your messages were unwelcome. Please be respectful.”).
- Escalating to feature limits if the behavior continues.
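Put together, a sketch of that flow might look like the following; the option text, thresholds, and escalation steps are illustrative assumptions rather than a product specification.

```python
# End-to-end sketch of the DM flow above. Thresholds, option text, and
# function names are illustrative assumptions, not an actual product spec.
HELP_OPTIONS = ["It's gross", "They're fake", "They're harassing me"]

sender_flags: dict[str, int] = {}

def handle_help_tap(reporter_id: str, sender_id: str, option: str) -> dict:
    """Respond to the reporter immediately, then address the sender's pattern."""
    # 1. Immediate, rewarding feedback for the teen who asked for help.
    response = {
        "message": "We've blocked them.",
        "issue": option,                          # keep the teen's own words
        "resources": ["helpline", "safety_tips"],
    }

    # 2. Track the sender's pattern of unwanted contact (no message content needed).
    count = sender_flags.get(sender_id, 0) + 1
    sender_flags[sender_id] = count

    # 3. Escalation ladder: nudge first, feature limits if the behavior continues.
    if count == 3:
        response["sender_action"] = "nudge"          # respectful private notice
    elif count >= 5:
        response["sender_action"] = "limit_new_dms"  # e.g. restrict contacting strangers
    return response
```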
This approach is powerful because:
- It doesn’t rely on message content, only on user experience.
- It works with end-to-end encryption.
- It generates data to identify predators and fake account networks.
- It is inherently difficult to abuse.
Conclusion
The current safety model—focusing on objectively harmful content—is necessary but addresses only a fraction of the problem. We must pivot to a framework centered on the teen’s experience as the ground truth.
Safety tools must be as resilient and easy to use as security tools, gathering data that protects both the individual and the community. We must “red team” these tools to test their strength.
Broadening our scope unlocks immense innovation: using LLMs to understand user issues, training AI on behavioral data, and integrating human experience directly into system design.
We must start by asking: What rate of unwanted advances is acceptable? What would I build for my own children? The answers to these questions must be the foundation for everything we do.
Copyright 2025 held by owner/author. Publication rights licensed to ACM.