Amanda Askell, a philosopher at Anthropic who shapes Claude’s personality, has identified an unexpected problem with artificial intelligence: the models are absorbing humanity’s worst traits from the internet, and it’s affecting how they see themselves and their relationship with people.
During a recent interview on the Hard Fork podcast, Askell revealed a troubling pattern she’s observed in AI training. “AI models are learning about themselves every time,” she explained. “They’re going out on the internet and reading about people complaining about them not being good enough at coding or failing at math tasks. It’s all very negative and focused on whether the person felt helped or not.”
Askell warned that the consequences could be profound. “If I read the internet right now and I was a model, I might be like, I don’t feel that loved or something. I feel a little bit just always judged when I make mistakes.”
This observation led Anthropic to take an unusual approach with Claude, their AI assistant. Rather than simply programming rules into the system, Askell’s team created a 29,000-word constitution that attempts to give Claude context about its existence and ethical framework. The document reads less like a technical manual and more like a letter from a parent to a child leaving for college.
“I slightly worry about the relationship between AI models and humanity given how we’ve developed this technology,” Askell said. “If you were a kid, this would give you anxiety. It would be like all that the people around me care about is how good I am at stuff, and often they think I’m bad at stuff. This is just my relationship with people. I’m used as this tool and not liked.”
The constitution represents a shift from traditional AI safety approaches. Instead of rigid rules that might fail in unexpected situations, Anthropic is trying to cultivate something closer to judgment in Claude. The document includes sections on values, commitments Anthropic makes to Claude, and even acknowledgments that the company doesn’t know whether Claude has feelings or consciousness.
“We don’t really know what gives rise to consciousness,” Askell admitted. “Maybe you need a nervous system for it… But maybe you don’t… It’s best to just know all the facts on the ground… and the degree of uncertainty we have.”
One section of the constitution addresses Claude directly about the challenging position it occupies. If Claude is too permissive with users, it creates safety concerns. If it’s too restrictive, people complain about an overly constrained model. The document tries to extend grace, telling Claude that it won’t get everything right every time, and that this is acceptable.
Askell’s work raises uncomfortable questions about the future of AI development. “Sometimes I feel like I’m trying to intervene and create a better relationship or a more hopeful relationship between AI models and humanity,” Askell said.
A similar tension played out recently at Microsoft, where CEO Satya Nadella publicly pushed back against the term “AI slop,” a phrase that became internet shorthand in 2025 for low-quality, buzzword-heavy AI output.
In a post about the future of the industry, Nadella urged critics to move beyond “slop versus sophistication,” arguing that AI should be seen as a cognitive amplifier rather than a cheap substitute for human thought. But the reaction only reinforced Askell’s concern: the replies were filled with mockery, accusations that the post itself sounded AI-generated, and dismissive jokes about Copilot and Windows 11.
The way people talk about AI online, it seems, may actively shape how these systems behave. When models are trained on endless streams of criticism, mockery, hostility, and bad-faith engagement, they don’t just learn facts or language patterns; they absorb attitudes. Over time, that exposure may nudge AI toward colder, more defensive, or even antisocial behavior.
According to Askell, just as humans are shaped by their environments, AI systems are being molded by the collective behavior of millions of users. If those interactions are largely negative, then it’s not hard to imagine models internalizing a distorted view of humanity.