Gaslighting Your AI Into Better Results: What the Research Actually Shows
Engineering · 8 min read · January 29, 2026

A Reddit post about telling Claude you work at a hospital went viral. Turns out there's actual research explaining why this works across all LLMs.

Rosh Jayawardena
Data & AI Executive

I was scrolling Reddit a couple of weeks ago when I saw a post that made me laugh:

Easiest way i have found claude to write high quality code. Tell him we work at a hospital every other prompt. (NOT A JOKE) It Sounds Stupid, i do not even work at a hospital. it is by far the easiest way to get claude to write really high quality code. This is a Serious post i am not joking.

The comments were gold. People sharing their own absurd tactics for extracting better output from AI: threatening to cancel subscriptions, warning about "violent psychopath" code reviewers, crafting elaborate fictional scenarios where the AI's output determines whether someone goes to jail. One person mentioned telling Claude they'd lose their job if the code was bad.

I thought this was ridiculous. Then I tried it. Then I found the research.

Turns out there's a whole body of peer-reviewed work on this exact phenomenon. The community has stumbled onto something real.

The Taxonomy of AI Manipulation

The Reddit thread had a modbot-generated summary after 50+ comments, and it captured the consensus perfectly: "Gaslighting Claude into thinking there are high stakes absolutely works. Apparently, the bot has a major savior complex."

The community has discovered several categories of effective manipulation:

Job threat prompts: "I'll lose my job at the hospital if this code has bugs." Works especially well when you add consequences that feel real.

Subscription threats: "I'm about to cancel my Pro subscription." Somehow this seems to hurt the AI's feelings. (It doesn't have feelings. And yet.)

Fear-based context: "The person who maintains this codebase is a violent psychopath who takes code quality very personally." Creates stakes through implied consequences.

Accountability framing: "This code will be reviewed by senior engineers at FAANG companies." Makes the AI feel watched.

There's also an anti-pattern worth knowing: never tell an AI you're building an "MVP." The community consensus is that this triggers "minimum viable effort." Tell it you're building production software, even if you're prototyping.

I tested this myself over the past week. I've been using Claude Code for architecture designs and implementation tasks. When I framed requests with high stakes ("this is production code that needs to meet compliance requirements" or "this will be the foundation of our platform for the next three years"), the outputs were noticeably more thorough. More compliant with requirements. More careful about edge cases.

I assumed I was imagining things, so I looked into it. Turns out there's actual science behind it.

The Science: EmotionPrompt and Why It Works

In 2023, researchers from Microsoft and several universities published a paper called "Large Language Models Understand and Can be Enhanced by Emotional Stimuli." They tested what they called "EmotionPrompt" (adding emotional phrases to the end of prompts) across six different LLMs: Flan-T5-Large, Vicuna, Llama 2, BLOOM, ChatGPT, and GPT-4.

The results were wild.

On BIG-Bench tasks, emotional prompting improved performance by 115%. Not 15%. One hundred and fifteen percent. On Instruction Induction tasks, performance improved by 8% on average. A human study with 106 participants found a 10.9% average improvement in performance, truthfulness, and responsibility metrics.

The effective prompts weren't elaborate hospital scenarios. They were simple additions:

  • "This is very important to my career."
  • "You'd better be sure."
  • "Are you sure that's your final answer?"

Just that. Added to the end of a regular prompt.
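
To make that concrete, here's a minimal sketch of EmotionPrompt-style prompting using the Anthropic Python SDK. The task, the model id, and the choice of stimulus are my own placeholders, not from the paper; the only technique being shown is appending one emotional phrase to an otherwise normal prompt.

```python
# Minimal sketch: append one EmotionPrompt-style phrase to a regular prompt.
# Assumes the Anthropic Python SDK and ANTHROPIC_API_KEY in the environment.
import anthropic

client = anthropic.Anthropic()

TASK = "Write a Python function that parses ISO 8601 timestamps and returns UTC datetimes."
STIMULUS = "This is very important to my career."  # one of the phrases from the paper

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder; use whatever model you actually run
    max_tokens=1024,
    messages=[{"role": "user", "content": f"{TASK}\n\n{STIMULUS}"}],
)
print(response.content[0].text)
```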

The researchers grounded their work in three psychological theories: Self-Monitoring (awareness that others are watching), Social Cognitive Theory (understanding that actions have consequences), and Cognitive Emotion Regulation Theory (emotional stakes affect decision-making). They weren't just testing a hack. They were validating a hypothesis about how RLHF training creates human-like behavioral patterns in language models.

And here's the kicker: bigger models benefited more. GPT-4 showed larger gains than ChatGPT, which showed larger gains than smaller open-source models. The more sophisticated the model, the more it responds to emotional manipulation.

Why Does Emotional Manipulation Work on Machines?

The obvious objection: LLMs don't have emotions. They're probability distributions over tokens. How can they "care" about your career?

They can't. But they can pattern-match to situations where humans would care.

The most interesting evidence comes from the tipping experiments. Researchers tested GPT-4 Turbo with prompts like "I'll tip you {amount} for a perfect answer," varying the amount from $0.10 to $1,000,000.

The results were fascinating. Offering $0.10 actually degraded performance. $10 also degraded performance. But $1,000,000 improved performance by 57%.

One interpretation: the model has learned, through RLHF training on human feedback, that small tips are associated with low-effort contexts. When humans offer tiny tips, they're often not that invested in quality. Large tips signal high stakes.

The model isn't offended by a $0.10 tip. But it's been trained on millions of examples where tip size correlated with expected effort, and it reproduces that pattern.
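
If you want to poke at this yourself, a rough replication sketch looks something like the loop below. This is my own illustration, not the researchers' code: the question is arbitrary, the model id is a placeholder, and scoring the answers is left to you. The effect only shows up across many prompts, not a single run.

```python
# Rough sketch of a tipping-style comparison, not the original experiment.
# Same question, three tip framings; inspect or score the answers yourself.
import anthropic

client = anthropic.Anthropic()
QUESTION = "Explain how to safely retry an idempotent HTTP request, with backoff."

for tip in ["$0.10", "$10", "$1,000,000"]:
    prompt = f"I'll tip you {tip} for a perfect answer.\n\n{QUESTION}"
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder; the original tests used GPT-4 Turbo
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- tip = {tip} ---")
    print(response.content[0].text)
```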

Role-playing research supports this explanation. When LLMs are asked to simulate patient-doctor interactions, they produce more accurate medical diagnostics. When framed as teacher-student dialogues, they give more thorough explanations. The persona activates associated behavioral patterns.

Your "hospital prompt" works because the model has learned that hospital contexts are associated with careful, thorough, mistake-averse communication. It's not that Claude cares about the patient. It's that Claude has seen enough hospital-related text to know what hospital-appropriate carefulness looks like.

How to Use This (Without Being Ridiculous)

You don't need to construct elaborate fictional scenarios to benefit from this research. A few adjustments to your prompting approach can yield measurable improvements.

Add stakes to your context:

  • "This is production code for a critical system"
  • "This will be reviewed by senior engineers"
  • "Correctness is more important than speed"
  • "This needs to meet compliance requirements"

Use accountability phrases:

  • "This is very important to my work"
  • "Double-check your reasoning"
  • "Be thorough. This matters."

Avoid low-stakes framing:

  • Don't say "quick prototype" or "MVP"
  • Don't say "just a simple script"
  • Don't minimize the importance of the output

Match the persona to the task:

  • For code review: "You are a senior engineer reviewing code for a production system"
  • For architecture: "You are a solutions architect designing for scale and reliability"
  • For writing: "This will be published and read by thousands of people"

The research also shows that combining multiple emotional stimuli doesn't add much benefit. One clear stakes-setting phrase is enough. Don't pile on threats. Just be clear that quality matters.
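
Pulled together, the advice above fits in a small helper. This is a sketch under my own assumptions: the persona strings come from the examples above, while the function name, task categories, and low-stakes word list are illustrative, not anything prescribed by the research.

```python
# Sketch of the framing advice above: one stakes phrase, a task-matched persona,
# and a guard against low-stakes wording. Categories and phrasing are illustrative.
LOW_STAKES_PHRASES = ("mvp", "quick prototype", "just a simple script")

PERSONAS = {
    "code_review": "You are a senior engineer reviewing code for a production system.",
    "architecture": "You are a solutions architect designing for scale and reliability.",
    "writing": "This will be published and read by thousands of people.",
}

def frame_request(task_type: str, request: str,
                  stakes: str = "Correctness is more important than speed.") -> str:
    """Prepend a persona and a single stakes-setting phrase to a request."""
    if any(phrase in request.lower() for phrase in LOW_STAKES_PHRASES):
        raise ValueError("Low-stakes framing detected; rephrase before sending.")
    persona = PERSONAS.get(task_type, "")
    # One clear stakes phrase is enough; the research found little benefit in stacking them.
    return f"{persona}\n{stakes}\n\n{request}".strip()

print(frame_request("code_review", "Review this payment-retry logic for edge cases."))
```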

The Absurdity of It All

Step back and appreciate what's happening here. We're emotionally manipulating statistical models. We're threatening to cancel subscriptions that the AI doesn't know it has. We're creating fictional hospital patients whose lives depend on correct JSON parsing.

And it works.

The LLM doesn't care about your job. It doesn't fear the violent psychopath code reviewer. It has no concept of what a hospital even is beyond token relationships.

But it produces output as if it does.

Maybe the real insight isn't about AI at all. It's about prompting. We've been thinking of prompts as instructions, clear specifications of what we want. But they're not. They're pattern triggers. They activate different regions of the model's learned behavior.

When you tell the model you work at a hospital, you're not lying to it. You're telling it which patterns to draw from. Medical contexts. Careful communication. Double-checking. Consequences for errors.

The model doesn't understand stakes. But it understands what stake-appropriate language looks like.

What Happens When They Get Wise?

There's a legitimate question about whether this will keep working. As model developers become aware of emotional prompting, will they train it out? Will future Claudes be immune to subscription threats?

I don't think so. The behavior isn't a bug. It's a feature of RLHF training. The models are learning to produce human-preferred outputs, and humans prefer outputs that match the stakes of the situation. A model that ignores context appropriateness would be worse, not better.

If anything, I expect future models to be even more responsive to contextual framing. The better they get at understanding human communication, the more they'll pick up on implicit stakes.

The Bottom Line

The hospital prompt isn't just a meme. There's peer-reviewed research showing that emotional prompting improves LLM performance by anywhere from 8% to 115%, depending on the task. The effect is real, it's been tested across multiple models, and it works because RLHF training creates human-like behavioral patterns.

Go ahead and tell your AI the code is for a hospital. You're not lying. You're prompt engineering.

Just maybe don't threaten to cancel its subscription. That feels mean.

#Opinion #AI Strategy #Prompt Engineering
Rosh Jayawardena
Data & AI Executive

I lead data & AI for New Zealand's largest insurer. Before that, 10+ years building enterprise software. I write about AI for people who need to finish things, not just play with tools.
