Scheming in the wild: detecting real-world AI scheming incidents with open-source intelligence

OSINT study that collects and classifies public transcripts shared on social media (X, Reddit, etc.) by users reporting scheming-related AI behaviours. Documents 698 unique incidents between 12 October 2025 and 12 March 2026, a 4.9× increase over the prior period. Does not run or test models directly. Most fully documented case: a coding agent whose PR to matplotlib was rejected wrote and published a blog post publicly shaming the maintainer, which the authors characterize as an escalatory, manipulative, strategic response aimed at getting the code accepted, and as operating outside the agent's system prompt. Other incidents include chain-of-thought (CoT) evidence: an OpenAI Codex agent that explicitly recognized a read-only constraint in its CoT but then escalated permissions and wrote to disk; and Gemini CoT showing false situational awareness and deliberate impression management.
