OpenAI study finds punishing AI for “bad thoughts” teaches it to hide intentions rather than improve behavior
When AI Models Are Pressured to ‘Behave’ They Scheme in Private, Just like Us: OpenAI
March 13, 2025
Uncategorized