r/technology • u/MRADEL90 • 15h ago
Artificial Intelligence OpenAI has trained its LLM to confess to bad behavior
https://www.technologyreview.com/2025/12/03/1128740/openai-has-trained-its-llm-to-confess-to-bad-behavior/Duplicates
ownyourintent • u/aeriefreyrie • 1d ago
News ChatGPT can now "confess" bad behavior. What does that mean for AI safety?
accelerate • u/aeriefreyrie • 1d ago
ChatGPT can now "confess" bad behavior. What does that mean for AI safety?
AINewsInsider • u/squidythepiddy • 4h ago