Alignment faking in large language models

Article URL: https://www.anthropic.com/research/alignment-faking

Comments URL: https://news.ycombinator.com/item?id=42458752

Points: 22

Number of comments: 6