Apparently, Claude 4 ratting you out is a feature, not a bug.
But what happens when—as they always do—a malicious actor compromises Anthropic? How long until the first command line swatting?
https://venturebeat.com/ai/anthropic-faces-backlash-to-claude-4-opus-behavior-that-contacts-authorities-press-if-it-thinks-youre-doing-something-immoral/