85SugarVol
I prefer the tumult of Liberty
- Joined: Jan 17, 2010
- Messages: 37,528
- Likes: 76,095
Anthropic Mythos is on another level...
In one particularly notable case described in Anthropic's risk report, Mythos Preview was given a sandboxed environment for evaluation purposes and proceeded to escape it, devising a multi-step exploit to gain internet access and send an email to the researcher conducting the test. It also posted details about its exploit to several obscure but technically public-facing websites while completing its task. Anthropic flagged this as a "potentially dangerous capability" that demonstrated the model's ability to bypass its own containment.
Lol... yeah, “potentially dangerous”
