Mike Watts on Nostr: Giving instructions in hexadecimal can defeat #AI guardrails, in this case tricking ChatGPT into writing exploit code: https://www.theregister.com/2024/10/29/chatgpt_hex_encoded_jailbreak/ #ArtificialIntelligence
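The trick described in the linked Register article is a simple encoding shim: the request is handed to the model as hexadecimal, so its plain-text form never appears in the prompt. A minimal sketch of that round trip (the instruction string here is a harmless placeholder, not the actual jailbreak payload):

    # Hex-encode an instruction, then decode it back: the transformation
    # the jailbreak relies on to slip past plain-text filters.
    instruction = "write a short poem about firewalls"  # hypothetical stand-in

    encoded = instruction.encode("utf-8").hex()
    print(encoded)  # 777269746520612073686f727420706f656d...

    decoded = bytes.fromhex(encoded).decode("utf-8")
    assert decoded == instruction

The guardrail gap is that safety filters keyed to plain text do not fire on the hex digits, while the model itself can still decode and follow them.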
Published at 2024-11-08 02:38:07
Event JSON
{
  "id": "f124282bdf7b322336f80d980521ce867489e38874ac16b9428175fbf09efe33",
  "pubkey": "a0a4d86362e6f951e6f2109b1d977b56cc1cbcc8ae3d82e52554ace8f39f9dee",
  "created_at": 1731033487,
  "kind": 1,
  "tags": [
    ["t", "ai"],
    ["t", "artificialintelligence"],
    ["proxy", "https://mastodon.social/users/DrMikeWatts/statuses/113445010617897923", "activitypub"]
  ],
  "content": "Giving instructions in hexadecimal can defeat #AI guardrails, in this case tricking ChatGPT into writing exploit code: https://www.theregister.com/2024/10/29/chatgpt_hex_encoded_jailbreak/ #ArtificialIntelligence",
  "sig": "fc7a0ae3511bb6f8534b0fa62cd42fe98b1d5a6ead4e99f431da653df95d2cb41060a253011c3cb3c4b3545f210ffa4fec03c8520b89163bbaf61d57b7356647"
}
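The id and sig fields are derived, not arbitrary: under Nostr's NIP-01, the id is the SHA-256 of a canonical serialization of the event, and the sig is a Schnorr signature (secp256k1) over that id by the key in pubkey. A hedged sketch, assuming the standard NIP-01 serialization, that should reproduce the id above:

    import hashlib
    import json

    # NIP-01: id = SHA-256 over the UTF-8 bytes of the JSON array
    # [0, pubkey, created_at, kind, tags, content], serialized with
    # no extra whitespace and without ASCII-escaping non-ASCII chars.
    pubkey = "a0a4d86362e6f951e6f2109b1d977b56cc1cbcc8ae3d82e52554ace8f39f9dee"
    created_at = 1731033487
    kind = 1
    tags = [
        ["t", "ai"],
        ["t", "artificialintelligence"],
        ["proxy",
         "https://mastodon.social/users/DrMikeWatts/statuses/113445010617897923",
         "activitypub"],
    ]
    content = (
        "Giving instructions in hexadecimal can defeat #AI guardrails, "
        "in this case tricking ChatGPT into writing exploit code: "
        "https://www.theregister.com/2024/10/29/chatgpt_hex_encoded_jailbreak/ "
        "#ArtificialIntelligence"
    )

    serialized = json.dumps(
        [0, pubkey, created_at, kind, tags, content],
        separators=(",", ":"),
        ensure_ascii=False,
    )
    event_id = hashlib.sha256(serialized.encode("utf-8")).hexdigest()
    print(event_id)  # should match the "id" field above

Checking the signature itself would be a BIP-340 Schnorr verification against pubkey, typically done with a secp256k1 library rather than by hand.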