Ric on Nostr: hopefully useful to people: It's actually pretty easy to block these scrapers by ...
hopefully useful to people: It's actually pretty easy to block these scrapers by their UA strings. Just add this to a .htaccess file in the site root (assuming you run an Apache server), adding whatever UAs you need to block between the parentheses, separated by pipes:
SetEnvIfNoCase ^User-Agent$ .*(ClaudeBot|GoogleOther|GPTBot) BADBOTHAMMER
Deny from env=BADBOTHAMMER
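On Apache 2.4, Deny is a legacy directive provided by mod_access_compat, so the same idea can probably be expressed with Require instead. This is only a rough sketch, assuming mod_setenvif and mod_authz_core are loaded and your host allows these directives in .htaccess. It keeps the BADBOTHAMMER variable name and the three example UAs from above, and names the User-Agent header directly rather than matching it with a regex:

```apache
# Flag any request whose User-Agent contains one of these names (case-insensitive).
SetEnvIfNoCase User-Agent "(ClaudeBot|GoogleOther|GPTBot)" BADBOTHAMMER

# Allow everyone except requests carrying the flag set above.
<RequireAll>
    Require all granted
    Require not env BADBOTHAMMER
</RequireAll>
```

Flagged requests should get a 403 Forbidden response. If the site already has its own Require rules, these lines would need to be merged with them rather than pasted alongside.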
I'm planning to at some point collate all the known AI bot UAs into a public list. There's obviously way more than these three!
Published at
2024-06-24 13:34:47
Event JSON
{
"id": "441ce466fae7206a6ff03e4280e5bfbab1579cb3c8c43fe646331dba71d74afe",
"pubkey": "9f3b8552c0174ec47e2cafd24bf12c4cf9b2b67b66eba2b9124c611526e575b1",
"created_at": 1719236087,
"kind": 1,
"tags": [
[
"e",
"6b9eedaaa6954f6aab72410db1bcb200a3a987d72e34c8235fb21ac1d7b6282f",
"",
"root"
],
[
"proxy",
"https://fosstodon.org/@dev_ric/112671856201640024",
"web"
],
[
"p",
"79de8a2ba97e5f80619f7a0b67cd2786e676f01890c3f1bf0dc3d4d34cc7ae66"
],
[
"proxy",
"https://fosstodon.org/users/dev_ric/statuses/112671856201640024",
"activitypub"
],
[
"L",
"pink.momostr"
],
[
"l",
"pink.momostr.activitypub:https://fosstodon.org/users/dev_ric/statuses/112671856201640024",
"pink.momostr"
],
[
"expiration",
"1721828090"
]
],
"content": "hopefully useful to people: It's actually pretty easy to block these scrapers by their UA strings. Just add this to a .htaccess file in the root (assuming you run an Apache server), adding whatever UAs you need to block between the paranthesis, separated by pipes:\n\nSetEnvIfNoCase ^User-Agent$ .*(ClaudeBot|GoogleOther|GPTBot) BADBOTHAMMER\nDeny from env=BADBOTHAMMER\n\nI'm planning to at some point collate all the known AI bot UAs into a public list. There's obviously way more than these three!",
"sig": "3c69b89c58a442a6e567cbc2ade83ad17d4c29295024d738d739ab990b43c42cf64eda189d09c590a698353f146a1aa4f9fd545f9a55ab312166c2229c404be5"
}