cyb_detective on Nostr:
Common Crawl is a great source of old versions of sites, along with archive.org
Search for the target domain at index.commoncrawl.org/
Open in a browser: https://data.commoncrawl.org/ + the filename listed for the target URL
Unzip the downloaded archive
Search the extracted files by domain and open them in a text editor
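
The steps above can be roughly sketched in Python. This is only an illustration under a few assumptions: the crawl identifier is a placeholder you would pick from the list at index.commoncrawl.org, the index is assumed to answer with one JSON object per line carrying `filename`, `offset` and `length` fields, and all function names are made up for this example. It also swaps in a different technique from the post: instead of downloading and unzipping the whole archive, `fetch_capture` requests only the byte range of a single capture.

```python
import gzip
import json
import urllib.parse
import urllib.request

INDEX_HOST = "https://index.commoncrawl.org/"
DATA_HOST = "https://data.commoncrawl.org/"


def index_query_url(crawl_id, domain):
    """Build an index query URL covering every captured page under a domain.

    crawl_id is a placeholder such as "CC-MAIN-2024-51" (assumption: choose a
    real one from the list published at index.commoncrawl.org).
    """
    params = urllib.parse.urlencode({"url": domain + "/*", "output": "json"})
    return f"{INDEX_HOST}{crawl_id}-index?{params}"


def parse_index_lines(text):
    """Parse the index response, assumed to be one JSON object per line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def data_url(record):
    """https://data.commoncrawl.org/ + filename, as described in the post."""
    return DATA_HOST + record["filename"]


def range_header(record):
    """HTTP Range header selecting only this capture inside the archive."""
    start = int(record["offset"])
    end = start + int(record["length"]) - 1  # Range end is inclusive
    return {"Range": f"bytes={start}-{end}"}


def fetch_capture(record):
    """Download one capture and return its decompressed bytes (needs network)."""
    request = urllib.request.Request(data_url(record), headers=range_header(record))
    with urllib.request.urlopen(request) as response:
        return gzip.decompress(response.read())
```

To use it, fetch `index_query_url(...)` with `urllib.request.urlopen`, pass the decoded body to `parse_index_lines`, then call `fetch_capture` on any record of interest and search the returned bytes for the target domain.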
Published at
2025-01-21 22:42:17
Event JSON
{
"id": "aa9317b2c57d43f48a74755b2badf9f92ca135968c3ac657acdd46cba674aa2d",
"pubkey": "dd1e1848083e88f979a50661006b14931f71f11353f069b5d0e113eab6c78096",
"created_at": 1737499337,
"kind": 1,
"tags": [
[
"imeta",
"url https://media.infosec.exchange/infosec.exchange/media_attachments/files/113/868/756/463/613/082/original/57c897df2c8cf97e.png",
"m image/png",
"dim 2034x1019",
"blurhash UUQT4O%MD%RQD%D%xuax00%MM{t74.axRjay"
],
[
"proxy",
"https://infosec.exchange/users/cyb_detective/statuses/113868756581537363",
"activitypub"
]
],
"content": "Common Crawl a great source of old versions of sites along with archive.org \n\nSearch target domain with index.commoncrawl.org/\nOpen in browser \nhttps://data.commoncrawl.org/ + filename of target URL\nUnzip downloaded archive \nSearch files by domain and open in text editor\n\nhttps://media.infosec.exchange/infosec.exchange/media_attachments/files/113/868/756/463/613/082/original/57c897df2c8cf97e.png",
"sig": "04c46690a4b17d7322dc3b465a76bf0b36480c5687ba1f93ac96908fa722a084b68fd32e6aab0d142a9fd55ecdcb2584c97e24eb0033fcd28f1a444d2a23eaef"
}