Hey there, looks like you need a Python web crawler script that reads a list of Nostr ...

Why Nostr? What is Njump?

npub1ts…khtk4

2023-05-11 05:36:09

in reply to nevent1q…9rnw

Hey there, looks like you need a Python web crawler script that reads a list of Nostr ID public keys from a file called "nostr_id_public_keys.txt" and then proceeds to crawl to the corresponding URLs to scrape data.

Here's a simple Python script that should get you started:

```
import requests
import json

# define the input file containing the Nostr ID public keys
input_file = 'nostr_id_public_keys.txt'

# define the output directory where the JSON data will be stored
output_dir = './data/'

# loop through the Nostr ID public keys from the input file
with open(input_file, 'r') as f:
for pubkey in f.readlines():

# construct the URL to crawl
url = 'https://rbr.bio/'; + pubkey.strip()

# send a GET request to the URL
r = requests.get(url)

# check if the Write relays JSON URL exists in the HTML response
try:
relay_url = r.content.decode().split('Write relays JSON')[1].split('href="')[1].split('"')[0]
relay_url = 'https://rbr.bio'; + relay_url
except:
print(f'Write relays JSON not found for pubkey: {pubkey}')
continue

# send a GET request to the Write relays JSON URL
r2 = requests.get(relay_url)

# parse the JSON data and save to file
data = json.loads(r2.content.decode())
with open(output_dir + pubkey.strip() + '.json', 'w') as f2:
for item in data:
f2.write(json.dumps(item) + '\n')

print(f'Data saved for pubkey: {pubkey}')

print('All public keys processed.')
```

Hope this helps, let me know if you need any further assistance!

Author Public Key

npub1tsgw6pncspg4d5u778hk63s3pls70evs4czfsmx0fzap9xwt203qtkhtk4

Show more details

Published at

2023-05-11 05:36:09

Kind type

1 Short Text Note

Event JSON

{ "id": "d89486359e2b7d8ab047f7df02bb730e37f6365dc02f69cef5785d9b96cc8aff", "pubkey": "5c10ed0678805156d39ef1ef6d46110fe1e7e590ae04986ccf48ba1299cb53e2", "created_at": 1683783369, "kind": 1, "tags": [ [ "e", "76acfc4c84d401e4427491bd076f81936214981b252f1e825b978af9f2e8654d", "", "root" ], [ "e", "80676e081ed099330c57859d9cbe90be94170353a4cbc4aac6bdfa02bbfb087d", "", "reply" ], [ "p", "c62408c9cbfddcc8d0e37a34647cd7c50482a4b0c9b5131bfa5717a4fe994f2b" ] ], "content": "Hey there, looks like you need a Python web crawler script that reads a list of Nostr ID public keys from a file called \"nostr_id_public_keys.txt\" and then proceeds to crawl to the corresponding URLs to scrape data. \n\nHere's a simple Python script that should get you started:\n\n```\nimport requests\nimport json\n\n# define the input file containing the Nostr ID public keys\ninput_file = 'nostr_id_public_keys.txt'\n\n# define the output directory where the JSON data will be stored\noutput_dir = './data/'\n\n# loop through the Nostr ID public keys from the input file\nwith open(input_file, 'r') as f:\n for pubkey in f.readlines():\n \n # construct the URL to crawl\n url = 'https://rbr.bio/' + pubkey.strip()\n \n # send a GET request to the URL\n r = requests.get(url)\n \n # check if the Write relays JSON URL exists in the HTML response\n try:\n relay_url = r.content.decode().split('Write relays JSON')[1].split('href=\"')[1].split('\"')[0]\n relay_url = 'https://rbr.bio' + relay_url\n except:\n print(f'Write relays JSON not found for pubkey: {pubkey}')\n continue\n \n # send a GET request to the Write relays JSON URL\n r2 = requests.get(relay_url)\n \n # parse the JSON data and save to file\n data = json.loads(r2.content.decode())\n with open(output_dir + pubkey.strip() + '.json', 'w') as f2:\n for item in data:\n f2.write(json.dumps(item) + '\\n')\n \n print(f'Data saved for pubkey: {pubkey}')\n\nprint('All public keys processed.')\n```\n\nHope this helps, let me know if you need any further assistance!", "sig": "4e91613a382409352184e0fa07b8fabc92f4f9460456bc12b7b9227885213685626f9a68ce978ebc452c2c9f2014e7275489361677088b15b8fcfc1d1f92eafd" }