michabbb on Nostr:
#Video #Scraping: Extracting Data from Screen Captures Using AI
Innovative technique for data extraction demonstrated:
• 35-second screen recording of #Gmail account used as source
• #GoogleGemini employed to extract JSON data from video
• Cost: Less than $0.001 using Gemini 1.5 Flash model
• Bypasses website authentication and anti-scraping measures
• Potential applications in #DataJournalism and protected data extraction
• #LLM pricing calculator tool included, built with #Claude3
Key takeaway: Video scraping offers a powerful, low-cost method for extracting data from sources that are typically difficult to access programmatically.
Learn more about this technique and its implications:
https://simonwillison.net/2024/Oct/17/video-scraping/?s=09
#Google #Gemini #ai #llm
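The sub-$0.001 cost quoted above follows from per-token pricing. Here is a minimal sketch of that arithmetic. The token counts and per-million-token prices are illustrative placeholders, not figures taken from the post or from a current price list, and the function name `estimate_cost_usd` is hypothetical.

```python
def estimate_cost_usd(input_tokens, output_tokens,
                      input_price_per_million, output_price_per_million):
    """Estimate the cost in USD of a single model call from its token counts."""
    input_cost = input_tokens * input_price_per_million / 1_000_000
    output_cost = output_tokens * output_price_per_million / 1_000_000
    return input_cost + output_cost


# Assumed, illustrative values only: a short video prompt of 10,000 input
# tokens, a 500-token JSON reply, and placeholder prices per million tokens.
cost = estimate_cost_usd(
    input_tokens=10_000,
    output_tokens=500,
    input_price_per_million=0.075,
    output_price_per_million=0.30,
)
print(f"${cost:.6f}")  # prints $0.000900
```

With these placeholder numbers the input tokens dominate the bill, which is why a short recording can plausibly stay under a tenth of a cent.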
Published at
2024-10-17 21:45:10
Event JSON
{
"id": "abeabc14c7049dd79626a7c5652bf323ce7c99f097f4008205c20ba8033caa78",
"pubkey": "129f83898c7008d335771fe681ecf979e7767ad958c552ff85de962ba2f775be",
"created_at": 1729201510,
"kind": 1,
"tags": [
[
"t",
"video"
],
[
"t",
"scraping"
],
[
"t",
"gmail"
],
[
"t",
"googlegemini"
],
[
"t",
"datajournalism"
],
[
"t",
"llm"
],
[
"t",
"claude3"
],
[
"t",
"google"
],
[
"t",
"gemini"
],
[
"t",
"ai"
],
[
"proxy",
"https://social.vivaldi.net/users/michabbb/statuses/113324950159203951",
"activitypub"
]
],
"content": "#Video #Scraping: Extracting Data from Screen Captures Using AI\n\nInnovative technique for data extraction demonstrated:\n\n• 35-second screen recording of #Gmail account used as source\n• #GoogleGemini employed to extract JSON data from video\n• Cost: Less than $0.001 using Gemini 1.5 Flash model\n• Bypasses website authentication and anti-scraping measures\n• Potential applications in #DataJournalism and protected data extraction\n• #LLM pricing calculator tool included, built with #Claude3\n\nKey takeaway: Video scraping offers a powerful, low-cost method for extracting data from sources that are typically difficult to access programmatically.\n\nLearn more about this technique and its implications:\nhttps://simonwillison.net/2024/Oct/17/video-scraping/?s=09\n\n#Google #Gemini #ai #llm",
"sig": "7e391cf26ed7d9b91882b44d889426c51a44a1b690e9b88d17f010754b31c8b00ed359b363945771522922930b417637b47ceed1d7d3aa8c540e12611e29ad80"
}
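The event fields above can be read with a few lines of standard-library code. This is a minimal sketch, assuming the event has already been parsed into a dictionary. Only the fields used here are reproduced, the tag list is shortened, and the helper names `published_at` and `hashtags` are hypothetical.

```python
from datetime import datetime, timezone

# Shortened copy of the event above: only the fields this sketch needs.
event = {
    "created_at": 1729201510,
    "kind": 1,
    "tags": [
        ["t", "video"],
        ["t", "scraping"],
        ["t", "gmail"],
        ["proxy",
         "https://social.vivaldi.net/users/michabbb/statuses/113324950159203951",
         "activitypub"],
    ],
}


def published_at(ev):
    """Convert the Unix timestamp in `created_at` to a UTC datetime."""
    return datetime.fromtimestamp(ev["created_at"], tz=timezone.utc)


def hashtags(ev):
    """Collect the values of all "t" (hashtag) tags, in order."""
    return [tag[1] for tag in ev["tags"] if len(tag) >= 2 and tag[0] == "t"]


print(published_at(event).strftime("%Y-%m-%d %H:%M:%S"))  # 2024-10-17 21:45:10
print(hashtags(event))  # ['video', 'scraping', 'gmail']
```

The converted timestamp matches the "Published at" value shown above, which suggests that the page displays the time in UTC.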