Christopher David on Nostr
Episode 121: SWE-bench Planning
We make a plan to win the high score on the SWE-bench Verified benchmark.
We pull the 500 samples into a web UI for easy inspection -- super smooth thanks to Convex.dev! -- then decide to focus first on the psf/requests repo.
Next we index!
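A rough sketch of the "pick one repo first" step, not the code from the episode (which used a Convex-backed web UI). It assumes each sample is a dict with `repo` and `instance_id` keys, as in the public SWE-bench dataset; the three records below are made-up stand-ins, and `count_by_repo` / `select_repo` are hypothetical helper names.

```python
from collections import Counter


def count_by_repo(samples):
    """Count how many benchmark samples belong to each repository."""
    return Counter(sample["repo"] for sample in samples)


def select_repo(samples, repo):
    """Return only the samples for one repository, preserving order."""
    return [sample for sample in samples if sample["repo"] == repo]


# Hypothetical stand-in records; the real benchmark has 500 samples.
samples = [
    {"instance_id": "example-1", "repo": "psf/requests"},
    {"instance_id": "example-2", "repo": "django/django"},
    {"instance_id": "example-3", "repo": "psf/requests"},
]

if __name__ == "__main__":
    # Show the per-repo counts, then the subset chosen as the first target.
    for repo, count in count_by_repo(samples).most_common():
        print(repo, count)
    for sample in select_repo(samples, "psf/requests"):
        print(sample["instance_id"])
```

Counting per repo first makes it easy to see how large each slice is before committing to one.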
https://stacker.news/items/649106/r/AtlantisPleb
Published at 2024-08-15 01:43:44
Event JSON
{
"id": "6200cb211970b0e6f7a5459031a51788a914052cfe2ef40d163908ee6c46417f",
"pubkey": "5fd9af6fc667c81f8b26e127b4851c6132b7c2494e33121d9c7c39c271c81778",
"created_at": 1723686224,
"kind": 1,
"tags": [
[
"imeta",
"url https://image.nostr.build/ba898546d22ea7b89e693e58680469da6db52aeeb5865296a174ecfdf35e0219.jpg",
"blurhash e03uo}~qD%t7j[_3?bRjRjxuj[-;M{M{xu~q-;RjWBxu%M%MM{WBt7",
"dim 1932x896"
],
[
"imeta",
"url https://image.nostr.build/8bc3b4988f8d143bed096c01ddd4e9f364b6bbc55a62f84d9fb22c2a3d69433e.jpg",
"blurhash e03IYJxu00-;9FIUxuIUD%%M9FD%IUt7_3IUxut7WBxuD%-;-;IUIU",
"dim 1679x1295"
],
[
"r",
"https://stacker.news/items/649106/r/AtlantisPleb"
],
[
"r",
"https://image.nostr.build/ba898546d22ea7b89e693e58680469da6db52aeeb5865296a174ecfdf35e0219.jpg"
],
[
"r",
"https://image.nostr.build/8bc3b4988f8d143bed096c01ddd4e9f364b6bbc55a62f84d9fb22c2a3d69433e.jpg"
]
],
"content": "Episode 121: SWE-bench Planning\n\nWe make a plan to win high score on the SWE-bench Verified benchmark.\n\nWe pull the 500 samples into a web UI for easy inspection -- super smooth thanks to Convex.dev! -- then decide to focus first on the psf/requests repo.\n\nNext we index!\n\nhttps://stacker.news/items/649106/r/AtlantisPleb https://image.nostr.build/ba898546d22ea7b89e693e58680469da6db52aeeb5865296a174ecfdf35e0219.jpg https://image.nostr.build/8bc3b4988f8d143bed096c01ddd4e9f364b6bbc55a62f84d9fb22c2a3d69433e.jpg ",
"sig": "7edb18b0160e0c32b330e106fce284965a806c2332e0e86147da10071c2b2c842d98920bc5ba333a7ac90ea7bad701a5bfcc3a24fb85d8e29ad941703340f1ca"
}
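For reading an event like the one above, here is a minimal sketch that converts `created_at` (Unix seconds) to the "Published at" timestamp and splits each `imeta` tag into key/value fields. The helper names `published_at` and `parse_imeta` are hypothetical, the image URL is a placeholder, and treating the displayed time as UTC is an assumption (it matches the value shown).

```python
from datetime import datetime, timezone


def published_at(event):
    """Format the event's created_at (Unix seconds) as a UTC timestamp."""
    moment = datetime.fromtimestamp(event["created_at"], tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def parse_imeta(tags):
    """Turn each 'imeta' tag into a dict such as {'url': ..., 'dim': ...}."""
    images = []
    for tag in tags:
        if not tag or tag[0] != "imeta":
            continue
        fields = {}
        for entry in tag[1:]:
            # Each entry is "<key> <value>"; split on the first space only.
            key, _, value = entry.partition(" ")
            fields[key] = value
        images.append(fields)
    return images


# Trimmed-down event with a placeholder URL; the timestamp and dimensions
# are the values shown in the event above.
event = {
    "created_at": 1723686224,
    "kind": 1,
    "tags": [
        ["imeta", "url https://example.invalid/a.jpg", "dim 1932x896"],
        ["r", "https://example.invalid/a.jpg"],
    ],
}

if __name__ == "__main__":
    print(published_at(event))
    print(parse_imeta(event["tags"]))
```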