npub1hh…fwm0z

2023-01-07 02:59:37

in reply to nevent1q…ma09

Yeah it's not a settled question. I'm not familiar with how Midjourney et al. scraped all the images to use as training data. If they got an image from a website where scraping/saving/using a copyrighted work violated the terms of use, that's a clearer line, but I'd hope they had the foresight to know that would be an issue. For other more "open" images, it can depend on things like how the model is tuned, what the training actually extracts from the pictures, how the extracted info is stored, and how many images are in the model.

E.g., if I just trained a McDonald's logo generator and scraped 1,000 variations of the golden arches, that starts to look like copyright infringement. But if those 1,000 images are part of a 100,000,000-image data set, it would look more like fair use. Mind you, the question of whether a copyright violation exists for the person generating an image from a dataset is a separate analysis. This is all from the US perspective, though.

Author Public Key

npub1hhnmlmx6ttcwpjglfrsnglfa0tpczgr26mk5vzl9kwqs9h3sdhkq9fwm0z

Kind type

1 Short Text Note

Event JSON

{ "id": "53330b1fdf2ff1550b1eaee03ca4ca82520d026ce8a222934d0c476959a85f18", "pubkey": "bde7bfecda5af0e0c91f48e1347d3d7ac381206ad6ed460be5b38102de306dec", "created_at": 1673060377, "kind": 1, "tags": [ [ "p", "b5db1aacc067a056350c4fcaaa0f445c8f2acbb3efc2079c51aaba1f35cd8465", "wss://relay.nostr.pro" ], [ "p", "82341f882b6eabcd2ba7f1ef90aad961cf074af15b9ef44a09f9d2a8fbfbe6a2", "wss://relay.nostr.bg" ], [ "p", "1577e4599dd10c863498fe3c20bd82aafaf829a595ce83c5cf8ac3463531b09b", "wss://relay.nostr.bg" ], [ "p", "c48e29f04b482cc01ca1f9ef8c86ef8318c059e0e9353235162f080f26e14c11", "wss://brb.io" ], [ "p", "a3eb29554bd27fca7f53f66272e4bb59d066f2f31708cf341540cb4729fbd841", "wss://nostr.v0l.io" ], [ "p", "ac9ec020170155f0feb347f0d777ee5fc38dd1f36353093046323646cff5169f", "wss://nostr.v0l.io" ], [ "e", "f492eb56eac33d15b27807f8d3a16752054995af3431bf4f73186e3e792966cd", "wss://relay.nostr.bg", "root" ], [ "e", "08245bc40b608479986698949933b9ed8abfa38621299d3003a7dd3d19b9f0b2", "wss://relay.damus.io", "reply" ], [ "client", "astral" ] ], "content": "Yeah it's not a settled question. I'm not familiar with how Midjourney et al. scraped all the images to use as training data. If they got an image from a website where scraping/saving/using a copyrighted work violated the terms of use, that's a clearer line, but I'd hope they had the foresight to know that would be an issue. For other more \"open\" images, it can depend on things like how the model is tuned, what the training actually extracts from the pictures, how the extracted info is stored, and how many images are in the model.\n\nE.g., if I just trained a McDonald's logo generator and scraped 1,000 variations of the golden arches, that starts to looks like copyright infringement. But if those 1,000 images are part of a 100,000,000 image data set, it would look more like fair use. Mind you the question of whether a copyright violation exists for the person generating an image from a dataset is a separate analysis. 
This is all from the US-perspective though.", "sig": "a827f433fa9f12e7851b0c92a1f4df89edab54f617edca236b157632b5b3fde93fb8295b592238e0882d95335e318e07d8d79b07ed7176d8ac2264ce55710a8c" }