• fluxion@lemmy.world
    2 days ago

    Yes, weights for individual words/phrases/tokens which, given a particular prompt or set of keywords, might reproduce the original training data almost in its entirety. Hence why it is so obvious when these models have been trained on copyrighted material.

    Similarly, I don’t digitally store music in my head verbatim; I store some fuzzy version that I can still reproduce fairly closely when prompted, and I’d still get sued if I charged money for performing or recording it, because the “weightings” in my neurons are just an implementation detail of how my brain works, not some active or purposeful attempt to transform the music in any appreciable way.

    • Zetta@mander.xyz
      2 days ago

      given a particular prompt or set of keywords, might reproduce the original training data almost in its entirety.

      What you describe here is called memorization, and it is generally considered a flaw or bug rather than a feature; it tends to happen with low-quality training data or not enough data. As far as I understand, this isn’t much of a problem for frontier LLMs given the large datasets they’ve been trained on.
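As a toy sketch of what memorization looks like (nothing to do with how real frontier models are built), here is a bigram model trained on a single tiny sentence. With so little data, every word has exactly one possible continuation, so generation can only replay the training text verbatim:

```python
from collections import defaultdict

# Toy illustration of memorization: a bigram "model" trained on a tiny
# corpus. With this little data there is only one continuation for each
# word, so generation reproduces the training text verbatim.
corpus = "the quick brown fox jumps over a lazy dog".split()

# "Training": count which word follows each word.
follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(prompt: str, length: int) -> str:
    out = [prompt]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:
            break  # no known continuation; stop
        # Greedy decoding: pick the most common continuation.
        out.append(max(set(options), key=options.count))
    return " ".join(out)

print(generate("the", 20))
# prints: the quick brown fox jumps over a lazy dog
```

With a large, diverse corpus each word would have many competing continuations and the counts would smooth out, which is the intuition behind why memorization shows up most with small or repetitive training sets.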

      Either way, just like a photocopier, an LLM can be used to infringe copyright if that’s what someone is trying to do with it; the tool itself doesn’t infringe anything.