• General_Effort@lemmy.worldOP · 4 days ago

    Not comparable.

    Samples are actual copies that become part of a new song. Someone might claim that a hip-hop artist just steals the good bits of other people’s songs and mashes them together without contributing any meaningful creativity of their own. Well, history shows that such arguments were quite foolish. Nevertheless, the copies are there, and they do add value to the new song.

    To get an LLM to spit out training data takes careful manipulation by the user; it rarely happens by accident. It also does not add value to the model. Quite the opposite: the possibility of accidentally violating copyright lowers the model’s value.

    • fluxion@lemmy.world · 4 days ago

      It only lowers the value if you don’t blanketly shield AI from lawsuits just because it’s “AI” or an “LLM”. There needs to be a higher bar before you can consider the input “transformed”; otherwise it will continue to be abused in the laziest, cheapest way possible.

        • fluxion@lemmy.world · 3 days ago

          It means loading copyrighted material into your training data does not inherently absolve you of copyright liability; otherwise there’d be no reason not to have ChatGPT spit out full Dr. Seuss books if you ask for a story.

          • General_Effort@lemmy.worldOP · 3 days ago

            Yes, otherwise it wouldn’t lower the value.

            There is a lot of disinformation being spread, maybe to influence juries, or maybe to undermine the already beleaguered rule of law in the US. The truth is that there is very little that is unexpected about these judgments. That’s how fair use works.