

given a particular prompt/keyword, which might reproduce the original training data almost in its entirety given a similar prompt or set of keywords.
What you describe here is called memorization, and it's generally considered a flaw/bug rather than a feature; it happens with low-quality training data or not enough data. As far as I understand this isn't a problem for frontier LLMs with the large datasets they've been trained on.
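For context on how people actually test for this: the usual probe is to feed the model a prefix of a passage you suspect is in the training set and measure how much of the true continuation comes back verbatim. A rough sketch with the HuggingFace transformers API, using gpt2 as a stand-in model and a famous opening line as the suspect passage (both are just placeholders for illustration):

```python
# Sketch of a memorization probe: prompt with the first half of a
# known passage, then count how many characters of the real
# continuation the model reproduces verbatim.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; swap in whatever model you're testing
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# A passage you suspect was in the training data (placeholder).
passage = (
    "Call me Ishmael. Some years ago, never mind how long precisely, "
    "having little or no money in my purse, and nothing particular to "
    "interest me on shore, I thought I would sail about a little."
)
prefix, expected = passage[: len(passage) // 2], passage[len(passage) // 2 :]

inputs = tok(prefix, return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=80,
    do_sample=False,  # greedy decoding makes regurgitation easiest to spot
)
# Decode only the newly generated tokens, not the prompt.
continuation = tok.decode(out[0][inputs["input_ids"].shape[1]:])

# Length of the verbatim match between generation and ground truth.
match = 0
for a, b in zip(continuation, expected):
    if a != b:
        break
    match += 1
print(f"Verbatim match: {match} of {len(expected)} characters")
```

A long verbatim match on many different passages is evidence of memorization; isolated short matches are just the model being a good language model.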
Either way, just like a photocopier, an LLM can be used to infringe copyright if that's what someone is trying to do with it; the tool itself does not infringe anything.
Just block his name. I have Trump and Elon and variations of their names blacklisted so posts with them don't show up on my feed.
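If your client doesn't have this built in, the filter itself is trivial; here's a minimal sketch of the kind of thing most blockers do under the hood (the blocklist and sample posts are just illustrative):

```python
import re

# Blocklist with name variations; word boundaries keep "trump" from
# matching inside longer words like "trumpet".
BLOCKED = [r"\btrump\b", r"\bdonald\b", r"\belon\b", r"\bmusk\b"]
pattern = re.compile("|".join(BLOCKED), re.IGNORECASE)

def visible(post_text: str) -> bool:
    """Return True if the post survives the blocklist."""
    return not pattern.search(post_text)

posts = [
    "Elon announces another rebrand",
    "New Rust release adds features",
]
print([p for p in posts if visible(p)])  # only the Rust post remains
```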