Hacker News

I've been thinking the same.

I'm still tempted to post things online for my own reference. And I believe we benefit from increasing the common wealth of information online.

But the fact that my own posts are feeding the thing that will cause massive redundancies in my industry, ultimately to the detriment of my financial worth, is giving me pause for thought.

Because ChatGPT isn't just a way to share the information I put online. It'll eventually, in a year I'd guess, replace me and my colleagues; it'll be regurgitating fully formed projects, and will probably learn to do the ancillary activities.



I believe it's completely delusional to expect the court system will regard "training an AI system and distilling millions of works into a commercial model" as anything other than creating a derived work based on those works, an act that is not covered by any traditional forms of fair use and therefore illegal unless explicitly authorized by each of the millions of authors of those works.

We are living through the early Napster years when people were saying with a straight face that "everything has changed now in regards to intellectual property" and "information [specifically, music in mp3 format] wants to be free". In Morgan Freeman's narration, they were about to find that little had, in fact, changed.


Most of the raw material used is available for free with no copyright on the internet, though. I have no desire to charge for my posts that are mainly for my own reference. Nor do the people who are answering on stackoverflow.

But when all those are harvested to cause mass redundancies, there's a difference. Still, at least with those two cases above, essentially reference material and technical answers, I doubt there's a copyright problem.

What could regulators even do? Force the model owners to regurgitate source material references? Force the model owners to pay a small fee per post used? Difficult when most of the material is freely available online with ads.

Besides, a lot of the material will come from third parties who have scraped the posts. A whole branch of regulation could appear to try to track the source material which has been put online for free. Or even scraped and then put online again sans copyright to be rescraped without the need to pay anyone.

Maybe we can ask ChatGPT to search its databases for its own sources to its own answers. And even check the source is the original. Problem solved I guess... Feed the overbrain and it'll throw you some pennies while it's used to do you out of a job.


There is no need to invent new regulatory methods when the one we have is perfectly sufficient: if somebody can prove your AI model is tainted with unlicensed works, regardless of how you acquired them, then your model as a whole is an infringing work and the affected party can sue you for damages far exceeding the pennies of utility you gained from the action. Your stock price tanks, your corporate customers cease to purchase your models and you go bankrupt.

It's exactly like the current regime of copyright where I could, in principle, copy paste a file from the Linux kernel and compile it into my binary application, and nobody would know. How much would a single file from a work with tens of thousands of contributors possibly be worth, right? Wrong, it takes a single disgruntled employee (which you are guaranteed to have when you exceed a headcount of roughly 5) to destroy your business and product. The only possible way to avoid this is to train on either public/open sources or get positive authorization for each and every file you slurp for the specific use of AI training, which you definitely won't get for pennies.

As for the inevitable dominance of our AI overbrains fed on open source information, I, for one, welcome them. The cat is out of the bag; it's not like we can return to the previous state of affairs. The problem, as always, becomes a political one: how to distribute the fruits of these new technical capabilities to the (human) citizens.


Isn't there a similar problem with Spotify? That's not been solved via political means. The artists are getting shafted. Hard to think the same won't be true for the unpaid feeders of ChatGPT.


I think in the particular case of music the market is saturated by an abundance of human-generated content, not machine-produced.

Music is one of those human endeavors where it is very desirable to succeed, to the point where many people are willing to do it for free for love of the craft. I can't see the immense profits Spotify allegedly makes by exploiting the artists.

Perhaps we need to accept that the artists are not so much getting shafted, but simply that the age of the superstar is over and most music will be free.



