There's an even bigger picture than possibly monetizing ecommerce revenue (through... ads?). The biggest impact is that they get to use all the content generated on the Internet to create these search "results" that synthesize information from multiple sources without ever having to share traffic or ad revenue with those content sources. Clever.
This really is an area that needs regulation. You basically have to allow Google to crawl your site if you want a website findable by 95%+ of Americans, so websites really should be able to tell Google how they're allowed to use the scraped data instead of just 'for anything'. Maybe a meta tag would work well.
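To make the meta tag idea concrete, here is a minimal sketch of how a crawler could read such a declaration. The tag name `content-usage` and the values `search-index` and `ai-summary` are invented for illustration and are not an existing standard; whether any crawler would honor them is exactly the open question in this thread.

```python
from html.parser import HTMLParser


class UsagePolicyParser(HTMLParser):
    """Collects allowed uses from a hypothetical <meta name="content-usage"> tag."""

    def __init__(self):
        super().__init__()
        self.allowed = None  # None means the page declared no policy

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("name") or "").lower() == "content-usage":
            content = attrs.get("content") or ""
            # Comma-separated list of permitted uses (hypothetical format)
            self.allowed = {v.strip().lower() for v in content.split(",") if v.strip()}


def is_use_allowed(html, use):
    """Return True if the page permits the given use."""
    parser = UsagePolicyParser()
    parser.feed(html)
    # Assumption: a page without the tag places no restriction (today's default)
    return parser.allowed is None or use in parser.allowed


page = '<html><head><meta name="content-usage" content="search-index"></head></html>'
print(is_use_allowed(page, "search-index"))  # True
print(is_use_allowed(page, "ai-summary"))    # False
```

The sketch treats a missing tag as "anything goes", which mirrors the current situation; a stricter proposal could flip that default.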
> websites really should be able to tell google how they're allowed to use the scraped data
Isn't it a bit more complicated than that though? While you certainly aren't entitled to republish things, you (i.e. anyone and everyone) have traditionally been free to consume public material in whatever way you see fit. The precedent from the recent LinkedIn case regarding scraping supports this.
Also you focus on Google, but anyone with sufficient resources can scrape the public web (anti-bot cat and mouse games notwithstanding I suppose).
(Paywalled sites that allow Google to scrape them for search indexing purposes are an interesting edge case though.)
I'm not proposing a link tax. I'm just saying that websites should be able to opt out of having their content used for some Google products, rather than having to give Google a very permissive license just to transact business online and be found in search results.
I am sure your left-field question has a fancy term in the encyclopedia of fallacies, but are you treating Google's crawlers and AI as equivalent to humans?
Even if you got an answer to your question, how would it affect the parent's assertion?
Should the internet provider of the person reading the blog post get royalties for the book the person wrote? Should you be paid for your snarky question, as the generator of content?
My point is that we shouldn't extend IP protections to general knowledge gleaned from public material. If we start overreaching into that for AI, there would be little to stop the same overreach for humans.
There is a reasonable risk that future research material will not be publicly available under the traditional copyright system. Effectively, an "uber" Google could leverage all previous content to answer your query, severely limiting the utility of producing any new content or maintaining accessibility to pre-existing content.
If you break the distribution model of content producers there won't be any new content, and the old content will simply rot away.
Record disks broke the distribution model of taverns and pubs.
Radio broke the distribution model of record disks.
YouTube broke the distribution model of old content producers.
People still make music and more content than ever before. People are not gonna suddenly stop generating knowledge because of Wikipedia. The prediction that "there won't be any new content" is hyperbolic and just plain wrong as demonstrated by history.
When the content people create is pushed aside and instead used by a giant corp. without any payment, it is safe to assume that people will stop writing content.
Books are typically paid for indeed (even if you get them through a library etc.).
Blog posts are an edge case, but the rise of paywalled sites indicates that more authors think they should be paid for their work.
The interesting point here is: can a non-free site let Google crawl it in order to be searchable, while not allowing them to exploit their content otherwise?
The site can ask for such a setup, but Google is very much in a position to refuse and just not index that site, unless it's one of, say, the top 1000 sites.
I don't see any legal ground for such a limitation, so I can't imagine what e.g. eBay or Walmart would put in their suit if they tried to impose such a limitation and then saw Google not honor it. Maybe someone with a legal background could comment.
There isn't currently any legal ground for that (in the US). The context of this thread, though, was to propose that such grounds ought to be introduced by the legislature.
I can see the rationale but at some point Google needs to make money on what it contributes to the picture.
On one hand, Google is in large part responsible for the idea that they’re just a free service and anyone can use them without payment, so they’ve trained users to in essence think of them as a non-profit utility which they’re obviously not.
On the other hand, how do you allocate Google’s value when no-one has ever paid for it or been asked to pay for it?
You can legislate Google into the ground (unlikely, but theoretically possible), but what would replace it? Any other similar service operating at their scale would need to make money too.
I don't know if this is actually true or not but I suspect that a big part of their thinking is that "we are just presenting 'facts' and facts are not subject to copyright laws".
Forgive the analogy but that sounds like a parasitic relationship, and one that might kill off, or at least impoverish, its host. Even if Google isn't doing that, the potential exists. The counter is a paywall, I suppose.