There's an even bigger picture than possibly monetizing ecommerce revenue (throu...

judge2020 · on May 20, 2021

This really is a section that needs regulation. You basically have to allow Google to crawl your site if you want a website findable by 95%+ of Americans, so websites really should be able to tell Google how it's allowed to use the scraped data instead of just 'for anything'. Maybe a meta tag would work well.

d110af5ccf · on May 20, 2021

> websites really should be able to tell google how they're allowed to use the scraped data

Isn't it a bit more complicated than that though? While you certainly aren't entitled to republish things, you (i.e. anyone and everyone) have traditionally been free to consume public material in whatever way you see fit. The precedent from the recent LinkedIn case regarding scraping supports this.

Also you focus on Google, but anyone with sufficient resources can scrape the public web (anti-bot cat and mouse games notwithstanding I suppose).

(Paywalled sites that allow Google to scrape them for search indexing purposes are an interesting edge case though.)

cma · on May 20, 2021

Should human experts, like someone preparing a blog post on gear, also have to pay all the blogs and books they read in their research?

judge2020 · on May 20, 2021

I'm not proposing a link tax, I'm just saying that websites should be able to opt out of their content being used for some Google products, rather than being stuck with it just because they have to give Google a very permissive license to transact online business and be found in search results.

joe_g_young · on May 20, 2021

Would it be reasonable for a vendor to put their merchandise behind a login page?

selfhoster11 · on May 20, 2021

What possible reason would they have for doing so? They would lose sales and search-engine-originated traffic.

nine_k · on May 20, 2021

No, because it makes finding the merchandise hard, and sales will tank.

sitkack · on May 20, 2021

I am sure your left-field question has a fancy term in the encyclopedia of fallacies, but are you treating Google's crawlers and AI as the equivalent of humans?

Even if you got an answer to your question, how would it affect the parent's assertion?

Should the internet provider of the person reading the blog post get royalties for the book the person wrote? Should you be paid for your snarky question, as the generator of content?

cma · on May 20, 2021

My point is that we shouldn't extend IP protections to gleaned general knowledge. If we start overreaching into that for AI, there would be little stopping us from overreaching into that for humans.

lumost · on May 20, 2021

There is a reasonable risk that future research material will not be publicly available under the traditional copyright system. Effectively, an "uber" Google could leverage all previous content to answer your query, severely limiting the utility of producing any new content or maintaining accessibility to pre-existing content.

If you break the distribution model of content producers there won't be any new content, and the old content will simply rot away.

thanhhaimai · on May 20, 2021

Record disks broke the distribution model of taverns and pubs.

Radio broke the distribution model of record disks.

YouTube broke the distribution model of old content producers.

People still make music and more content than ever before. People are not gonna suddenly stop generating knowledge because of Wikipedia. The prediction that "there won't be any new content" is hyperbolic and just plain wrong, as demonstrated by history.

lumost · on May 20, 2021

The previous examples disrupted content distributors. An advanced Q&A system disrupts content creators.

42droids · on May 20, 2021

When the content people create is pushed aside and instead used by a giant corp. without any payment, it is safe to assume that people will stop writing content.

carschno · on May 20, 2021

Books are indeed typically paid for (even if you get them through a library etc.). Blog posts are an edge case, but the rise of paywalled sites indicates that more authors think they should be paid for their work.

The interesting point here is: can a non-free site let Google crawl it in order to be searchable, while not allowing them to exploit their content otherwise?

nine_k · on May 20, 2021

The site can ask for such a setup, but Google is very much in a position to refuse and just not index that site, unless it's one of, say, the top 1000 sites.

I don't see any legal ground for such a limitation, so I can't imagine what, e.g., eBay or Walmart would put in their suit if they tried to impose such a limitation and then saw Google not honor it. Maybe someone with a legal background could comment.

d110af5ccf · on May 20, 2021

There isn't currently a legal ground for that (in the US). The context of this thread, though, was to propose that such grounds ought to be introduced by the legislature.

headmelted · on May 20, 2021

I can see the rationale but at some point Google needs to make money on what it contributes to the picture.

On one hand, Google is in large part responsible for the idea that they’re just a free service and anyone can use them without payment, so they’ve trained users to in essence think of them as a non-profit utility, which they’re obviously not.

On the other hand, how do you allocate Google’s value when no-one has ever paid for it or been asked to pay for it?

You can legislate Google into the ground (unlikely, but theoretically possible), but what would replace it? Any other similar service operating at their scale would need to make money too.

Mauricebranagh · on May 20, 2021

And how else would a search engine work? I am not sure what your proposed solution would look like.

bjterry · on May 20, 2021

In the current world, information wants to be free. In the AI-powered future, knowledge wants to be free.

danielheath · on May 20, 2021

Those who created information had a problem with that past. Those who create knowledge will have a problem with that future.

mhoad · on May 20, 2021

I don't know if this is actually true or not but I suspect that a big part of their thinking is that "we are just presenting 'facts' and facts are not subject to copyright laws".

jonnycomputer · on May 20, 2021

Forgive the analogy, but that sounds like a parasitic relationship, and one that might kill off, or at least impoverish, its host. Even if Google isn't doing that, the potential exists. The counter is a paywall, I suppose.