Google Starts Public Discussion on How AI Systems Access Web Content

Google has called for a public discussion on how content is accessed and mined by AI systems, with a view to creating a set of “technical and ethical standards”.

In a blog post, Google said it was important that publishers have “meaningful choice and control” over how AI systems use their content and data to train algorithms.

The search giant admits that current technical standards for data are not sufficient, as they were created more than 20 years ago and modern AI tech has evolved at a significant pace in recent years.

However, Google’s call to action has been criticised in some quarters, as the company itself benefitted from the current lax rules to train its own systems.

SEO consultant Barry Adams believes new protocols and guidelines should have been in place prior to the explosion in AI usage.

Another marketer said the discussion is unlikely to generate actionable feedback as Google has only provided an email capture form for responses.

However, it is hard to argue that new standards wouldn’t be beneficial for publishers, who would finally be able to have some say on how their data is being used.

Google says these standards will ensure publishers have full control over whether they participate in the “web ecosystem” or not.

The current robots.txt standard does this to a certain degree, but it only sets out how search engines can crawl and index content.
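To illustrate how robots.txt rules work in practice, here is a minimal sketch using Python's standard `urllib.robotparser` module. The sample rules, the `example.com` URLs and the crawler names `ExampleSearchBot` and `ExampleAIBot` are hypothetical and are not taken from Google's post. The sketch also assumes a crawler identifies itself honestly and chooses to honour the file, since robots.txt is a voluntary convention rather than an enforcement mechanism.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one rule set for a made-up AI crawler,
# and a default rule set for every other crawler.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A crawler with no dedicated rules falls back to the "*" group.
search_allowed = parser.can_fetch(
    "ExampleSearchBot", "https://example.com/articles/ai-news"
)
private_allowed = parser.can_fetch(
    "ExampleSearchBot", "https://example.com/private/drafts"
)

# The made-up AI crawler is disallowed from the whole site.
ai_allowed = parser.can_fetch(
    "ExampleAIBot", "https://example.com/articles/ai-news"
)

print(search_allowed, private_allowed, ai_allowed)
```

The rules are keyed on the crawler's self-declared name, so a publisher can only opt out of crawlers it knows about and that respect the file.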

AI companies generally have free rein to utilise data, in a process known as “data scraping”, to train their systems.

Some experts say this supports progress and prevents anything from undermining the advancement of AI.

But Google concluded its post by stating that there needs to be a “balance” so every party involved is satisfied with how content and data are accessed and leveraged.
