Skip to main content

Publishers are blocking the Internet Archive for fear AI scrapers can use it as a workaround

The Internet Archive has often been a valuable resource for journalists, from it's finding records of deleted tweets or providing academic texts for background research. However, the advent of AI has created a new tension between the parties. A few major publications have begun blocking the nonprofit digital library's access to their content based on concerns that AI companies' bots are using the Internet Archive's collections to indirectly scrape their articles.

"A lot of these AI businesses are looking for readily available, structured databases of content," Robert Hahn, head of business affairs and licensing for The Guardian, told Nieman Lab. "The Internet Archive’s API would have been an obvious place to plug their own machines into and suck out the IP."

The New York Times took a similar step. "We are blocking the Internet Archive's bot from accessing the Times because the Wayback Machine provides unfettered access to Times content — including by AI companies — without authorization," a representative from the newspaper confirmed to Nieman Lab. Subscription-focused publication the Financial Times and social forum Reddit have also made moves to selectively block how the Internet Archive catalogs their material.

Many publishers have attempted to sue AI businesses for how they access content used to train large language models. To name a few just from the realm of journalism:

Other media outlets have sought financial deals before offering up their libraries as training material, although those arrangements seem to provide compensation to the publishing companies rather than the writers. And that's not even delving into the copyright and piracy issues also being fought against AI tools by other creative fields, from fiction writers to visual artists to musicians. The whole Nieman Lab story is well worth a read for anyone who has been following any of these creative industries’ responses to artificial intelligence.

This article originally appeared on Engadget at https://ift.tt/waxuPFR

from Engadget is a web magazine with obsessive daily coverage of everything new in gadgets and consumer electronics https://ift.tt/waxuPFR
via IFTTT

Comments

Popular posts from this blog

Instagram accidentally reinstated Pornhub’s banned account

After years of on-and-off temporary suspensions, Instagram permanently banned Pornhub’s account in September. Then, for a short period of time this weekend, the account was reinstated. By Tuesday, it was permanently banned again. “This was done in error,” an Instagram spokesperson told TechCrunch. “As we’ve said previously, we permanently disabled this Instagram account for repeatedly violating our policies.” Instagram’s content guidelines prohibit  nudity and sexual solicitation . A Pornhub spokesperson told TechCrunch, though, that they believe the adult streaming platform’s account did not violate any guidelines. Instagram has not commented on the exact reasoning for the ban, or which policies the account violated. It’s worrying from a moderation perspective if a permanently banned Instagram account can accidentally get switched back on. Pornhub told TechCrunch that its account even received a notice from Instagram, stating that its ban had been a mistake (that message itse...

MVP versus EVP: Is it time to introduce ethics into the agile startup model?

Anand Rao Contributor Share on Twitter Anand Rao is global head of AI at PwC . The rocket ship trajectory of a startup is well known: Get an idea, build a team and slap together a minimum viable product (MVP) that you can get in front of users. However, today’s startups need to reconsider the MVP model as artificial intelligence (AI) and machine learning (ML) become ubiquitous in tech products and the market grows increasingly conscious of the ethical implications of AI augmenting or replacing humans in the decision-making process. An MVP allows you to collect critical feedback from your target market that then informs the minimum development required to launch a product — creating a powerful feedback loop that drives today’s customer-led business. This lean, agile model has been extremely successful over the past two decades — launching thousands of successful startups, some of which have grown into billion-dollar companies. However, building high-performing product...

Watch Aidy Bryant *completely* lose it as 'SNL' roasts political pundits

On Saturday Night Live , there are breaks and then there's whatever happened here. The Season 45 premiere featured a sketch that was meant to expose the empty noisemaking of political punditry on TV. But part of the joke involved a series of quick costume changes, and some weirdness during one of those switches led to a complete and total breakdown. Aidy Bryant, the segment's host, couldn't take it. She manages to keep it together until what appears to be an accidental wide shot exposes some of the magic as we see a woman who's probably a member of the SNL wardrobe crew fiddling with Aidy's costume. Read more... More about Saturday Night Live , Aidy Bryant , Entertainment , and Movies Tv Shows from Mashable https://ift.tt/2okrAOq via IFTTT