Publishers Sue Tech Giants Over AI Training Data
A new class-action lawsuit filed in California federal court targets Google, Meta, and Perplexity AI, alleging they illegally used copyrighted web content to build their artificial intelligence systems. The suit, brought by a group of website operators, claims the companies scraped text and images from thousands of sites without permission or payment to train models like Gemini, LLaMA, and Perplexity's answer engine.
The complaint argues that this systematic data collection violated copyright law and the Computer Fraud and Abuse Act, and alleges that the companies' crawlers ignored standard publisher restrictions such as robots.txt files. It suggests Perplexity's answer engine is particularly problematic because it generates summaries that closely mirror original articles, potentially eliminating the need for users to visit the source websites.
This case arrives amid mounting legal challenges to AI training practices. Unlike suits from individual publishers or authors, this action represents smaller website operators who collectively produce a significant portion of web content. The defendants will likely argue that their use of the material is transformative and therefore fair use. Plaintiffs counter that AI-generated answers directly substitute for the original content, harming publishers' traffic and revenue.
The outcome could influence how AI companies gather training data, potentially moving the industry toward licensed content agreements. For publishers, the case addresses a fundamental threat: AI systems using their work while diverting the audience that supports its creation. As the legal process begins, the suit underscores growing tension between rapid AI development and the rights of content creators.
Source: Webpronews