Meta Platforms (META) shares edged lower in early trading after a group of major publishers filed a sweeping lawsuit accusing the company of using copyrighted books and academic materials without permission to train its Llama AI models.
The legal action, filed in Manhattan federal court on May 5, has intensified concerns around how leading tech firms source training data for generative artificial intelligence systems.
The plaintiffs include major academic and publishing houses such as Elsevier, Cengage, Hachette, Macmillan, and McGraw Hill, alongside author Scott Turow. The case alleges that Meta used millions of protected works, including textbooks, scientific research papers, and well-known novels, to develop its AI systems without securing proper licensing agreements.
According to the complaint, the training datasets spanned educational textbooks, scientific literature, and fiction titles. Among the works cited are N.K. Jemisin’s The Fifth Season and Peter Brown’s The Wild Robot, examples that illustrate the breadth of content said to have been incorporated into Meta’s AI training pipeline.
The publishers argue that the unauthorized use of these works violates intellectual property rights and are seeking damages as well as broader legal protections for content owners whose materials may have been used in AI development without consent.
The lawsuit adds further pressure to the growing global debate over whether the use of copyrighted material for AI training qualifies as “fair use.” Tech companies, including Meta, have repeatedly argued that training AI models on large-scale datasets is transformative and falls under fair use protections.
However, creators and publishers strongly disagree, claiming that such practices effectively replicate protected content without compensation. The case joins similar lawsuits against OpenAI and Anthropic, signaling a widening legal front in the AI industry. A recent settlement involving Anthropic, worth approximately $1.5 billion, suggests that the distinction between lawfully sourced data and pirated material may carry weight in future rulings.
Beyond the courtroom claims, earlier reporting has highlighted additional concerns about Meta’s data sourcing practices. Court filings in related cases suggest the company may have accessed datasets from shadow libraries such as LibGen and Anna’s Archive, with allegations that tens of terabytes of data were obtained through torrenting channels.
Internal discussions reportedly revealed concerns among researchers and engineers about the ethical implications of using such datasets. Some employees allegedly raised objections to the use of pirated material, while others debated whether licensing individual works would undermine Meta’s broader fair use defense strategy.
The post Meta (META) Stock; Edges Lower as Publishers Sue Over Llama AI Training Data Claims appeared first on CoinCentral.


