Nearly 400 Local Newspapers Sue Microsoft, OpenAI Over Alleged Theft of Journalism to Train AI

Thirty-four newspaper companies that own nearly 400 local and regional publications across 33 states sued Microsoft Corp. and OpenAI in Manhattan federal court on Wednesday, alleging the companies unlawfully copied hundreds of thousands of copyrighted news articles to train ChatGPT and Copilot without permission or compensation.

The complaint, filed June 24 in the Southern District of New York, alleges Microsoft and OpenAI systematically scraped publishers’ websites, including paywalled content, copied articles onto company servers, stripped author credits, copyright notices, and other ownership information, then used the material to train large language models. The publishers assert claims under the Copyright Act and the Digital Millennium Copyright Act, seeking statutory damages and an injunction requiring the defendants to remove their works from every model and training dataset. Richner Communications Inc. et al. v. Microsoft Corp. et al., 1:26-cv-05320 (S.D.N.Y., filed June 24, 2026).

Lead plaintiff Richner Communications Inc., the Garden City publisher of the Long Island Herald, is joined by the Arkansas Democrat-Gazette, the Santa Fe New Mexican, The New York Amsterdam News, and dozens of other publishers. Many are family-owned businesses, and some have been published for more than a century. They are represented by Matthew J. Platkin’s firm, Platkin LLP, of Belleville, New Jersey.

The publishers cast the dispute in constitutional terms, arguing Congress has protected authors’ exclusive rights since the nation’s founding and that the defendants rely on those same protections to shield their own code and models. “In bringing this action, the Publishers seek to hold Defendants to the same standard they insist upon for themselves,” the complaint states.

Central to the case is the allegation that the defendants acted willfully. The publishers cite OpenAI CEO Sam Altman’s testimony before the British House of Lords, in which he conceded it would be “impossible to train today’s leading AI models without using copyrighted materials.” They also allege Microsoft built and operated a dedicated Azure supercomputer exclusively for OpenAI and was “intimately involved in every step” of model training.

The publishers contend the scope of the alleged infringement is documented, not speculative. A technologist working for their counsel found that an open-source approximation of OpenAI’s WebText dataset contained millions of tokens drawn from the publishers’ websites, and that C4, a filtered subset of a 2019 Common Crawl snapshot, contained more than 115 million tokens of their content. Ogden Newspapers alone accounted for more than 71 million tokens.

The complaint devotes particular attention to the DMCA claim, alleging OpenAI used two text-extraction tools, Dragnet and Newspaper, because they separate article text from the “chrome” containing copyright notices and terms of use. By removing that information, the publishers argue, the defendants severed the connection between the content and its owners while concealing the alleged infringement.

The publishers characterize the stakes as existential. They allege the defendants divert traffic that otherwise would reach their websites, reducing subscription and advertising revenue while undermining the licensing market at a time when local journalism already operates in what the complaint calls “a fragile economic environment.” Citing Pew Research Center data, they note that 71% of Americans say their local news outlets report the news accurately.

The complaint notes that OpenAI has a post-money valuation of $852 billion and confidentially filed for an initial public offering in June 2026. Microsoft, OpenAI’s largest shareholder, has invested roughly $13 billion.

The publishers acknowledge their claims mirror lawsuits already pending before the same court, including actions brought by The New York Times, the New York Daily News, and the Authors Guild that have been consolidated into multidistrict litigation. They emphasize those cases survived motions to dismiss “largely intact,” arguing this is not “a case of first impression.”

The publishers demand a jury trial.

