A federal judge in New York has dismissed a copyright lawsuit filed by the online news outlets Raw Story and AlterNet against artificial intelligence company OpenAI. The lawsuit alleged that OpenAI used articles from these outlets without permission to train its language models, including the popular chatbot ChatGPT.
U.S. District Judge Colleen McMahon ruled that Raw Story and AlterNet had failed to show the kind of harm needed to support their claims. Although the judge allowed the publishers to file a new complaint, she expressed skepticism about their ability to show a cognizable injury. Matt Topic, attorney for Raw Story, expressed confidence that an amended complaint could address the court’s concerns.
The case is part of a broader trend of legal actions targeting AI companies for using copyrighted material to train their systems. Similar lawsuits have been filed by various authors, artists, and media organizations, raising questions about the legality of data scraping by technology firms.
The Lawsuit
Raw Story and AlterNet had accused OpenAI of violating Section 1202(b) of the Digital Millennium Copyright Act (DMCA). This provision protects “copyright management information” (CMI), such as author names and titles, by prohibiting its removal or alteration without authorization. The plaintiffs argued that OpenAI stripped CMI from their articles when using the content to develop ChatGPT, thereby violating the statute.
Judge McMahon, however, found that the plaintiffs had not adequately shown that the removal of CMI and the use of their articles caused them concrete harm. She noted that the plaintiffs’ claim centered on the exclusion of CMI rather than direct copyright infringement. “The harm cited by the outlets is not the type of harm that has been elevated to a level that would justify the lawsuit,” McMahon commented.
The judge also highlighted the difficulty of tracing specific output from AI models back to individual sources. She pointed out that iterative improvements in large language models (LLMs) make it unlikely that content will be reproduced verbatim, which weakens the plaintiffs’ argument that their specific articles were directly infringed.
Implications for the AI Industry
The dismissal of Raw Story and AlterNet’s lawsuit is a significant moment in the ongoing debate over AI and copyright law. It raises important questions about how far AI companies can use publicly available content for training without explicit permission or compensation.
Plaintiffs in similar cases, such as The New York Times’ lawsuit against OpenAI and the Doe 1 v. GitHub case involving the Copilot coding assistant from Microsoft-owned GitHub, have struggled to establish successful claims under existing copyright laws. These cases suggest that current legal frameworks may not adequately address the complexities introduced by generative AI technologies.