AI Copyright Chaos
What the future might look like


Adrian Parlow · Co-Founder & CEO
March 20, 2025

Why AI and Copyright Law Is a Minefield
The more I dig into the Thomson Reuters v. ROSS Intelligence case, the more I realize how significant its implications could be for the future of AI.
For context…
In May 2020, Thomson Reuters sued ROSS Intelligence for using Westlaw’s copyrighted headnotes — summaries of key points from court opinions — to train its AI-powered legal search engine. Last month, Judge Stephanos Bibas (who, once upon a time, was one of my law school professors) determined that ROSS infringed 2,243 headnotes, rejecting its fair use defense because ROSS’s use was commercial, aimed at competing with Westlaw, and did not create anything new.
I don’t claim expertise on whether the ruling is legally correct, but I do have thoughts on what it could mean for the AI industry.
One of the Biggest Questions in the AI Industry
One of the biggest question marks hanging over the AI industry right now?
How do we handle copyright?
If you take a step back and look at the companies behind most large language models (OpenAI, Anthropic), they blatantly train their models on data they don't have permission to use.
Remember when the New York Times sued OpenAI and Microsoft over AI use of its copyrighted work back in 2023?
What happens in Thomson Reuters v. ROSS — along with additional case law that follows — will play a major role in shaping AI development in the future.
If we end up with a restrictive ruling (like the one we currently have):
- Content holders will require licenses to use their data, making it expensive to develop anything AI-related.
- The legal research duopoly we talked about last week will strengthen, making it even more challenging for startups to break into the market.
- AI labs will slow down because of legal red tape and inevitable licensing costs.
However, if we end up with a more permissive ruling:
- New players could continue scraping data from the internet and innovating.
- Content holders will have to get creative or strike deals with builders.
- AI development will speed up overall.
Bottom Line
We’re probably in for more fights like ROSS — courts will keep hashing this out, and it’ll set the stage for the future of AI.
However, if things keep trending in this direction, it will inevitably become harder for new entrants to break into the industry.
Earlier this week, I was chatting with one of the most prominent up-and-coming legal tech startups. They are struggling to get legal research data because Harvey has locked up Lexis (via an investment from Lexis's parent company, RELX) and Thomson Reuters has a competing product in CoCounsel.
So what are their options?
Their only options are to spend years building up the data from scratch or to scrape it from others, opening themselves up to being sued out of existence. Not a great set of choices.
For the broader AI industry, I hope we don't end up in a world where scraping data to train AI models constitutes copyright infringement.
This is a challenging balance to strike. Content owners deserve to be compensated for the use of their content, but an overly restrictive definition of infringement risks slowing down the entire industry. In my opinion, we need rules that keep innovation alive while fairly compensating data owners. What do you think?