The Quiet Shift: Beyond Copyright in AI Lawsuits
Over 70 infringement lawsuits by copyright owners against AI companies have dominated headlines since 2023. But a sharper look at recent case filings reveals something more dangerous for AI builders: courts are beginning to recognize claims that transcend copyright—data misappropriation, trade secret theft, and violations of confidentiality obligations.
A proposed class action against Figma alleges the company used customers' design files to train AI without consent, focusing on misappropriation of confidential information and broken data promises rather than pure copyright. This case matters because it doesn't ask "is AI training transformative?"—it asks "did you violate our trust?"
That's a different legal game entirely. And it's accelerating.
The Copyright Wars Are Still Raging—But Fractured
Let's be clear: copyright litigation is far from over. Universal Music Publishing Group, Concord Music Group, and ABKCO Music filed a $3.1 billion lawsuit against Anthropic on January 28, 2026, alleging Anthropic built Claude AI on a foundation of torrented piracy. Writers including Pulitzer Prize-winning journalist John Carreyrou filed a copyright lawsuit accusing six AI giants of using pirated copies of their books to train large language models.
But the courts themselves are fractured on fair use. In June 2025, in Bartz v. Anthropic PBC, Judge William Alsup granted summary judgment for Anthropic, holding that using copyrighted books to train Anthropic's "Claude" large language models was lawful as a "spectacularly" transformative fair use. Yet in Thomson Reuters v. Ross Intelligence, a different federal court granted summary judgment in favor of Thomson Reuters, finding that Westlaw's headnotes were original and protected and that Ross Intelligence's use of them to train its AI legal research tool was not fair use.
These decisions point to a deeper issue: fair use for AI training depends heavily on how you got the data. This is where trade secret and confidentiality claims explode into focus.
Data Provenance as the New Liability Layer
Data provenance and licensing will decide many copyright disputes. But it's not just a copyright question—it's a trust and confidentiality question.
Consider again the Figma class action: its core theory is misappropriation of confidential customer files and broken data promises, not copyright. Meanwhile, OverDrive v. OpenAI accuses OpenAI of trademark infringement for naming its video model "Sora" in a way that allegedly conflicts with OverDrive's existing "Sora" library app.
These aren't peripheral cases. They represent the frontier of AI liability in 2026:
- Trade secret theft claims expose AI companies to liability for misappropriating nonpublic information shared in confidence
- Broken data promises target companies that explicitly agreed not to train models on customer data, then did it anyway
- Trademark conflicts emerge when AI models generate outputs that infringe existing brands or protected identities
Why Data Provenance Will Matter More Than Fair Use
Courts will focus more on how data was gathered: whether it was pirated, and whether acquiring or using it violated contractual agreements.
This is critical. The Anthropic settlement demonstrates the financial stakes: Because statutory copyright damages can reach US$150,000 per infringed work, Anthropic's exposure theoretically extended into the hundreds of billions of dollars, raising existential risk for the company. But statutory damages assume infringement is proven. Trade secret and confidentiality claims don't require proving copyright ownership—they require proving breach.
And breach is easier to prove than fair use.
In practical terms, this means:
- AI builders must document data sources meticulously. If you can't prove you obtained data legitimately, you're exposed.
- Customer contracts now matter more than copyright law. If a customer's terms say "don't train on our data," that's a contractual obligation, not a fair use question.
- Confidentiality agreements are becoming IP tools. Companies like Reddit are profiting precisely because they've negotiated explicit licensing deals, not relying on fair use exemptions.
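The first of those points can be made concrete. Below is a minimal sketch, in Python, of what a per-source provenance record might look like. The class, its field names, and the `audit` helper are hypothetical illustrations, not a standard schema or any company's actual compliance tooling.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical sketch: fields and names are illustrative only, not a standard schema.
@dataclass
class ProvenanceRecord:
    """One training-data source, with the facts a court would ask about."""
    source: str                   # where the data came from
    license: str                  # e.g. "CC-BY-4.0", "commercial-license", "none"
    acquired: date                # when the data was obtained
    consent_documented: bool      # did the owner agree to training use?
    contract_bars_training: bool  # do the owner's terms forbid training?

    def is_defensible(self) -> bool:
        # A source holds up only if consent exists and no contract forbids training.
        return self.consent_documented and not self.contract_bars_training

def audit(records: list[ProvenanceRecord]) -> list[str]:
    """Return the sources that would be hard to defend in litigation."""
    return [r.source for r in records if not r.is_defensible()]
```

The design point: every question a plaintiff's lawyer would ask (where did this come from, who consented, what did the contract say) maps to a field you can produce on demand.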
The Licensing Pivot: Why Settlements Are Just Beginning
The biggest lawsuit development of 2025 was the $1.5 billion settlement in the Bartz v. Anthropic case—a case in which Anthropic faced a potentially massive statutory damages penalty for downloading millions of pirated copies of works it used for training.
But licensing is accelerating faster than litigation. Warner Music Group and Suno announced a settlement of their lawsuit, with Suno launching an entirely new model in 2026 built on "more advanced and licensed models" while current models are phased out.
This pattern reveals the 2026 playbook: litigation → settlement → licensing → profitability.
The key insight? The AI copyright issue around mass-scale data scraping will be effectively resolved through a combination of private settlements, licensing deals, and micropayments. This isn't about copyright anymore—it's about negotiating fair rates and building trust.
The Supreme Court's Authorship Ruling Changes the Game
One more critical piece: The US Supreme Court denied certiorari on March 2, 2026, thereby reaffirming human authorship as a foundational requirement of US copyright law, even in this era of rapid AI advancement.
This has implications beyond copyright. If you want IP protection, there must be a human in the process. Careful documentation of human contributions to the creative and inventive process is more critical than ever.
For builders of AI, this means: if your system generates output without meaningful human authorship, that output has no copyright protection. That shifts liability questions backward to the training stage, where data provenance becomes paramount.
What This Means for AI Companies in 2026
Build a documented data-provenance strategy. Now is the time to professionalize your AI compliance program.
The companies winning in 2026 aren't betting on fair use. They're:
- Licensing explicitly. The New York Times' deal with Amazon is reportedly worth $20 million to $25 million. Google's deal with Reddit covers use of Reddit's user-generated content for training its Gemini models.
- Building transparency mechanisms. Figma's lawsuit has forced the industry to ask: what data are you using, where did it come from, and did you have permission?
- Documenting human involvement. If a human provides significant creative input—such as editing, arranging, or selecting AI-generated elements—a work might be eligible for copyright protection. The extent of human involvement and the level of control exerted by human creators are crucial factors.
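The last point, documenting human involvement, can also be sketched. The following Python fragment is a hypothetical illustration of a contribution log; the `SUBSTANTIVE_ACTIONS` set and the screening heuristic are assumptions for demonstration, not Copyright Office criteria.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical sketch: field names and the heuristic below are illustrative only.
@dataclass
class HumanContribution:
    author: str
    action: str        # e.g. "edited", "arranged", "selected", "prompted"
    description: str   # what the human actually did, in their own words
    timestamp: datetime

# Assumption: editing, arranging, and selecting are the kinds of creative acts
# that may support authorship; mere prompting likely is not.
SUBSTANTIVE_ACTIONS = {"edited", "arranged", "selected"}

def has_documented_authorship(log: list[HumanContribution]) -> bool:
    """Rough screen: does the log show at least one substantive creative act?"""
    return any(c.action in SUBSTANTIVE_ACTIONS for c in log)
```

A log like this does not guarantee protection, but it preserves the evidence of "the extent of human involvement" that the authorship analysis turns on.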
Key Takeaways
- Copyright battles are shifting ground. Fair use is contested, but data provenance and confidentiality breaches are increasingly actionable. The Figma case signals courts are willing to hear non-copyright claims.
- Settlements are setting rates, not resolving principles. There is no sign of training data licensing agreements between large rights owners and AI companies slowing down anytime soon. Expect more licensing deals than final court verdicts.
- Trade secrets and confidentiality are the new liability vector. If you promised customers you wouldn't train on their data, that's a contract violation—no fair use exception applies.
- Data provenance now outweighs fair use doctrine. Courts are likely to clarify how they distinguish 'transformative' training from substitutive uses, especially when models are general-purpose rather than direct competitors. But the easiest win for plaintiffs is proving you got the data illegally.
- Human authorship requirements raise the stakes. Human-contributed, AI-assisted works may be protectable, but in the case the Supreme Court declined to hear, Dr. Thaler explicitly disclaimed any human creative input, so the courts never addressed how much human involvement is necessary. AI builders must document human contributions or lose IP protection entirely.
References
- AI in litigation series: An update on AI copyright cases in 2026 — Norton Rose Fulbright, March 2026
- Generative AI Lawsuits Timeline — Sustainable Tech Partner, March 2026
- Two U.S. Courts Address Fair Use in Generative AI Training Cases — Jones Day, June 2025
- AI Copyright Lawsuit Developments in 2025: A Year in Review — Copyright Alliance, January 2026
- AI Lawsuits in 2026: Settlements, Licensing Deals, Litigation — AIBusiness, February 2026
- The Final Word? Supreme Court Refuses to Hear Case on AI Authorship and Inventorship — Holland & Knight, March 2026
- The Year in AI Law: 2025's Biggest Legal Cases and What They Mean for 2026 — Internet Lawyer Blog, December 2025
- AI, Copyright, and the Law: The Ongoing Battle Over Intellectual Property Rights — USC IP & Technology Law Society, February 2025
- Court Rules AI Training on Copyrighted Works Is Not Fair Use — Davis+Gilbert LLP, February 2025
- 85 Predictions for AI and the Law in 2026 — National Law Review, December 2025