Licensing AI music: the industry is focusing on the wrong problem
Licensing AI outputs misses the point; the real issue is training, which uses vast amounts of copyrighted material without permission.
Title image source: www.musically.com
The wrong layer of the problem
Virginie Berger
The music industry is focusing on the wrong layer of the problem. The central debate around generative AI is framed in terms of outputs: who owns a song, who gets credited if a melody sounds similar. But generative AI doesn’t work like human inspiration. These systems are not inspired by individual songs; they’re trained on massive datasets, ingesting thousands or millions of tracks to learn musical patterns, structures, and styles.
This means that every output is a reflection of everything the model has been trained on, not a direct copy of one input, but a statistical remix of all prior ingestion. Licensing one song here and there after the model is built misses the point entirely. The damage happens during training, not when an output is published.
And here’s the reality: scraping millions of tracks is not technically necessary. According to David Hughes (former CTO of the RIAA and Chief Strategist of the AI:OK initiative), whom I spoke to while writing this article, experiments with GenAI music systems showed they could perform effectively across styles and genres using fewer than 100,000 high-quality tracks. These tests, conducted with a model developed to create background and ambient music for use in public venues, demonstrated that a representative sample of music, not exhaustive scraping, is enough for strong performance.
As Hughes also reminded me, Pandora once had only 1.6 million tracks in its catalog, and less than half of those generated over 90% of the platform’s revenue. The reality is that a tiny percentage of commercial music accounts for the vast majority of listening and revenue, and generative AI models don’t need access to all of it. In fact, a small, well-curated and well-labeled fraction of one percent of commercially released music could be sufficient to train capable systems.
So why do generative AI companies scrape the entire internet? Not to improve model performance, but to build a legal shield. If they admitted they only needed a fraction of commercially valuable music, they’d be forced to negotiate licenses for that subset. Instead, they claim they “need it all” to justify mass scraping and avoid paying for what they use. It’s a strategic defense, not a technical requirement.
“A fraction of one percent of commercially released music could be sufficient to train an AI model”
Virginie Berger
Why Licensing Outputs Is a Dead End
Generative AI models don’t forget. Once trained, they reuse absorbed patterns endlessly, generating outputs at scale that compete directly with human-created music. Licensing outputs after training is too late; by then, the damage is done.
Even if rights holders recover $50 million in back licensing fees, they might lose $500 million in future income from streaming, sync deals, or fan engagement. Deezer has already become a case study: the platform revealed that 18% of the music uploaded to it is fully AI-generated, cannibalizing artist streams and reducing music to disposable background content.
Why the Value of Music Is Collapsing
Music is being treated not as art but as a data source. Once a model is trained, artists face infinite synthetic competition. And now, with the rapid development of source-separation tools, AI systems no longer need isolated stems to learn or generate from. They can extract individual elements (vocals, drums, melodies) from fully mixed tracks and use them to create new works.
This process happens silently, without detection, and without consent. A vocal line or beat can be pulled from a hit single, embedded in training, and then recombined into outputs with no traceable origin. That makes the traditional idea of licensing individual recordings or stems entirely obsolete. Source separation makes every mixed song a potential dataset, available for silent reuse, remixing, and deconstruction. And the industry, for the most part, is acting like it doesn’t see this coming.
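To make concrete how accessible this has become, here is a minimal sketch using the open-source Spleeter library. The file names are placeholders, and this illustrates the general capability rather than any particular company's pipeline: a fully mixed audio file is split into separate vocal, drum, bass and accompaniment stems in a few lines of Python.

# Minimal sketch: splitting a fully mixed track into stems with the open-source
# Spleeter library. File paths are placeholders; this shows how low the barrier is,
# not how any specific AI training pipeline works.
from spleeter.separator import Separator

separator = Separator("spleeter:4stems")  # pre-trained model: vocals, drums, bass, other

# Writes vocals.wav, drums.wav, bass.wav and other.wav under output/mixed_track/
separator.separate_to_file("mixed_track.mp3", "output/")

Once stems exist in isolation, each one can be treated as its own training asset, which is exactly why a mixed release no longer offers any practical protection.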
“The traditional idea of licensing individual recordings or stems [may be] entirely obsolete”
Virginie Berger
The Real Problem Is at the Training Stage
We’re still stuck in the wrong frame. The music industry keeps talking about authorship, outputs, and melodic similarity, as if that’s where the value transfer occurs. But the real transfer happens during training.
These systems are trained on billions of tokens. In AI, a token is the smallest unit a model processes, roughly comparable to a syllable of audio or a fragment of notation. If 15% of those tokens come from copyrighted music, then those works have materially shaped the model’s capabilities. That influence is measurable, and it’s valuable.
Compensation should not be based on whether an output sounds similar to a particular song or drew on a particular song. It should be based on how much a given work contributed to the training process. This isn’t about tracing a melody; it’s about acknowledging the industrial-scale ingestion of creative labor.
Once a system has absorbed that labor, it can generate endless outputs, whether or not they resemble the original. That’s why attribution models (influence functions, embeddings, watermarking) fall short: they chase similarity or ownership instead of recognizing contribution.
A Better Model: Compensation by Ingestion, Not Resemblance
We need to treat training data as a resource, not a free-for-all. That means building frameworks that reflect how GenAI actually works, not how we wish it worked. Licensing must happen at the point of ingestion, before models are trained. Token-based accounting should track how much protected content is used, and micropayments should reflect the value and influence extracted during training, not just the outputs.
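To make the accounting idea concrete, here is a minimal sketch in Python of what an ingestion ledger could look like, with invented work identifiers and token counts: each protected work is logged with the number of tokens it contributed to the training corpus, and those counts later become the basis for payouts. It illustrates the logic, not any existing standard.

from collections import defaultdict

# Hypothetical ingestion ledger: tokens contributed to the training corpus, per work.
ledger = defaultdict(int)

def log_ingestion(work_id: str, token_count: int) -> None:
    # Record how many tokens of a protected work were ingested during training.
    ledger[work_id] += token_count

# Invented examples; real counts would come from the tokenizer or audio codec
# used to prepare the training data.
log_ingestion("WORK-0001", 48_000)
log_ingestion("WORK-0002", 12_500)
log_ingestion("WORK-0001", 6_000)  # the same work can appear in several data shards

total = sum(ledger.values())
for work, tokens in ledger.items():
    print(work, tokens, f"{tokens / total:.1%} of ingested protected tokens")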
We already have a precedent for this: the private copy levy. When individual copying couldn’t be tracked, such as when people copied music onto tapes or CDs, rights holders created a compensation system based on volume and behavior. Instead of chasing every copy, a flat fee was collected on blank media and device sales and redistributed to creators.
The same logic must now apply to AI. If we know models are trained on protected content, we don’t need to trace every melody; companies should pay based on what they ingested. And a share of AI-related revenues should be redirected into cultural funds that support real human creators, not just the largest rights holders.
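Following the levy analogy, here is a short sketch of how the money could flow, with all figures invented for illustration: a fixed share of AI-related revenue is carved out for a cultural fund, and the rest is distributed pro rata according to each work's ingestion share.

# Hypothetical revenue split based on an ingestion ledger (all values invented).
ledger = {"WORK-0001": 54_000, "WORK-0002": 12_500}  # tokens ingested per work

REVENUE_POOL = 1_000_000.00   # AI-related revenue subject to the levy, per period
CULTURAL_FUND_SHARE = 0.20    # carve-out for a fund supporting human creators

cultural_fund = REVENUE_POOL * CULTURAL_FUND_SHARE
distributable = REVENUE_POOL - cultural_fund
total_tokens = sum(ledger.values())

payouts = {work: distributable * tokens / total_tokens for work, tokens in ledger.items()}

print(f"Cultural fund: {cultural_fund:,.2f}")
for work, amount in payouts.items():
    print(f"{work}: {amount:,.2f}")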
But even if we create a licensing system for ingestion, a new challenge appears: what guarantees do we have that models trained on licensed works won’t have their capabilities quietly embedded into other models? Post-training reuse is a real and unresolved issue. Once trained, model knowledge can be exported, fine-tuned, or used as foundational layers for new systems, without needing to relicense the original data. So unless strong auditing and transparency standards are enforced, even licensed ingestion risks becoming a one-time payment for infinite downstream exploitation.
“If we know models are trained on protected content, we don’t need to trace every melody, they should pay based on what they ingested”
Virginie Berger
We Also Need Platform Responsibility and Structural Reform
Platforms must label AI-generated content clearly. Audiences deserve to know whether they’re listening to music made by a person or generated by a machine. We should also use existing tools like the ISCC (International Standard Content Code) to create traceability and transparency. The ISCC is a unique digital fingerprint that identifies a piece of content (a song, video, or image) based on its actual data, not its title or creator. It works like a smart ID that can track and match content across platforms, even if the content has been slightly changed or renamed.
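As a toy illustration of what a content-derived code buys you (this is not the actual ISCC algorithm, just a simplified similarity-preserving hash over raw bytes), the sketch below produces a short code that barely moves when a file is slightly edited, but lands far away for unrelated content, which is what makes matching across platforms possible even after renaming or light modification.

import hashlib, os

def content_code(data: bytes, chunk_size: int = 4096, bits: int = 64) -> int:
    # Toy similarity-preserving code: hash fixed-size chunks and majority-vote each
    # bit, so a small local edit changes only a few chunks and barely moves the code.
    counts = [0] * bits
    for i in range(0, len(data), chunk_size):
        digest = hashlib.sha256(data[i:i + chunk_size]).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        for b in range(bits):
            counts[b] += 1 if (h >> b) & 1 else -1
    return sum(1 << b for b in range(bits) if counts[b] > 0)

def distance(a: int, b: int) -> int:
    # Hamming distance between two codes: lower means more similar content.
    return bin(a ^ b).count("1")

original = os.urandom(512_000)        # stand-in for a mixed track's raw bytes
edited = bytearray(original)
edited[10_000:10_050] = b"\x00" * 50  # a tiny local change
unrelated = os.urandom(512_000)       # a different work entirely

print(distance(content_code(original), content_code(bytes(edited))))   # small distance
print(distance(content_code(original), content_code(unrelated)))       # large distance

The real ISCC is more sophisticated, combining several code units derived from a file's metadata and content, but the principle is the same: the identifier comes from the content itself, so it survives retitling and minor edits.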
Current research on content similarity is mostly being done by developers on a voluntary basis, and it is astonishing that rights holders haven’t invested in these standards more proactively. The same goes for tools like HarmonyCloak, developed by researchers at the MoSIS Lab at the University of Tennessee, Knoxville, and designed to make music files essentially unlearnable to generative AI models. The music industry should invest heavily in these developments.
The U.S. legal system is also beginning to wake up. In a copyright lawsuit involving Meta, U.S. District Judge Vince Chhabria stated that Meta was “destined to fail” if the plaintiffs could show that AI tools trained on copyrighted works were producing outputs that undermined the market for those works. As he put it: “You have companies using copyright-protected material to create a product that is capable of producing an infinite number of competing products.” And he directly challenged Meta’s defence of ‘fair use’, saying, “I just don’t understand how that can be fair use.”
The implication is clear: the courts are starting to recognise that the core harm occurs during training, not just in the outputs. But the industry itself must act faster.
Protect the Inputs or Lose the Industry
This isn’t about influence or inspiration in the human sense. This is about industrial reuse of creative labor at scale, without consent, compensation, or visibility. If we keep licensing outputs after the fact, we are not regulating GenAI; we’re endorsing it, much as opt-out schemes effectively endorse AI training without consent.
The music industry must shift focus now: toward regulating ingestion, labelling AI content, and investing in payment systems that reflect influence, not just surface-level similarity.
If we fail to protect the training stage, every artist will eventually compete with a system trained on their own work. And no creative economy, no matter how strong, can survive that.
Article Originally Posted on www.musically.com