The Scale Is Real

By mid-2025, generative AI had stopped being a laboratory question and become a distribution question. Suno, launched in late 2022, had reached roughly 25 million users. Udio, launched in 2024, drew 600,000 users in its first two weeks and produced tracks at a rate of roughly ten per second. Deezer reported that, by April 2025, more than 20,000 fully AI-generated tracks were uploaded to its platform every day (eighteen percent of all new uploads, nearly double the share from a few months earlier).

Adoption had moved past the curiosity phase. A track generated on Suno and slipped into Spotify playlists by a researcher had accumulated over 64,000 plays before anyone in the curation chain flagged it as synthetic. Audiences were not rejecting AI-generated music when they encountered it. They were not, in most cases, encountering it knowingly.

A 2024 survey of more than 15,000 music creators found that 35 percent had already used some form of AI in their practice, rising to 51 percent among those under thirty-five. The same survey found that 71 percent expected AI’s growth to make earning a living from music impossible. The two findings are not contradictory. They describe the same condition: tools adopted because they work, inside a value system whose collapse is being simultaneously anticipated.


What the Tools Changed

Most accounts of AI music collapse the field into a single image: a prompt, a song, a platform. That image is not wrong, but it is narrow. By 2025 the tool landscape was broader, and structurally more consequential, than the consumer-facing apps suggested.

Stem separation, pioneered by open-source work like Deezer’s Spleeter and refined in commercial products, had become routine. The reconstructed Beatles track “Now and Then,” released in November 2023, was built on AI-assisted isolation of John Lennon’s vocal from a lo-fi 1970s cassette demo. What had been a studio impossibility became a laptop task.

Voice cloning had moved from proof-of-concept to consumer-level imitation in under two years. Grimes’s offer of 50 percent royalties to anyone using an AI version of her voice, and Holly Herndon’s release of Holly+ as a free instrument, represented one response: treat the voice as a licensable asset and try to set terms. The anonymous “Ghostwriter” Drake/Weeknd track, which drew millions of streams before labels forced it down, represented the other: take without asking, and let the legal system figure it out afterward.

Audio inpainting and extension turned generation from a one-shot prompt into an iterative editing medium. Udio’s Extend and Inpaint functions, Sony CSL’s Diff-A-Riff research, and Google Magenta’s accompaniment experiments all let a sketch become a finished track, or a flawed verse be rewritten without touching the rest.

These are not separate tools with separate implications. They share a structural effect: every fixed aesthetic choice in a recording (the timbre, the vocal identity, the arrangement, the sectional structure) has become something that can be unbundled, transferred, regenerated, or extended by a system with no connection to the original maker. What was settled in a recording is now editable. What was specific to a performer is now transferable. What was a finished work is now a starting point.

The economic infrastructure of music presupposes that none of this is true.


By 2025 the legal debate had consolidated around a single question: was training on copyrighted music fair use, or did it require a license?

That framing has started to show its limits. Courts have reached for familiar doctrines (reproduction, fixation, substantial similarity) and found them ill-fitted to a process that does not produce the kind of copy those doctrines were designed to regulate. Training involves technical copying, but the copies are transient, not independently usable, and not economically substitutive. The value lies in the statistical aggregation those copies make possible, not in the copies themselves.

Positions hardened anyway. AI developers argued for minimal friction, citing innovation and geopolitical competition. Rightsholders argued for consent and compensation, citing the unambiguous fact of ingestion. Policymakers tried to split the difference: Japan blessed training on protected works by exception; the EU AI Act moved toward transparency requirements and an opt-out architecture; the U.S. left the question to the courts.

The structural problem is that all of these positions still treat the issue as a copyright problem. Copyright is a law of copying. Training a generative model is not, at its economic core, a copying act — it is an act of learning, and the law has no dedicated category for that. Treating training as reproduction stretches the instrument past its design point. Treating it as fair use leaves every structural question — attribution, compensation, participation — unanswered.

The industry is building the ecosystem for the next fifty years of music on top of a doctrinal gap.


What a Working System Would Have to Do

Any system adequate to the moment has to do three things the current one does not.

It has to allocate value at the point of contribution, not at the point of output. Output-based royalty models presume identifiable works, traceable uses, and clean attribution chains. Generative systems do not produce outputs that can be traced cleanly to inputs; the economic weight comes from the distributed statistical influence of the whole corpus, not from any single identifiable ingredient. A working system compensates for contribution, not for imitation.

It has to reward diversity structurally. Models trained on homogeneous data collapse stylistically, and the commercial pressure to train on whatever is cheapest to acquire produces exactly that collapse. A system that weights contributions according to what they add to the corpus (especially where the corpus is currently thin) creates the economic conditions for stylistic and cultural variety to be maintained rather than bled out.
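To make the first two requirements concrete, here is a minimal sketch of a payout split that pays for contribution and upweights thin parts of the corpus. It is illustrative only and does not describe any existing protocol. The `Contribution` record, the coarse `style` label, the use of minutes of audio as the unit of contribution, and the `alpha` damping parameter are all assumptions introduced for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    contributor: str  # hypothetical contributor ID
    style: str        # hypothetical coarse style label
    minutes: float    # assumed unit of contribution: minutes of audio


def allocate(pool: float, contributions: list[Contribution],
             alpha: float = 0.5) -> dict[str, float]:
    """Split `pool` by contribution, upweighting styles that are scarce.

    alpha = 0 gives a plain pro rata split by minutes contributed;
    alpha = 1 gives every style an equal share of the pool.
    """
    valid = [c for c in contributions if c.minutes > 0]
    if not valid:
        raise ValueError("no positive contributions to allocate against")

    # How much of the corpus each style currently accounts for.
    style_minutes: dict[str, float] = defaultdict(float)
    for c in valid:
        style_minutes[c.style] += c.minutes
    total = sum(style_minutes.values())

    # Score = amount contributed, scaled up when the style is thin.
    scores: dict[str, float] = defaultdict(float)
    for c in valid:
        scarcity = (total / style_minutes[c.style]) ** alpha
        scores[c.contributor] += c.minutes * scarcity

    score_sum = sum(scores.values())
    return {who: pool * s / score_sum for who, s in scores.items()}


# Hypothetical two-contributor corpus: 80 minutes of pop, 20 of folk.
corpus = [
    Contribution("artist_a", "pop", 80.0),
    Contribution("artist_b", "folk", 20.0),
]
print(allocate(100.0, corpus, alpha=0.0))
print(allocate(100.0, corpus, alpha=1.0))
```

In this toy corpus, `alpha=0` splits the pool 80/20 in proportion to what was contributed, while `alpha=1` splits it 50/50 because each style receives an equal share. Where to set the dial between those poles is a policy choice, not a technical one. Nothing in the calculation depends on tracing any generated output back to a specific input.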

It has to make participation auditable. The opacity of current training pipelines is designed in: contributors cannot see what was ingested, what was produced, or how value flowed. An adequate system logs contributions, records uses, and lets contributors verify what happened to their work without asking permission to see it.
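One common way to make a record of this kind checkable is an append-only log in which each entry commits to the hash of the one before it. The sketch below assumes that approach; it is not a description of how any current training pipeline works. The `AuditLog` class, its methods, and the event fields (`contributor`, `work`, `action`) are hypothetical names chosen for illustration.

```python
import hashlib
import json


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only log; each entry is chained to the hash of the previous one."""

    GENESIS = "0" * 64  # placeholder hash that precedes the first entry

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, event: dict) -> str:
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        entry_hash = _digest(prev, event)
        self.entries.append({"event": event, "prev": prev, "hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        # Recompute the chain from the start; any altered entry breaks it.
        prev = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev:
                return False
            if entry["hash"] != _digest(prev, entry["event"]):
                return False
            prev = entry["hash"]
        return True

    def events_for(self, contributor: str) -> list[dict]:
        # What a contributor would look up: every event touching their work.
        return [e["event"] for e in self.entries
                if e["event"].get("contributor") == contributor]


log = AuditLog()
log.append({"contributor": "artist_a", "work": "demo-01", "action": "ingested"})
log.append({"contributor": "artist_a", "work": "demo-01", "action": "payout"})
print(log.verify())
print(len(log.events_for("artist_a")))
```

A chain like this only makes tampering detectable. It is useful to contributors if the latest hash is published somewhere they can check independently, and if read access to the log does not itself depend on the operator's goodwill.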

None of these are technical impossibilities. They are design decisions the industry has not made, because the pressure to move fast has been stronger than the pressure to build well.


The Chance to Rebuild

There is a version of the next decade in which the question of how AI relates to music is settled by default. The legal framework that happens to cohere first becomes the framework everyone builds on, probably some variant of “fair use for training, output-level claims for similarity.” The companies that arrived earliest keep the infrastructure. Music, inside AI systems, becomes a commodity input with no participation rights and no visible contribution chain.

There is another version in which the question is settled by design. Artists, labels, legal scholars, and engineers — the people who would actually have to live inside whatever system emerges — get involved before the defaults harden. The infrastructure gets built with consent, traceability, and participation as starting points rather than compliance afterthoughts.

CORPUS is one attempt to operate in that second version: a licensing and compensation protocol designed from the ground up for training rather than distribution, with value allocated at contribution and diversity weighted explicitly. It is one concrete effort to show the answer can be built.

The flood is here. Whether what follows is rebuilding or resignation depends on whether the people with something to lose show up in time.