Glossary

Action Hub

The inbox in the contributor app at app.corpus.music where every flag, agreement confirmation, and admin notification surfaces. The point: nothing about a contribution happens silently. See Your Dashboard.

Annotation pipeline

The proprietary system that generates the structured descriptions every track in CORPUS carries — instrumentation, tempo, key, mood, vocal treatment, narrative axes. Layer 3 of the three-layer architecture. The annotation layer is what makes the corpus operational for both model training and search. See The Semantic Layer.

C2PA (Content Credentials)

An open standard for attaching tamper-evident provenance metadata to media files. CORPUS plans to embed C2PA credentials in model outputs so downstream verifiers can read the provenance directly from the artefact. See Audit Trail.

Catalog Intelligence

The CORPUS B2B offering: the annotation and search stack applied to a partner's own catalogue, delivered as a per-track setup fee plus a recurring search-API tariff. Aimed at catalogue owners (publishers, libraries, sync agencies) and at platforms that embed music. See Catalog Intelligence.

CMO (Collective Management Organization)

Organisations such as GEMA, SACEM, or PRS that license copyrighted works at scale for public performance, broadcasting, and mechanical reproduction. CMO frameworks were not designed for AI training, and CMO strategies on AI training diverge sharply across jurisdictions. See DSM Directive and the TDM opt-out.

Commercial Entity

In the three-entity structure under design, the company that develops and operates the AI technology, the semantic pipeline, and the commercial products. External investment sits here; the Foundation's Golden Share prevents commercial pressure from overriding contributor protections. See The Foundation.

Confidential computing

A class of secure-compute technologies (for example NVIDIA secure enclaves) that prevent training data from being inspected or exfiltrated even during active computation. CORPUS plans to integrate confidential computing into its training stack. See How Contributions Are Protected.

CRPS (Corpus Participation Rights)

A lasting stake in the CORPUS system, accumulated alongside royalty points. Royalties reflect current income; CRPS represent the historical fact of having helped build the corpus. Once issued, they do not expire. The legal form is still being evaluated. See CRPS.

DDEX

An industry standard for exchanging music metadata between distributors, labels, and rights organisations. CORPUS can import contributor metadata via DDEX when integrating with existing distribution flows. See From Platform to Protocol.

Diversity bonus

Additional scoring weight for contributions that expand the corpus into underrepresented areas, assessed relationally — against the existing library at the moment of ingest. Protected for five years, then subject to gradual temporal decay toward a permanent floor of 30%. See Temporal Dynamics.

DSM Directive

The EU Directive on Copyright in the Digital Single Market (Directive (EU) 2019/790). Article 4 allows commercial text- and data-mining only where rights holders have not exercised the TDM opt-out. See DSM Directive and the TDM opt-out.

Dual-track governance

A separation under design between an executive track (operational decisions in days) and a constitutional track (parameter changes deliberated over months). The boundary is intended to be enforced technically, not just contractually. See Dual-Track Governance.

EU AI Act

European regulation (Regulation (EU) 2024/1689) governing AI systems and general-purpose AI models. Article 53 requires GPAI providers to publish a sufficiently detailed summary of training content and to maintain a copyright policy. See EU AI Act.

Fair Use

A US legal doctrine permitting limited use of copyrighted material without permission. Its application to AI training is legally contested; Andy Warhol Foundation v. Goldsmith (2023) and related cases have tightened the four-factor test. Relying on Fair Use for training data is a bet that case law will reverse direction.

Federated learning

A training approach where data never leaves the data holder's infrastructure. Licensees submit model architectures to a secure environment; only the resulting model weights are exported. CORPUS's default at launch is to license CORPUS-trained models; federated training is planned as an option for partners with their own architectures and is still under development. See Access Models.

Generative Output Bonus

A defined share of the revenue of a generated output, distributed to contributors whose works occupy the sonic neighbourhood of that output — weighted by similarity and by originality within the corpus. An allocation rule based on measured statistical proximity: the payout follows from the contributor agreement and implies no claim that the output copies or derives from anyone's work. See How Royalties Flow.
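
A minimal sketch of how such an allocation rule could work, offered only as an illustration. The entry says the payout is weighted by similarity and by originality but does not say how the two are combined; multiplying them is an assumption made here, and the function name, the input shape, and the 0-to-1 score ranges are all hypothetical.

```python
def allocate_output_bonus(
    pool: float,
    neighbours: dict[str, tuple[float, float]],
) -> dict[str, float]:
    """Split `pool` across contributors in the output's sonic neighbourhood.

    `neighbours` maps a contributor id to (similarity, originality),
    both assumed to lie in [0, 1]. The weight is their product, which
    is an assumption of this sketch, not a documented CORPUS formula.
    """
    weights = {name: sim * orig for name, (sim, orig) in neighbours.items()}
    total = sum(weights.values())
    if total == 0:
        # No measurable proximity: nothing is allocated.
        return {name: 0.0 for name in neighbours}
    # Normalise so the shares always sum to the pool.
    return {name: pool * w / total for name, w in weights.items()}


shares = allocate_output_bonus(100.0, {"a": (0.9, 0.5), "b": (0.6, 0.25)})
```

In this example contributor "a" has three times the weight of "b", so the pool splits roughly 75 to 25. The split depends only on measured proximity, which matches the entry's point that a payout implies no claim of copying.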

Golden Share

A veto right held by the Foundation over strategic decisions in the commercial entity — sale of the corpus, change of fundamental licensing terms, redirection of the royalty pool. The mechanism by which contributor protections survive commercial pressure. See The Foundation.

Input attribution

Valuing contributions at the point of entry into the training pipeline — based on how they enrich the dataset — rather than tracing back from generated outputs. CORPUS's approach. See The Input-Side Shift.

IP Entity

In the three-entity structure under design, the legal entity owned by the Foundation that holds all music contributions, manages the dataset, and administers scoring and CRPS. Contributors' legal relationship is with this entity. See The Foundation.

IPI Name Number

The Interested Parties Information identifier issued by CMOs to rights holders. Contributors who belong to a CMO are asked for their IPI number during the contributor application.

ISCC (International Standard Content Code)

An open-standard identifier derived from the content of a media file itself, enabling cross-platform identification and provenance tracking. CORPUS plans to embed ISCC codes in model outputs alongside C2PA credentials. See Audit Trail.

KYC

Know-Your-Customer procedures contributors complete before any royalty payment is disbursed — standard practice for any system that pays money to identified individuals. See How Royalties Flow.

MIR (Music Information Retrieval)

Automated analysis techniques for evaluating musical properties — spectral balance, dynamic range, tempo, key, harmonic content. Used in the planned production quality assessment.

Output attribution

Attempting to trace AI-generated output back to specific training data. Current research shows this is not reliably possible for generative models; the influence of any individual training work on a given output is mathematically indistinguishable from noise. This is why CORPUS uses input attribution instead.

Personality rights

Rights protecting an individual's voice, likeness, and persona — distinct from copyright in the underlying composition or recording. For vocal tracks, personality-rights consent from the named singer is required at upload. CORPUS makes no exceptions. See Personality rights and vocal performance.

The CORPUS search mode for queries with explicit sonic constraints — genres, moods, instruments, key, vocal treatment. One of three search modes on the semantic pipeline.

Provenance hash

A cryptographic fingerprint that links a contribution to its source, enabling verification across the pipeline from upload through training to deployed model. See Audit Trail.
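
As a rough illustration of the idea, the sketch below computes a SHA-256 fingerprint over a contributor identifier and the uploaded bytes, then re-checks it later. The choice of SHA-256, the field layout, and the function names are assumptions for this example; the entry does not specify how CORPUS constructs its hashes.

```python
import hashlib
import hmac


def provenance_hash(audio_bytes: bytes, contributor_id: str) -> str:
    """Return a hex fingerprint binding the audio to its contributor.

    Hypothetical layout: contributor id, a zero-byte separator, then
    the raw audio. The separator keeps the two fields unambiguous.
    """
    digest = hashlib.sha256()
    digest.update(contributor_id.encode("utf-8"))
    digest.update(b"\x00")
    digest.update(audio_bytes)
    return digest.hexdigest()


def verify(audio_bytes: bytes, contributor_id: str, expected: str) -> bool:
    """Recompute the fingerprint and compare it in constant time."""
    actual = provenance_hash(audio_bytes, contributor_id)
    return hmac.compare_digest(actual, expected)


recorded = provenance_hash(b"example audio bytes", "contributor-123")
still_intact = verify(b"example audio bytes", "contributor-123", recorded)
```

Any later stage of the pipeline that holds the same bytes and identifier can recompute the value; a changed byte in either input produces a different fingerprint.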

Scoring jury

A contributor panel selected by stratified sortition to review and calibrate the scoring system. The jury recommends recalibration of scoring dimensions or weights; it sets constraints but does not modify the algorithm directly. See The Scoring Jury.

Semantic description

The 300-to-500-word structured description CORPUS writes per track across four dimensions: what the music is, what it does, where it belongs, and how it would work as potential film music. The methodology behind the annotation pipeline; the alternative to compressing a track into categorical tags. See The Semantic Layer.

Semantic layer

CORPUS's proprietary annotation system — the layer that produces structured descriptions of each track and powers the three search modes. Layer 3 of the three-layer architecture. See The Semantic Layer.

Sortition

Selection by stratified lottery rather than election. Used for the scoring jury because elections in networks with power-law participation reproduce that topology; lottery breaks it. See The Scoring Jury.
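
A minimal sketch of a stratified lottery, assuming each contributor carries a single stratum label and each stratum receives the same number of seats. The strata, the seat count, and the function name are hypothetical; the entry does not describe how CORPUS defines strata or sizes the jury.

```python
import random
from collections import defaultdict


def stratified_sortition(
    members: dict[str, str],
    seats_per_stratum: int,
    seed: int,
) -> list[str]:
    """Draw up to `seats_per_stratum` members at random from each stratum.

    `members` maps a contributor id to a stratum label. A fixed seed
    makes the draw reproducible so it can be audited afterwards.
    """
    rng = random.Random(seed)
    by_stratum: dict[str, list[str]] = defaultdict(list)
    for member_id, stratum in members.items():
        by_stratum[stratum].append(member_id)

    selected: list[str] = []
    for stratum in sorted(by_stratum):
        pool = sorted(by_stratum[stratum])  # stable order before sampling
        seats = min(seats_per_stratum, len(pool))
        selected.extend(rng.sample(pool, seats))
    return selected


members = {
    "c1": "jazz", "c2": "jazz", "c3": "jazz",
    "c4": "folk", "c5": "folk",
    "c6": "ambient",
}
panel = stratified_sortition(members, seats_per_stratum=2, seed=7)
```

Because seats are drawn per stratum, a small group such as "ambient" is represented regardless of how many members the larger groups have, which is the property an election over the whole network would not guarantee.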

Stiftung (Corpus Foundation)

The dedicated foundation under design that would own the IP Entity, hold a Golden Share in the Commercial Entity, and have no shareholders. Its governing council (Stiftungsrat) would include representatives elected by the contributor community. See The Foundation.

Story2Music

The narrative search mode at intelligence.corpus.music: the user describes a scene, the system returns three dramaturgically distinct musical directions, each with retrieval from the corpus underneath. See Three Modes of Search.

TDM opt-out

The reservation mechanism under Article 4 of the DSM Directive by which rights holders can exclude their works from commercial text- and data-mining. CORPUS sidesteps the opt-out question by licensing through explicit, AI-specific contributor opt-in. See DSM Directive and the TDM opt-out.

Temporal decay

The gradual reduction of the diversity bonus component of a contribution's score after a five-year protection period. Decays asymptotically toward a permanent floor of 30% — never zero. Only the diversity component decays; quantity and quality do not. See Temporal Dynamics.
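
The shape of the curve can be sketched as follows. The five-year protection period and the 30% floor come from this entry; the exponential form and the three-year half-life are assumptions chosen only to illustrate an asymptotic decay, since the entry states no decay rate.

```python
PROTECTION_YEARS = 5.0  # from the entry: five-year protection period
FLOOR = 0.30            # from the entry: permanent floor of 30%
HALF_LIFE_YEARS = 3.0   # hypothetical: no decay rate is given in the entry


def diversity_multiplier(age_years: float) -> float:
    """Fraction of the original diversity bonus still credited at a given age."""
    if age_years <= PROTECTION_YEARS:
        return 1.0  # fully protected
    elapsed = age_years - PROTECTION_YEARS
    # Only the part above the floor decays; it halves every HALF_LIFE_YEARS.
    decaying_part = (1.0 - FLOOR) * 0.5 ** (elapsed / HALF_LIFE_YEARS)
    return FLOOR + decaying_part


for age in (3, 5, 8, 11, 30):
    print(age, round(diversity_multiplier(age), 3))
```

Under these assumptions the multiplier stays at 1.0 through year five, then approaches 0.30 from above without reaching it. Quantity and quality components would not pass through this function at all.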

Three-layer architecture

The structural separation behind CORPUS: an open protocol (Layer 1, trust through transparency), controlled data access through federated learning (Layer 2, protection through enforcement), and the proprietary semantic pipeline (Layer 3, innovation through incentive). See The Three-Layer Architecture.

Visitor vs Contributor

The two-tier onboarding in the contributor app. Visitors create an account, upload tracks, and see them run through the annotation pipeline, but their uploads do not enter the library. Contributors are vetted accounts whose uploads enter the library and become eligible for scoring, royalties, and CRPS. See How to Join.