How Contributions Are Protected
CORPUS is being built in the open. Some of what you read here is live, some is still design intent — expect it to evolve.
Contributions enter CORPUS through encrypted upload channels and are stored on self-administered servers in Germany, under EU data protection law. The infrastructure is built to avoid the two failure modes that have defined most music-AI data handling so far: opaque cloud hyperscaler dependencies, and training environments in which the data is visible to the provider.
Encrypted uploads, EU-jurisdiction storage
Every file is encrypted in transit. Storage runs on self-administered servers in Germany rather than on opaque cloud hyperscalers, so the governance chain — who can read what, under which law — stays explicit.
As the corpus grows, the plan is for this to evolve into a federated global server network, with nodes in strategic locations that each operate under the same standards. Storage, training, and audit records remain within EU jurisdiction unless an engagement specifies otherwise.
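The jurisdiction rule above (EU by default, exceptions only where an engagement says so) can be read as a small placement check. The following is a minimal sketch, not CORPUS's actual tooling: the region codes, the `Engagement` type, and the function `is_placement_allowed` are all hypothetical names chosen for illustration, and physical node location is used as a simplified stand-in for legal jurisdiction.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Assumption: example region codes standing in for nodes under EU jurisdiction.
EU_REGIONS: FrozenSet[str] = frozenset({"de", "fr", "nl"})

# The three record types the text names as EU-bound by default.
ACTIVITIES: FrozenSet[str] = frozenset({"storage", "training", "audit"})


@dataclass(frozen=True)
class Engagement:
    """Hypothetical engagement record; exceptions are (activity, region) pairs."""
    name: str
    allowed_exceptions: FrozenSet[Tuple[str, str]] = frozenset()


def is_placement_allowed(activity: str, region: str, engagement: Engagement) -> bool:
    """Return True if `activity` may run in `region` under this engagement."""
    if activity not in ACTIVITIES:
        raise ValueError(f"unknown activity: {activity!r}")
    if region in EU_REGIONS:
        return True  # default path: everything stays in the EU
    # Outside the EU only when the engagement explicitly specifies it.
    return (activity, region) in engagement.allowed_exceptions


default = Engagement("default")
custom = Engagement("custom", frozenset({("training", "us")}))
```

Under these assumptions, `is_placement_allowed("training", "us", default)` is refused, while the same call with `custom` is permitted. The exception is scoped per activity, so `custom` still keeps storage and audit records in the EU.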
Federated learning keeps the data inside CORPUS
Raw audio cannot be recalled once released. At launch, CORPUS therefore does not ship datasets to licensees: the default path is to license CORPUS-trained models. Where partners genuinely need to train their own architectures, the planned federated route would keep the data inside CORPUS infrastructure: the licensee would submit a model architecture, training would run on our servers, and only the resulting model weights would leave. The data stays controlled technically, not just contractually. This route is under development and will be piloted with initial industry partners before becoming a standard path; see What Remains Open.
This is Layer 2 of the protocol; the architectural rationale sits in The Three-Layer Architecture, and the licensee-facing mechanics are described in Access Models.
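The data flow of the planned federated route (architecture in, weights out, data never leaving) can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the CORPUS implementation: `ArchitectureSubmission`, `TrainingResult`, and `CorpusTrainingHost` are hypothetical names, and the "model" is a single number rather than a neural network.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Weights = List[float]


@dataclass(frozen=True)
class ArchitectureSubmission:
    """Hypothetical: what a licensee sends in. A model definition, no data request."""
    licensee_id: str
    init_weights: Callable[[], Weights]
    train_step: Callable[[Weights, Sequence[float]], Weights]


@dataclass(frozen=True)
class TrainingResult:
    """Hypothetical: the only artifact returned to the licensee."""
    licensee_id: str
    weights: Weights


class CorpusTrainingHost:
    """Hypothetical host that owns the dataset and runs submitted training code."""

    def __init__(self, private_dataset: Sequence[Sequence[float]]):
        # The dataset is held by the host and is never part of any return value.
        self._dataset = [list(example) for example in private_dataset]

    def run(self, submission: ArchitectureSubmission, epochs: int = 1) -> TrainingResult:
        weights = submission.init_weights()
        for _ in range(epochs):
            for example in self._dataset:
                weights = submission.train_step(weights, example)
        # Only the weights cross the boundary back to the licensee.
        return TrainingResult(submission.licensee_id, list(weights))


# Toy licensee "architecture": one weight nudged toward each example's mean.
def init() -> Weights:
    return [0.0]


def step(weights: Weights, example: Sequence[float]) -> Weights:
    target = sum(example) / len(example)
    return [weights[0] + 0.5 * (target - weights[0])]


host = CorpusTrainingHost([[1.0, 1.0], [3.0, 3.0]])
result = host.run(ArchitectureSubmission("licensee-a", init, step))
```

The sketch only shows the shape of the exchange. A submitted training step can see the examples it processes, so a real deployment would also need to sandbox the submitted code; that part is not modelled here.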
Confidential computing for training
For training runs, CORPUS plans to integrate confidential computing technologies such as NVIDIA's secure enclaves. These are designed to ensure that, even during active computation, training data cannot be inspected or exfiltrated by the infrastructure provider, closing one of the most sensitive attack vectors in AI workflows.
Security framework alignment
The infrastructure is aligned with recognised security frameworks:
- ISO 27001 for risk and information security management.
- SOC 2 for audited controls on secure operations.
Formal certification will follow at commercial scale. Aligning from the outset is intended to keep the infrastructure compatible with the procurement requirements of partners in regulated industries and with provisions under the EU AI Act.
For the regulatory framing — EU AI Act, DSM Directive, personality rights — see Compliance and Regulation.