Architecture and requirements

Index to Écluse's systems design: the vision, how a request flows, and what is out of scope. Each concern's detailed design lives under architecture/. Development practices, layout, testing, and CI live in ../CONTRIBUTING.md; the why is in ../MOTIVATION.md. This document and its links are the how.

These documents describe the target design, not necessarily the current code. Implementation tracks toward them; check git and the planning/ DAG for what has shipped.

Vision

Supply-chain attacks through malicious or hijacked publications are a growing threat in high-volume ecosystems like npm. Écluse (package ecluse) is a lightweight proxy between consumers (developers, CI) and the upstream registry that applies a configurable policy before any package reaches a build, without hosting packages itself.

The name is French for a canal lock: the controlled passage every dependency clears before it reaches a build. The goal is resilience (limiting the blast radius of a bad publish), not malware detection.

Écluse is not a registry. It delegates storage to the operator's backend (e.g. AWS CodeArtifact or GCP Artifact Registry) and enforces policy on what may be fetched and mirrored from the public registry.

Codebase decomposition

Écluse builds as two libraries behind one ecluse.cabal, splitting the pure capability core from the composition shell.

The build graph enforces the boundary: the core's unit suite does not depend on the application library, so a core module reaching into composition fails to compile. ecluse.cabal is the authoritative component and module map.

Request lifecycle

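The three request shapes use the upstreams differently: a tarball request falls back from the private upstream to the rule-gated public registry, a packument request merges the two, and a publish writes through to the publication target.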
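As a rough illustration of those three shapes, the branches of the flowchart can be modelled as small decision functions. This is a minimal sketch, not Écluse's code: every type and function name below (`RequestShape`, `tarballAction`, `publishAction`, and so on) is hypothetical, the publication target is reduced to a plain `String`, and the allow-list to a predicate.

```haskell
-- Illustrative sketch; every name here is hypothetical, not Écluse's API.

-- | The three request shapes the proxy distinguishes.
data RequestShape = Packument | Tarball | Publish
  deriving (Eq, Show)

-- | How each shape uses the upstreams.
data UpstreamUse = FallBack | Merge | WriteThrough
  deriving (Eq, Show)

upstreamUse :: RequestShape -> UpstreamUse
upstreamUse Tarball   = FallBack      -- private first, then the gated public registry
upstreamUse Packument = Merge         -- private and gated public metadata combined
upstreamUse Publish   = WriteThrough  -- forwarded to the publication target

-- | Outcome of evaluating the rules for one public version.
data Verdict = Admitted | Denied | Unavailable
  deriving (Eq, Show)

data TarballAction
  = StreamFromPrivate          -- private hit: served unfiltered
  | StreamFromPublicAndMirror  -- admitted: served now, mirror job enqueued
  | Refuse Verdict             -- denied or unavailable: an error response
  deriving (Eq, Show)

-- | Tarball fallback. Rule evaluation is an action that runs only on a
-- private miss, so a private hit never touches the public registry.
tarballAction :: Monad m => m Bool -> m Verdict -> m TarballAction
tarballAction privateHit evaluateRules = do
  hit <- privateHit
  if hit
    then pure StreamFromPrivate
    else do
      verdict <- evaluateRules
      pure $ case verdict of
        Admitted -> StreamFromPublicAndMirror
        other    -> Refuse other

data PublishAction
  = MethodNotAllowed  -- no publication target configured (405)
  | RejectOutOfScope  -- fails the publish-scope allow-list; nothing is written
  | WriteTo String    -- forward the write to the named target
  deriving (Eq, Show)

-- | Publish gate: configured target first, then the scope allow-list.
publishAction :: Maybe String -> (String -> Bool) -> String -> PublishAction
publishAction Nothing _ _ = MethodNotAllowed
publishAction (Just target) inScope packageName
  | inScope packageName = WriteTo target
  | otherwise           = RejectOutOfScope
```

Passing the rule evaluation as an action rather than a precomputed value keeps the ordering explicit: the public metadata fetch only happens after the private upstream has missed. How a `Refuse` maps onto a specific status code is left out on purpose, since that belongs to the rules-engine design.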

```mermaid
flowchart TD
    C(["Client request"]) --> K{"packument, tarball, or publish?"}

    K -->|"tarball"| T1["Fetch from private upstream"]
    T1 -->|"2xx hit"| TSV(["Stream unfiltered. Done."])
    T1 -->|"miss"| T2["Fetch version metadata from public<br/>+ evaluate rules (deny by default)"]
    T2 -->|"Denied / Unavailable"| TD(["403 / 503 / 500. Done."])
    T2 -->|"Admitted"| T3["Stream from public + enqueue mirror job<br/>(non-blocking)"]
    T3 --> TSV2(["Serve immediately. Done."])

    K -->|"packument"| P1["Fetch private + public in parallel"]
    P1 --> P2["Trust private versions;<br/>gate public versions (rules, deny by default)"]
    P2 --> P3["Merge (private wins; flag divergence),<br/>filter, repoint latest"]
    P3 -->|"survivors"| PSV(["Serve merged packument. Done."])
    P3 -->|"none survive"| PD(["403 / 503. Done."])

    K -->|"publish (PUT)"| W1{"ECLUSE_MOUNTS__NPM__PUBLICATION_TARGET set?"}
    W1 -->|"no"| W405(["405 Method Not Allowed. Done."])
    W1 -->|"yes"| W2["Enforce publish-scope allow-list<br/>(anti-shadowing)"]
    W2 -->|"out of scope"| WR(["4xx, no upstream write. Done."])
    W2 -->|"in scope"| W3["Write to publication target<br/>(client token forwarded)"]
    W3 --> WSV(["npm success. Done."])
```
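
The packument branch is the least obvious of the three, so here is a minimal sketch of its merge step under stated assumptions. The names (`mergePackument`, `Merged`, `Source`) are hypothetical. Versions are simplified to lists of integers such as `[1,2,0]`, which ignores semver pre-release tags. "Divergence" is assumed to mean a version present in both upstreams with a different digest. "Repoint latest" is assumed to mean falling back to the highest surviving version when the upstream tag's target was filtered out. The actual design may define any of these differently.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

-- Hypothetical, simplified types for illustration only.
type Version = [Int]   -- e.g. [1,2,0]; no pre-release handling
type Digest  = String  -- stand-in for a tarball integrity hash

data Source = Private | Public
  deriving (Eq, Show)

data Merged = Merged
  { versions  :: Map Version (Source, Digest)  -- surviving versions
  , divergent :: [Version]                     -- in both upstreams, digests differ
  , latest    :: Maybe Version                 -- repointed "latest" tag
  } deriving (Eq, Show)

-- | Merge a private and a public version map. Private versions are trusted;
-- public versions must pass the gate. Returns Nothing when nothing survives.
mergePackument
  :: (Version -> Bool)   -- gate: do the rules admit this public version?
  -> Map Version Digest  -- private versions
  -> Map Version Digest  -- public versions
  -> Maybe Version       -- the upstream "latest" tag, if any
  -> Maybe Merged
mergePackument admit private public upstreamLatest
  | Map.null merged = Nothing
  | otherwise = Just Merged
      { versions  = merged
      , divergent = diverged
      , latest    = newLatest
      }
  where
    admitted = Map.filterWithKey (\v _ -> admit v) public
    -- Map.union is left-biased, so a private entry shadows a public one.
    merged = Map.union (fmap ((,) Private) private) (fmap ((,) Public) admitted)
    -- Versions both upstreams know about but disagree on.
    diverged = Map.keys (Map.filter id (Map.intersectionWith (/=) private public))
    -- Keep the upstream tag if its target survived, else the highest survivor.
    newLatest = case upstreamLatest of
      Just v | Map.member v merged -> Just v
      _ -> fst <$> Map.lookupMax merged
```

For example, with a private `[1,0,0]` and public `[1,0,0]`, `[1,1,0]`, and `[2,0,0]`, a gate that rejects `[2,0,0]` leaves two versions: the private copy of `[1,0,0]` (flagged as divergent if the digests differ) and the public `[1,1,0]`, which becomes the new `latest`.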

Document map

| Document | Covers |
| --- | --- |
| Diagrams | Mermaid visual companion: system overview, packument / tarball / worker sequences, rules and credential lifecycles. |
| Registry Model | The four registry roles (two reads, two writes) and the RegistryClient handle. |
| Internal Domain Model | PackageDetails and the ecosystem-agnostic signals the rules engine consumes. |
| Web Layer | Raw-WAI front door: routing and multi-ecosystem mounts, the control/data-plane split, streaming, middleware, and graceful shutdown. |
| API Surface & Capability Manifest | The OpenAPI capability manifest and the synthesised-packument schema. |
| Rules Engine & Responses | Deny-by-default evaluation, the rule tiers, the CVE subsystem, and denial responses. |
| Cloud Backends & Mirroring | The mirror queue and the two cloud handles (MirrorQueue, CredentialProvider); AWS and GCP. |
| Configuration & Authentication | Environment config, outbound registry credentials, and inbound client auth. |
| Access & Credential Model | The per-mount credential strategy (passthrough / service), edge auth, and the no-private-cache posture. |
| Security Invariants | Outbound-request and input-validation defences, canonicalisation, the host allowlist, internal-range blocking, response bounds. |
| Threat Model | The STRIDE register, generated from the Threat Dragon model (threat-modelling/ecluse.json); the single source of truth for the system's threats. |
| Observability | Opt-in OpenTelemetry/OTLP tracing and metrics; Datadog optional. |
| Technology Stack | Library choices and the key cross-cutting decisions. |
| Release & Supply-Chain Operations | The reproducible OCI image, the publish/attest chain (provenance + SBOM), Docker Hub tokens, and CVE and freshness scanning. |

Out of scope (for now)