| Safe Haskell | None |
|---|---|
| Language | GHC2021 |
Ecluse.Proxy
Description
Écluse (package ecluse) sits between consumers (developers, CI) and a package
registry, applying a configurable resilience policy before any dependency reaches
a build, without hosting packages itself. The name is French for a canal lock: a
chamber whose gates never open at once. Every dependency is held and cleared
through that controlled passage before it is admitted to a build.
The goal is resilience, not malware detection: shrink the blast radius of a bad publish (a hijacked maintainer account, a race-to-publish, a typosquat) rather than promise to recognise malice. Écluse is not a registry: storage is delegated to whatever backend the operator runs (AWS CodeArtifact, GCP Artifact Registry), and Écluse only governs what may be fetched from, and mirrored to, those backends. npm is the first ecosystem; the domain model is ecosystem-agnostic so PyPI and RubyGems can follow.
How a request is cleared
Écluse speaks a registry's native protocol across three read-path registries (the client's, a private upstream of already-vetted packages, and the public registry), and the two request shapes use them differently:
- A tarball request is gated for that one version: a private-upstream hit is streamed unfiltered (already vetted); on a miss, the proxy fetches the version's public metadata, evaluates the rules, and either streams it from public and enqueues an asynchronous mirror job or returns a denial.
- A packument (metadata) request is a merge: the private and public
upstreams are fetched in parallel, public versions are filtered by the rules
while private versions are trusted, and the two are combined into one document
(private wins a version collision, an integrity divergence is flagged as a
supply-chain signal, and latest is repointed to the newest survivor).
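The packument merge can be sketched in a few lines. This is a simplified illustration, not the library's code: the names (Versions, Merged, mergePackument) are hypothetical, a version is modelled as a list of numeric components so that the derived ordering stands in for the per-ecosystem ordering, and divergence is checked against the unfiltered public versions, which is an assumption.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical simplified packument: version -> integrity digest.
-- A version is a list of numeric components, so the derived Ord is a
-- stand-in for real per-ecosystem version ordering.
type Version = [Int]
type Integrity = String
type Versions = Map.Map Version Integrity

data Merged = Merged
  { mergedVersions :: Versions
  , divergences :: [Version] -- same version on both sides, different digest
  , latest :: Maybe Version  -- newest version that survived the merge
  }
  deriving (Eq, Show)

-- Filter the public versions through the rules, trust the private ones,
-- and combine the two into one document.
mergePackument :: (Version -> Integrity -> Bool) -> Versions -> Versions -> Merged
mergePackument allowed private public =
  let survivors = Map.filterWithKey allowed public
      -- Map.union is left-biased, so private wins a version collision.
      merged = Map.union private survivors
      -- Assumption: divergence is checked against the unfiltered public set.
      diverging = Map.keys (Map.filter id (Map.intersectionWith (/=) private public))
   in Merged merged diverging (fst <$> Map.lookupMax merged)
```

With a private 1.0.0 whose digest differs from the public 1.0.0 and a rule that rejects 2.0.0, the merge keeps the private 1.0.0, reports 1.0.0 as diverging, and repoints latest to the newest admitted version.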
Two properties run through both shapes: the rules engine is deny by default (a version is admitted only if some rule allows it and none denies it), and mirroring is demand-driven, so only versions actually pulled are mirrored, never on the request's critical path.
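As a minimal sketch of the deny-by-default property, assuming each rule votes to allow, deny, or abstain (the Verdict and Rule types here are hypothetical and are not the types in Ecluse.Core.Rules.Types):

```haskell
-- Hypothetical types for illustration; not the Ecluse.Core.Rules API.
data Verdict = Allow | Deny | Abstain
  deriving (Eq, Show)

-- A rule inspects a candidate version and votes, or abstains.
type Rule v = v -> Verdict

-- Deny by default: a version is admitted only if at least one rule
-- allows it and no rule denies it. An empty rule set admits nothing.
admitted :: [Rule v] -> v -> Bool
admitted rules v =
  let verdicts = map ($ v) rules
   in Allow `elem` verdicts && Deny `notElem` verdicts
```

Under this model a single Deny overrides any number of Allow votes, and a rule set in which every rule abstains admits nothing.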
How the code is organised
Écluse is a functional core with effects at the edges: the policy and
protocol logic is pure and trivially testable, and IO is confined to a thin
shell. Swappable backends sit behind handles (records of functions chosen at a
single composition root), so a new cloud or a new ecosystem is an added
implementation behind an existing handle, not a structural change.
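The handle pattern can be illustrated with a toy queue. The record and function names below (QueueHandle, newMemoryQueue, requestMirror) are hypothetical and far smaller than the handle in Ecluse.Core.Queue; the sketch only shows the shape of the idea, a record of functions that callers depend on without knowing which backend sits behind it.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical handle: a record of functions, not the Ecluse.Core.Queue API.
data QueueHandle = QueueHandle
  { enqueue :: String -> IO () -- hand a job to the backend
  , pending :: IO [String]     -- inspect the jobs not yet consumed
  }

-- One implementation: an in-memory queue, useful in tests. A cloud-backed
-- queue would be another function returning the same record.
newMemoryQueue :: IO QueueHandle
newMemoryQueue = do
  ref <- newIORef []
  pure
    QueueHandle
      { enqueue = \job -> modifyIORef' ref (++ [job])
      , pending = readIORef ref
      }

-- Callers take the handle as an argument and never name a backend.
requestMirror :: QueueHandle -> String -> IO ()
requestMirror q name = enqueue q ("mirror:" <> name)
```

Choosing the implementation in one place and passing the record down is what keeps a new backend from becoming a structural change.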
The library's vocabulary, roughly from the pure core outward:
- Domain model: Ecluse.Core.Package (the ecosystem-agnostic package vocabulary the rules reason over), Ecluse.Core.Version (version identity and per-ecosystem ordering), and Ecluse.Core.Ecosystem (the ecosystem tag the rest dispatches on).
- Policy: Ecluse.Core.Rules (deny-by-default evaluation) over the rule types in Ecluse.Core.Rules.Types.
- Protocol boundary: Ecluse.Core.Registry (the registry-protocol handle),
Ecluse.Core.Registry.Npm.Wire and Ecluse.Core.Registry.Npm.Project (the lenient npm
wire decoders and their projection onto the domain model),
Ecluse.Core.Registry.Npm.Route (the npm path grammar), and Ecluse.Core.Server.Route
(the shared serve-action Route set and the injected route classifier).
- Cloud handles: Ecluse.Core.Credential (minting the mirror-target write token) and Ecluse.Core.Queue (the durable mirror-job hand-off to the worker).
- Mirror worker: Ecluse.Core.Worker (the supervised consume loop that fetches, verifies against the job's integrity digest, and publishes an approved artifact).
run is the entry point the ecluse executable invokes (see Main). It lives
in the library, not in app/Main.hs, so the composition root is a single
importable unit and app/Main.hs stays a thin shell that only calls it.
Further reading
docs/architecture.md is the systems-design index: the vision, the end-to-end
request lifecycle, and a map to the per-concern design documents. CONTRIBUTING.md
covers the codebase layout and testing strategy, and STYLE.md the coding and
documentation conventions.
Synopsis
- runProxy :: BootEnv -> IO ()
- runServer :: ServerConfig -> Env -> IO ()
- runWorker :: WorkerPolicies -> Env -> IO ()
- npmServerConfig :: ServerConfig
- mountBindingFor :: Ecosystem -> Maybe PackumentDeps -> Maybe PublishDeps -> Maybe MountBinding
- unconfiguredRegistry :: RegistryClient
- planCveSync :: AppConfig -> IO (Map Ecosystem CveSyncHandle)
- data CveSyncHandle = CveSyncHandle {}
- cveRuleDepsFor :: Map Ecosystem CveSyncHandle -> BreakerReporter -> Ecosystem -> RuleDeps
- cveSyncReady :: Map Ecosystem CveSyncHandle -> IO Bool
- cveSyncScheduleFor :: AppConfig -> SyncSchedule
Documentation
runProxy :: BootEnv -> IO ()
Start Écluse: the entry point the ecluse executable runs (see Main).
Assemble the composition root from configuration. Parse the environment layer and
the optional config document, validate everything and fail fast at boot on any
problem (a malformed env, an unresolved rule policy, a configured mount with no
adapter, a credential reference that does not resolve, or a mirror-queue backend
not built in this binary), aggregating the failures so a single run reports them
all. On success, build the handles (the shared HTTP Manager, the config-selected
mirror queue, the metadata cache, the logger, the process-global credential
provider, and the telemetry substrate, off unless ECLUSE_TELEMETRY enables it)
into an Env, derive the served mount bindings, then run the server and the mirror
worker concurrently over that single Env (runServer and runWorker).
Bracketing the Env (and the telemetry providers) for the lifetime of both tears
down their shared resources along every exit path.
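Reporting every boot problem in one run, rather than stopping at the first, is commonly done with an error-accumulating applicative. The sketch below is an illustration under assumed names (BootError, Validation, Boot, validateBoot are all hypothetical); it does not claim to be how runProxy validates its configuration.

```haskell
-- Hypothetical error and config types, for illustration only.
data BootError
  = MalformedEnv String
  | UnresolvedCredential String
  deriving (Eq, Show)

-- An applicative that accumulates errors instead of stopping at the first.
newtype Validation a = Validation (Either [BootError] a)

instance Functor Validation where
  fmap f (Validation e) = Validation (fmap f e)

instance Applicative Validation where
  pure = Validation . Right
  Validation (Left a) <*> Validation (Left b) = Validation (Left (a ++ b))
  Validation (Left a) <*> _ = Validation (Left a)
  Validation (Right f) <*> Validation r = Validation (fmap f r)

data Boot = Boot {port :: Int, token :: String}
  deriving (Eq, Show)

-- Accept only a fully parsed port number in the valid range.
checkPort :: String -> Validation Int
checkPort raw = case reads raw of
  [(n, "")] | n > 0 && n < 65536 -> Validation (Right n)
  _ -> Validation (Left [MalformedEnv ("port: " ++ raw)])

-- Accept only a present, non-empty token.
checkToken :: Maybe String -> Validation String
checkToken (Just t) | not (null t) = Validation (Right t)
checkToken _ = Validation (Left [UnresolvedCredential "mirror token"])

-- Both checks always run, so a single boot reports every failure.
validateBoot :: String -> Maybe String -> Either [BootError] Boot
validateBoot rawPort tok =
  let Validation r = Boot <$> checkPort rawPort <*> checkToken tok
   in r
```

A malformed port together with a missing token yields a Left carrying both errors, in the order the checks were composed.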
runServer :: ServerConfig -> Env -> IO ()
Run the proxy's HTTP front door over the composition-root Env with the
config-derived ServerConfig.
This is the npm-aware composition site: mountBindingFor mounts npm -- its path
grammar (Ecluse.Core.Registry.Npm.Route) and its denial renderer
(Ecluse.Core.Registry.Npm.Serve) -- into the otherwise ecosystem-neutral web layer
(runServer), so the agnostic server stays closed over the shared
Route set and only this one place names an ecosystem.
Splitting the server into its own binary later reuses this same entry.
runWorker :: WorkerPolicies -> Env -> IO ()
Run the supervised mirror worker over the composition-root Env and the
per-ecosystem re-evaluation bundles: the consume → re-evaluate → fetch → verify → publish →
ack loop against the queue, the publish-side registry client, and the credential handle, in
the worker monad (WorkerM) over the worker runtime
(workerRuntimeOf). The bundles carry the same prepared rules and public origin
the serve path gates with, so the worker re-runs current policy against a job before
mirroring it.
This is the composition-root hoist point: it resolves the request-independent dd
correlation object (the service identity; no span is active at the worker entry) and
installs it as the worker's initial katip context, then discharges the loop to IO
through runWorkerM, the worker analogue of the serve path's
runHandler boundary. The loop logic lives in
Ecluse.Core.Worker; the single-process program runs this alongside runServer.
npmServerConfig :: ServerConfig
The fallback server settings: a single npm mount with no packument-serve
or publish dependencies, so the packument route is the recognised-but-unserved 501
stub and a publish is 405 (no publication target). Exposed so the composed front
door can be driven directly without binding a socket (e.g. embedded in another wai
application, or exercised in tests through application) to assert the
routing and the unwired-mount surface; a real launch derives its bindings from
configuration in run.
mountBindingFor :: Ecosystem -> Maybe PackumentDeps -> Maybe PublishDeps -> Maybe MountBinding
Resolve an Ecosystem to its complete MountBinding, or Nothing when that
ecosystem has no adapter wired. The ecosystem selects its path grammar (the
Classifier) and its denial renderer (the
MountRenderer), and its path prefix is derived
from it (prefixFor) rather than configured, so the ecosystem is the single thing
that drives the binding (see docs/architecture/hosting.md → Mounts). The
composition root supplies the packument-serve dependencies once the per-mount
registry set is resolved; Nothing for them leaves the packument route the
recognised-but-unserved 501 stub.
npm is the only ecosystem with an adapter; the others have no registry client or
renderer, so they resolve to Nothing, a loud miss at the call site rather than a
silently half-wired mount.
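The "loud miss" can be sketched with stand-in types. Everything here is hypothetical: the real Ecosystem and MountBinding are richer, the "/npm" prefix is an invented value, and resolveMounts is an illustrative helper showing how a caller might turn a Nothing into a reported boot failure.

```haskell
-- Hypothetical stand-ins; the real Ecosystem and MountBinding are richer.
data Ecosystem = Npm | PyPI | RubyGems
  deriving (Eq, Show)

data MountBinding = MountBinding
  { prefix :: String                   -- derived from the ecosystem, not configured
  , classify :: String -> Maybe String -- toy path grammar: reject empty paths
  }

-- Only an ecosystem with an adapter resolves; the rest are Nothing.
bindingFor :: Ecosystem -> Maybe MountBinding
bindingFor Npm =
  Just
    MountBinding
      { prefix = "/npm" -- invented value, for illustration
      , classify = \path -> if null path then Nothing else Just path
      }
bindingFor _ = Nothing

-- Every configured mount must resolve; otherwise report which ones did not.
resolveMounts :: [Ecosystem] -> Either [Ecosystem] [MountBinding]
resolveMounts ecos =
  case [e | e <- ecos, Nothing <- [bindingFor e]] of
    [] -> Right [b | Just b <- map bindingFor ecos]
    missing -> Left missing
```

Returning the unresolved ecosystems lets the caller name them in the boot error instead of serving a partially wired mount.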
unconfiguredRegistry :: RegistryClient
A registry handle with no backend behind it: every effectful field refuses
loudly (a typed RegistryUnconfigured) and every pure parse* field returns
Left, so an unconfigured fetch/publish or parse fails explicitly rather than
silently returning a fabricated success. It holds the handle slot in the
composition root where a configured backend is selected elsewhere.
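A miniature of the idea, with hypothetical field names and the assumption that the effectful refusal is surfaced as a Left value (the real RegistryClient may signal it differently):

```haskell
-- Hypothetical miniature of a registry handle; field names are illustrative.
data RegistryError = RegistryUnconfigured | ParseFailed String
  deriving (Eq, Show)

data Registry = Registry
  { fetchTarball :: String -> IO (Either RegistryError String) -- effectful field
  , parseVersion :: String -> Either RegistryError [Int]       -- pure parse field
  }

-- The placeholder handle: every field fails explicitly with a typed error
-- and never fabricates a success.
unconfigured :: Registry
unconfigured =
  Registry
    { fetchTarball = \_ -> pure (Left RegistryUnconfigured)
    , parseVersion = \_ -> Left RegistryUnconfigured
    }
```

Because the placeholder is a total value of the handle type, the composition root can always hold a handle in the slot, and any accidental use of it fails with a typed error rather than a plausible-looking result.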
planCveSync :: AppConfig -> IO (Map Ecosystem CveSyncHandle)
Build the advisory-sync plan from config: nothing without a configured
vulnerability-database bucket; otherwise one CveSyncHandle per configured
mount ecosystem, each against its own stable per-ecosystem object key and
canonical on-disk path under the OSV data dir. Prepares the data dir (created
if missing; stray .tmp downloads from an interrupted run swept) so the sync
tasks start clean. Note the readiness consequence: an operator who mounts an
ecosystem Pilot does not compile has declared an artifact that never arrives,
and the pod honestly never reports ready.
data CveSyncHandle
One configured ecosystem's advisory-sync wiring.
cveRuleDepsFor :: Map Ecosystem CveSyncHandle -> BreakerReporter -> Ecosystem -> RuleDeps
The rules' boot-bound capabilities for one mount ecosystem: the CVE lookup borrows through that ecosystem's own slot when the sync plan carries one, and abstains otherwise, so a mount's rules can never read a neighbouring ecosystem's advisory database.
cveSyncReady :: Map Ecosystem CveSyncHandle -> IO Bool
The readiness gate over the sync plan: ready once every configured ecosystem's advisory database has first-synced. The flags flip one way, so readiness never flaps on this; an empty plan (no bucket) is vacuously ready.
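The gate's two properties (one-way flags, vacuous readiness on an empty plan) can be sketched as follows. The Plan type, keyed by ecosystem name with one IORef Bool per entry, is a hypothetical simplification of the Map Ecosystem CveSyncHandle in the signature.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical: one first-sync flag per ecosystem, keyed by name.
type Plan = Map.Map String (IORef Bool)

-- Build a plan with every flag initially unset.
newPlan :: [String] -> IO Plan
newPlan names = Map.fromList <$> traverse (\n -> (,) n <$> newIORef False) names

-- The only write ever made sets the flag, so it flips one way.
markSynced :: IORef Bool -> IO ()
markSynced ref = writeIORef ref True

-- Ready once every flag is set. `and []` is True, so an empty plan
-- is vacuously ready.
planReady :: Plan -> IO Bool
planReady plan = and <$> traverse readIORef (Map.elems plan)
```

Since nothing in this model ever writes False after creation, a plan that has reported ready stays ready.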
cveSyncScheduleFor :: AppConfig -> SyncSchedule
The sync tasks' timing: the shipped boot burst over the configured poll
interval. The microsecond conversion cannot wrap: the config decoder bounds
the interval to [1, maxBound `div` 1_000_000] seconds.
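The bound can be checked with a small sketch. The helper names (maxIntervalSeconds, secondsToMicros) are hypothetical; the real decoder and schedule types are not shown here. For any s in the accepted range, s * 1000000 is at most maxBound, so the multiplication cannot overflow Int.

```haskell
-- Hypothetical helpers mirroring the bound described above.
maxIntervalSeconds :: Int
maxIntervalSeconds = maxBound `div` 1000000

-- Convert seconds to microseconds, rejecting anything outside
-- [1, maxIntervalSeconds]. Inside that range the product fits in an Int.
secondsToMicros :: Int -> Maybe Int
secondsToMicros s
  | s >= 1 && s <= maxIntervalSeconds = Just (s * 1000000)
  | otherwise = Nothing
```

Rejecting out-of-range values at decode time means the later conversion needs no overflow check of its own.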