ecluse:ecluse-core
Safe Haskell: None
Language: GHC2021

Ecluse.Core.Server.Pipeline.Packument

Description

The serve paths behind the package routes: the packument merge behind GET /{pkg}.

This is the data-plane handler module for packuments. It composes the slices that decide what to serve -- the registry client (Ecluse.Core.Registry.Npm), the per-version rules (Ecluse.Core.Rules), the structural filter (Ecluse.Core.Registry.Npm.Filter), the cross-upstream merge (Ecluse.Core.Package.Merge), the metadata cache (Ecluse.Core.Server.Cache), the own-ETag conditional (Ecluse.Core.Server.Conditional), and the serve-outcome status (Ecluse.Core.Server.Response) -- into one action in the Handler reader, reading its mount's serve dependencies and the request runtime ServeRuntime from the request's RequestCtx.

Credential authority

This handler implements the default passthrough credential posture (see docs/architecture/access-model.md). The invariant that holds under every strategy is the public strip: the client's credential is __stripped before any public-upstream fetch__, which is always anonymous -- sending an internal token to the public registry would be a credential disclosure, so the public-upstream fetch is built with no token at all. Under passthrough the client's own credential is additionally forwarded verbatim to the private upstream, which is the authority for who may read what. The two origins are fetched concurrently, each with its own credential posture; nothing shares a token across the trust split.

Because passthrough makes the private upstream the per-client authority, its metadata is not cached across clients here: the private origin is fetched and parsed on every request with that client's own credential, so the upstream re-authorises each client itself, and only the anonymous public origin is cached (one shared document, no per-client authority to preserve). Caching the private origin keyed by base URL alone would let one client's cached entry serve another client's private document within the TTL, bypassing the upstream's authorisation -- a cross-client disclosure. (Other strategies make the private origin shareable by authorising each serve differently; the metadata cache itself stays credential-free regardless -- see docs/architecture/access-model.md → "Caching".)
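The credential split and the caching rule above can be sketched as two small pure functions. This is an illustration only: `Token`, `Origin`, `upstreamCredential` and `cacheable` are hypothetical names, not part of ecluse-core, and the real handler works over its own request and client types.

```haskell
-- Hypothetical stand-ins; none of these names come from ecluse-core.
newtype Token = Token String deriving (Eq, Show)

data Origin = PrivateOrigin | PublicOrigin deriving (Eq, Show)

-- The credential attached to an upstream fetch under passthrough:
-- the private origin receives the client's token verbatim, the
-- public origin never receives one, whatever the client sent.
upstreamCredential :: Origin -> Maybe Token -> Maybe Token
upstreamCredential PrivateOrigin clientToken = clientToken
upstreamCredential PublicOrigin  _           = Nothing

-- Only the anonymous public document is shareable across clients;
-- the private document is refetched per request so the upstream
-- re-authorises every client itself.
cacheable :: Origin -> Bool
cacheable PublicOrigin  = True
cacheable PrivateOrigin = False
```

Making the public case ignore its token argument entirely means no code path can forward a credential there by accident.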

Merge, not fallback

A packument is the set of available versions, spread across upstreams, so it is merged rather than short-circuited on a private hit (see docs/architecture/registry-model.md → "Packument merge across upstreams"). Private versions are trusted and enter unfiltered; public versions are gated through the rules and the structural filter (the FilterPlan's survivors restrict the typed view) before they enter; the two are combined, private winning a collision and an integrity divergence flagged. If one upstream is unavailable while the other succeeds, the best-effort union of what resolved is served -- only when nothing resolves does the request error.
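A minimal sketch of the merge rule described above, under heavy simplification: each version is reduced to its integrity string, and `mergeSketch`, `Merged`, `winners` and `divergent` are hypothetical names rather than the API of Ecluse.Core.Package.Merge. The public map is assumed to have already passed the rules and the structural filter.

```haskell
import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)

-- Hypothetical simplification: a version is just its integrity string.
type Version   = String
type Integrity = String

data Source = Private | Public deriving (Eq, Show)

data Merged = Merged
  { winners   :: Map.Map Version (Source, Integrity)
  , divergent :: [Version]  -- present in both sources with differing integrity
  } deriving (Eq, Show)

-- Nothing models an upstream that was unavailable or failed to parse.
mergeSketch
  :: Maybe (Map.Map Version Integrity)  -- private, trusted as-is
  -> Maybe (Map.Map Version Integrity)  -- public, assumed already gated
  -> Maybe Merged
mergeSketch Nothing Nothing = Nothing   -- nothing resolved: the request errors
mergeSketch priv pub = Just (Merged ws ds)
  where
    p  = fromMaybe Map.empty priv
    q  = fromMaybe Map.empty pub
    -- Map.union is left-biased, so a private entry wins a collision.
    ws = Map.union (Map.map ((,) Private) p) (Map.map ((,) Public) q)
    -- Versions present in both sources whose integrity strings disagree.
    ds = Map.keys (Map.filter id (Map.intersectionWith (/=) p q))
```

With one source missing, the result is simply the other source's versions, which is the best-effort union the text describes.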

Decision surface vs served surface

The merge and filter reason over the typed PackageInfo but the document served is the raw upstream JSON, so every unmodeled wire key survives (see docs/architecture/registry-model.md → "Decision surface vs served surface"). The MergePlan names, for each surviving version, the source that won it; the served body is assembled in one pass by the mount's assembly hook (assembleMergedPackument for npm): each survivor's object is taken from the raw Value of its winning source with its tarball URL rewritten under the mount base as it is placed, the reconciled dist-tags and time are carried from the plan, and every other top-level key is relayed from the precedence-winning document. The typed model is never re-serialised. The two fields the merge owns as a decision -- dist-tags.latest and the time instants -- are re-rendered from that decision (the times as normalised ISO-8601), so they may differ byte-for-byte from any single upstream while denoting the same value; integrity-bearing fields (dist.integrity, dist.tarball up to the rewrite's own prefix) are relayed raw and untouched. The served bytes get our own ETag, since a merged/filtered body matches no single upstream's.
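The split between deciding on the typed view and serving the raw document can be illustrated with a toy assembly step. Everything here is an assumption made for the sketch: `RawObject` is a plain string standing in for an untyped JSON value, `assembleSketch` is a hypothetical name, and the tarball rewrite and the dist-tags/time handling are left out.

```haskell
import qualified Data.Map.Strict as Map

data Source = Private | Public deriving (Eq, Ord, Show)

type Version   = String
type RawObject = String  -- stand-in for an untyped JSON value

-- For each surviving version, relay the raw object from the source the
-- plan names as its winner. The typed model is never re-serialised, so
-- keys the model does not know about pass through untouched.
assembleSketch
  :: Map.Map Version Source                      -- plan: winner per survivor
  -> Map.Map Source (Map.Map Version RawObject)  -- raw documents by source
  -> Map.Map Version RawObject
assembleSketch plan raws = Map.mapMaybeWithKey pick plan
  where
    pick version source = Map.lookup source raws >>= Map.lookup version
```

A version absent from the plan never reaches the output, even if a raw document still carries it.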

Documentation

servePackument :: PackageName -> Request -> (Response -> IO ResponseReceived) -> Handler ResponseReceived

Serve a GET /{pkg} packument request end to end, over the request's RequestCtx.

The mount's PackumentDeps and error renderer are read from the matched MountBinding in context, not threaded as arguments. When the mount has no packument-serve dependencies wired, the route is recognised but not served -- a 501 in the mount's surface -- rather than fabricating a result.

With dependencies wired: the edge token, if configured, is validated before any upstream is touched. Then the private and public upstreams are fetched concurrently -- the client's credential forwarded to the private origin, the public origin anonymous -- each parse failure or unavailable upstream degrading to a missing contribution rather than an error. Private versions are trusted as-is; public versions are gated through the rules and the structural filter (the FilterPlan); the surviving sets are merged (mergePackuments) and the MergePlan assembled onto the raw upstream Values to build the served body, which is then answered against the client's conditional request with our own ETag. When nothing survives, the status follows the most recoverable cause via packumentStatus. An origin whose self-reported packument name disagrees with the route is validated out -- dropped as untrusted for this request and logged -- so a single misreporting upstream never denies a package another upstream serves; when that leaves no valid origin, the request is a 502 (a responding upstream returned an invalid response), distinct from a genuine absence. Every refusal -- the edge 401 and the no-survivors 403/503/502/500 -- is rendered through the mount's MountRenderer.
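The name-validation step can be sketched as a partition over the origins that responded. This is a simplified model with hypothetical names (`Origin`, `Validation`, `validateNames`); the real handler also logs each dropped origin and maps the outcome to a status through packumentStatus and the mount's renderer.

```haskell
import Data.Either (partitionEithers)

type PackageName = String  -- hypothetical; a plain string for the sketch

data Origin = Origin
  { originLabel  :: String       -- e.g. "private" or "public"
  , reportedName :: PackageName  -- the name the packument self-reports
  } deriving (Eq, Show)

data Validation
  = NoResponders               -- nothing answered: a genuine absence
  | NoValidOrigin [Origin]     -- every responder misreported: the 502 case
  | Validated [Origin] [Origin]  -- (kept, dropped as untrusted)
  deriving (Eq, Show)

validateNames :: PackageName -> [Origin] -> Validation
validateNames _ [] = NoResponders
validateNames route origins =
  case partitionEithers (map classify origins) of
    (dropped, [])   -> NoValidOrigin dropped
    (dropped, kept) -> Validated kept dropped
  where
    classify o
      | reportedName o == route = Right o
      | otherwise               = Left o
```

Keeping `NoResponders` apart from `NoValidOrigin` mirrors the distinction the text draws between a genuine absence and an invalid upstream response.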

headPackument :: PackageName -> Request -> (Response -> IO ResponseReceived) -> Handler ResponseReceived

Serve a HEAD /{pkg} packument request: the identical pipeline and gating as servePackument -- the same fetch, merge, filter, rule decision, and no-survivors status -- answered with the identical status and headers as the GET (the would-be merged body's Content-Length and the own ETag the conditional-request machinery computes), but with the body suppressed, as HTTP semantics require of a HEAD reply.

A packument body is assembled locally (a metadata fetch plus the cross-upstream merge), so -- unlike the tarball HEAD (headTarball) -- answering it pumps __no artifact body__ and carries no egress-amplification risk: this is the HTTP-correctness half of the explicit-HEAD handling, not the DoS lever the tarball path closes. The merged body is still materialised, to size it and compute its ETag; only the bytes are withheld from the reply.
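The relationship between the GET and HEAD replies can be sketched with a toy reply record. `Reply`, `getReply` and `headReply` are hypothetical, and the body is an ASCII string whose character count stands in for its byte length.

```haskell
-- Hypothetical reply model; not the WAI Response type.
data Reply = Reply
  { replyStatus  :: Int
  , replyHeaders :: [(String, String)]
  , replyBody    :: String
  } deriving (Eq, Show)

-- The GET reply: sized and tagged from the materialised merged body.
getReply :: String -> String -> Reply
getReply etag merged =
  Reply 200 [("Content-Length", show (length merged)), ("ETag", etag)] merged

-- The HEAD reply: the same status and headers, with the bytes withheld.
-- The merged body is still needed to compute Content-Length.
headReply :: String -> String -> Reply
headReply etag merged = (getReply etag merged) { replyBody = "" }
```

Deriving the HEAD reply from the GET reply, rather than building it separately, keeps the two from drifting apart in status or headers.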

The derived validator (exported for its unit spec)

packumentETag :: Text -> PackageName -> [(Provenance, ContentDigest, [Text])] -> ETag

The derived packument validator: a SHA-256 over the serve's inputs -- the mount base URL, the package name, and per source (in merge order) its provenance, its origin body's digest, and the version keys that survived its gate.

The served document is a deterministic function of exactly these (the merge plan derives from the gated typed views, which derive from the origin bytes and the survivor sets; the assembly then edits the origin documents under the mount base URL), so this tag can never call a changed document unchanged. It may change when the re-assembled bytes would not have -- a spurious 200, never a wrong 304 -- which is the correct slack for a validator. Deriving it from inputs is what lets a 304 skip assembly, encoding, and any output hashing entirely.
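Why an input-derived tag lets a 304 skip assembly can be shown with a deliberately simplified conditional check. `Outcome` and `answerConditional` are hypothetical, and the comparison is a plain equality on one tag, ignoring the list, wildcard and weak-comparison forms of a real If-None-Match header.

```haskell
data Outcome
  = NotModified       -- 304: no body is built or sent
  | Full String       -- 200: the assembled body
  deriving (Eq, Show)

-- The tag is computed from the serve's inputs, so it is known before
-- the body exists. The body argument is only demanded on the 200 path;
-- under lazy evaluation a matching tag never forces it.
answerConditional :: Maybe String -> String -> String -> Outcome
answerConditional ifNoneMatch tag assembled
  | ifNoneMatch == Just tag = NotModified
  | otherwise               = Full assembled
```

If the tag were a hash of the output bytes instead, the body would have to be assembled and encoded before the comparison could even be made.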

Fields are fed to the hash with unambiguous framing: the digest is fixed-width, the variable-length pieces are NUL-terminated, and each source block closes with an \SOH terminator, so no concatenation of adjacent fields can collide with another split of the same bytes. The leading salt versions the scheme: bump it when the assembly's behaviour changes so pre-change client caches revalidate as modified.
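The framing rules can be sketched as a function that builds the hash input, leaving the SHA-256 step itself out. The field order, the salt value and the name `frameInput` are assumptions made for illustration; only the three framing rules (fixed-width digest, NUL-terminated variable-length pieces, an \SOH terminator per source block) come from the description above.

```haskell
-- Build the bytes that would be fed to SHA-256. Strings stand in for
-- byte strings; the digest is assumed to be a fixed-width hex string.
frameInput
  :: String                        -- scheme salt (hypothetical value)
  -> String                        -- mount base URL
  -> String                        -- package name
  -> [(String, String, [String])]  -- per source: provenance, digest, survivors
  -> String
frameInput salt base pkg sources =
  concatMap nul [salt, base, pkg] ++ concatMap block sources
  where
    -- Variable-length pieces are NUL-terminated.
    nul s = s ++ "\NUL"
    -- The digest is fixed-width, so it needs no terminator; the block
    -- as a whole closes with \SOH.
    block (provenance, digest, versions) =
      nul provenance ++ digest ++ concatMap nul versions ++ "\SOH"
```

Without the terminators, the survivor lists `["1.0", "2.0"]` and `["1.02.0"]` would concatenate to the same bytes; with them, the two inputs differ, so the tags differ.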