ecluse
Safe Haskell: None
Language: GHC2021

Ecluse.Composition

Description

The composition-root wiring: turn a validated Config and the process-global credential providers into the served MountBindings, failing fast with every boot problem aggregated into one report.

This is the listener-free heart of the composition root (Ecluse calls it): it holds no sockets, no network, and no real clock of its own -- the clock and the ecosystem-to-adapter resolver are injected -- so the boot-time validation is unit-tested without opening a listener. Its one effect is preparing each mount's rule set (prepare), which allocates per-rule engine state once at boot (a breaker for a resilient rule; the built-in rules need none today), so binding assembly is IO; everything else stays a pure function of the validated config.

Global providers, per-mount reference

A CredentialProvider is the service's own cloud identity, built once from the environment layer (initCredentialProviders) and held process-global; a mount does not carry a provider; it only names which backend it draws on (its mtCredential). The boot-time check is the resolution of that reference: every distinct credential backend named across all mounts must resolve to an initialised provider, or the app halts at boot (see docs/architecture/cloud-backends.md → "Credential Provider"). Only the static backend has a leaf (from ECLUSE_MIRROR_TARGET_TOKEN); a mount naming codeartifact or adc resolves to no provider and is an honest boot failure.

Fail-fast at boot

Three boot failures are aggregated into one report so a single run shows every problem: a rule policy that does not resolve (PolicyBootError, surfaced by loadConfig), a configured mount whose ecosystem has no adapter wired (MissingAdapter), and a mount naming a credential backend with no initialised provider (UnresolvedCredential). A bad configuration is thus a loud, immediate startup failure, never a quietly mis-enforced or half-wired state (see docs/architecture/configuration.md → "Validation").
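The aggregation described above can be sketched as follows. This is a minimal illustration, not the module's implementation: Mount, its fields, and the list-based adapter and backend lookups are hypothetical stand-ins, and BootError is cut down to two of its constructors with String payloads.

```haskell
-- Hypothetical, simplified stand-ins for the real types.
data BootError
  = MissingAdapter String
  | UnresolvedCredential String String
  deriving (Show, Eq)

data Mount = Mount
  { mountEcosystem :: String
  , mountBackend :: String
  }
  deriving (Show, Eq)

-- Check one mount and return every problem it has, not just the first.
checkMount :: [String] -> [String] -> Mount -> [BootError]
checkMount adapters backends m =
  [MissingAdapter (mountEcosystem m) | mountEcosystem m `notElem` adapters]
    ++ [ UnresolvedCredential (mountEcosystem m) (mountBackend m)
       | mountBackend m `notElem` backends
       ]

-- Aggregate across all mounts: Left carries every error found,
-- Right is returned only when no mount reported a problem.
checkMounts :: [String] -> [String] -> [Mount] -> Either [BootError] [Mount]
checkMounts adapters backends mounts =
  case concatMap (checkMount adapters backends) mounts of
    [] -> Right mounts
    errs -> Left errs
```

Collecting into a list rather than stopping at the first Left is what lets a single run report every problem an operator must fix.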

Global credential providers

data CredentialProviders

The process-global credential providers, keyed by the backend they implement. Built once at the composition root from the environment layer; a mount references one by name and never holds its own.

The keyset (see initializedBackends) is the boot-check's pure surface -- a mount that names a backend absent from it has an unresolved credential reference.

initCredentialProviders :: CredentialReporters -> AppConfig -> IO (Either [BootError] CredentialProviders)

Build the global credential providers from the environment layer, or the aggregated boot errors that block them. The mirror-target write provider is selected by cfgCredentialProvider (see planMirrorCredential):

  • static -- built from ECLUSE_MIRROR_TARGET_TOKEN (cfgMirrorTargetToken) when set; absent, no static provider is initialised, so a mount naming static fails the boot-time credential-reference check.
  • codeartifact -- the CodeArtifact inputs are resolved (resolveCodeArtifactConfig); a required input that resolves by neither an explicit key nor the mirror-target host is a fail-loud boot error. On success the generic refresh/cache wrapper around the CodeArtifact mint leaf (newCodeArtifactProvider) is built, which mints once eagerly -- so a misconfigured identity fails here at boot. AWS credentials are the ambient container/task role (the standard chain), never an Écluse key. A mint that throws at boot (a transient AWS error, or a permanent one like a bad domain/region or missing permission) is caught and rendered as a CodeArtifactMintFailed boot error rather than escaping as a raw exception, so it joins the aggregated failure block.
  • gcp-artifact-registry -- recognised but not built in this binary, so selecting it is a fail-loud boot error rather than a silent fall-through.

The static provider is also built whenever ECLUSE_MIRROR_TARGET_TOKEN is present, independent of the selector, so a static token never goes unused.

The CredentialReporters are handed to the refreshing CodeArtifact provider so its mint breaker and refresh outcomes record to telemetry; the static provider never refreshes, so they do not concern it. The composition root supplies the deferred reporters that go live once the telemetry substrate exists.

initializedEcosystems :: CredentialProviders -> Set Ecosystem

The set of ecosystems that resolved to an initialised provider -- the pure surface the boot-time credential-reference check reasons over.

lookupProvider :: Ecosystem -> CredentialProviders -> Maybe CredentialProvider

Look up the initialised provider for an ecosystem, Nothing when none is initialised (the unresolved-reference case the boot check rejects).

Mirror-target credential provider selection

planMirrorCredential :: Ecosystem -> AppConfig -> MountConfig -> Either [BootError] (Maybe CodeArtifactConfig)

Decide what mirror-target write provider the environment layer selects, as the pure half of initCredentialProviders: Nothing when the static provider is selected (its leaf is the ECLUSE_MIRROR_TARGET_TOKEN already handled there), Just a resolved CodeArtifactConfig when codeartifact is selected, or the aggregated boot errors that block the selection.

The gcp-artifact-registry arm is recognised but not built in this binary, so it is a fail-loud MirrorCredentialProviderUnavailable boot error -- never a silent fall-through to a different provider, mirroring how planMirrorQueue treats the GCP queue arm.

resolveCodeArtifactConfig :: Ecosystem -> AppConfig -> MountConfig -> Either [BootError] CodeArtifactConfig

Resolve the CodeArtifact inputs for the mirror-target token, or the aggregated boot errors naming each input that could not be resolved.

Each required input is resolved __(a) from its explicit MIRROR_TARGET_CODEARTIFACT_* key, else (b) by parsing the mirror-target URL host__ of the form {domain}-{owner}.d.codeartifact.{region}.amazonaws.com (the documented host fallback). The region resolves explicit key → host → AWS_REGION: the endpoint host encodes the domain's authoritative region, so it outranks the process-wide AWS_REGION (a cross-region deploy mints against the domain's region, not the caller's). The mirror-target URL is the resolved one -- an unset ECLUSE_MIRROR_TARGET has already folded onto the private upstream -- so a private-upstream CodeArtifact endpoint is parsed too. The optional token-duration carries through (cfgMirrorCodeArtifactTokenDuration).

The {owner} is a 12-digit AWS account id: a resolved owner (from either source) that is not 12 digits is a fail-loud CodeArtifactConfigInvalid error, and a host whose tail after the last hyphen is not an account id is not a CodeArtifact endpoint at all (so it falls through to the named-key check). If a required input resolves by neither source, that is a fail-loud CodeArtifactConfigMissing boot error naming the exact key the operator must set, aggregated so one run reports every problem.
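The host fallback and the region precedence described above can be sketched like this. It is an illustrative approximation under stated assumptions: it takes a bare host as a String (the real resolver works from the resolved mirror-target URL), and HostParts, parseCodeArtifactHost, and resolveRegion are hypothetical names, not the module's API.

```haskell
import Control.Applicative ((<|>))
import Data.Char (isDigit)

-- Hypothetical result of parsing a CodeArtifact endpoint host.
data HostParts = HostParts
  { hpDomain :: String
  , hpOwner :: String
  , hpRegion :: String
  }
  deriving (Show, Eq)

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (a, []) -> [a]
  (a, _ : rest) -> a : splitOn c rest

-- An AWS account id is exactly 12 digits.
isAccountId :: String -> Bool
isAccountId s = length s == 12 && all isDigit s

-- Parse {domain}-{owner}.d.codeartifact.{region}.amazonaws.com.
-- The owner is the tail after the LAST hyphen of the first label,
-- so a domain name may itself contain hyphens.
parseCodeArtifactHost :: String -> Maybe HostParts
parseCodeArtifactHost host =
  case splitOn '.' host of
    [label, "d", "codeartifact", region, "amazonaws", "com"]
      | not (null region) ->
          let (revOwner, revRest) = break (== '-') (reverse label)
              owner = reverse revOwner
           in case revRest of
                '-' : revDomain
                  | not (null revDomain) && isAccountId owner ->
                      Just (HostParts (reverse revDomain) owner region)
                _ -> Nothing
    _ -> Nothing

-- Region precedence: explicit key, then host, then AWS_REGION.
resolveRegion :: Maybe String -> Maybe String -> Maybe String -> Maybe String
resolveRegion explicitKey hostRegion awsRegion =
  explicitKey <|> hostRegion <|> awsRegion
```

Returning Nothing for a host whose tail is not an account id matches the described behaviour of treating it as "not a CodeArtifact endpoint at all", which leaves the named-key check to report what is missing.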

Boot-time wiring

data BootError

A reason the composition root refuses to start. Every case is a fail-loud boot failure; they are aggregated so a single run reports every problem an operator must fix.

Constructors

PolicyBootError PolicyError

A rule policy did not resolve (surfaced by loadConfig).

MissingAdapter Ecosystem

A configured mount's ecosystem has no adapter wired, so it cannot be served (a loud miss, never a silent drop). Carries the ecosystem.

UnresolvedCredential Ecosystem CredentialBackend

A mount names a credential backend with no initialised provider. Carries the ecosystem of the mount and the unresolved backend.

QueueProviderUnavailable QueueBackend

The configured mirror-queue backend has no implementation compiled into this binary, so no queue can be built for it. Carries the unavailable backend. An honest refusal -- never a silent fall-through to a different backend.

QueueRegionMissing

The SQS mirror-queue backend was selected but no AWS region was supplied (AWS_REGION), so the queue cannot be scoped to a region.

QueueUrlMissing QueueBackend

A cloud mirror-queue backend (e.g. sqs) was selected but no ECLUSE_QUEUE_URL was supplied, so there is no queue to send jobs to. The in-memory backend does not raise this -- it has no external queue.

QueueEndpointMalformed Text

The configured SQS endpoint override (AWS_ENDPOINT_URL_SQS / AWS_ENDPOINT_URL) is not a parseable endpoint URL. Carries the offending value.

MirrorCredentialProviderUnavailable CredentialBackend

The selected mirror-target credential provider has no implementation compiled into this binary. Carries the unavailable provider. An honest refusal, never a silent fall-through.

CodeArtifactConfigMissing Text

A required CodeArtifact input for the mirror-target token could not be resolved from either its explicit key or the mirror-target host. Carries the name of the key the operator must set.

CodeArtifactConfigInvalid Text Text

A CodeArtifact input resolved but is malformed (e.g. a domain owner that is not a 12-digit AWS account id). Carries the key and a reason.

CodeArtifactMintFailed Text

The eager boot-time CodeArtifact mint threw -- a transient AWS error (worth a retry) or a permanent one (a bad domain/region or missing permission, to be fixed). Carries the rendered exception so the cause is legible and aggregated.

PublishScopesMissing Ecosystem

A publication target was configured (ECLUSE_PUBLICATION_TARGET) but no publish-scope allow-list (ECLUSE_PUBLISH_SCOPES) was supplied, so the anti-shadowing guard would have nothing to enforce. Refused at boot rather than defaulting to an empty allow-list (which would deny every publish) or an open one (which would let a client shadow any public name).

PublishStaticCredentialNeedsEdge Ecosystem

A static publish credential (ECLUSE_PUBLICATION_TARGET_TOKEN) was configured without a verifiable inbound edge (ECLUSE_AUTH_TOKEN). Écluse would otherwise substitute its own standing write credential for a publishing caller who forwards none, so an unauthenticated request could publish within the configured scopes under Écluse's own identity. Refused at boot so an internal publish credential paired with an open edge is unrepresentable -- the write-side counterpart of the fail-closed read identity.

Instances

Show BootError

Defined in Ecluse.Composition

Eq BootError

Defined in Ecluse.Composition

renderBootError :: BootError -> Text

Render a BootError as a human-facing line for the aggregated failure block.

planMounts :: (Ecosystem -> Maybe PackumentDeps -> Maybe PublishDeps -> Maybe MountBinding) -> IO UTCTime -> (Ecosystem -> RuleDeps) -> CredentialProviders -> Config -> IO (Either [BootError] [MountBinding])

Validate the environment layer and optional document into the served mount bindings, or the aggregated boot errors. This is the composition root's single entry point: it runs loadConfig (whose policy errors become PolicyBootErrors) and then composeBindings, so policy, missing-adapter, and unresolved-credential failures all surface from one call.

The ecosystem-to-adapter resolver, the wall-clock source, and the rules' boot-bound capabilities are injected (the composition root supplies mountBindingFor, getCurrentTime, and each ecosystem's RuleDeps), so this validation opens no socket. The capabilities are per ecosystem because a mount's rules must borrow their ecosystem's advisory database, never a neighbour's. It is IO only because composeBindings prepares each mount's rules (allocating per-rule engine state once at boot).

composeBindings :: (Ecosystem -> Maybe PackumentDeps -> Maybe PublishDeps -> Maybe MountBinding) -> IO UTCTime -> (Ecosystem -> RuleDeps) -> CredentialProviders -> Config -> IO (Either [BootError] [MountBinding])

Turn a validated Config into the served MountBindings, or the aggregated boot errors. For each mount, in ecosystem order: its credential reference must resolve to an initialised provider, and its ecosystem must resolve to an adapter (through the injected resolver, fed real PackumentDeps so the packument route is served rather than the 501 stub). Errors aggregate across every mount.

Mirror-queue backend selection

data MirrorQueuePlan

Which mirror-queue backend the composition root will build, resolved from config: the durable AWS sqs backend (with its SqsConfig), or the bounded best-effort in-memory backend (with its MemoryQueueConfig). This is the pure decision that planMirrorQueue yields; the composition root pattern-matches on it to make the one constructor call, and mirrorQueuePlanWarning tells it whether a boot warning is due.

Constructors

SqsBackend SqsConfig

The durable AWS SQS backend, built by Ecluse.Core.Queue.Sqs.newSqsQueue.

MemoryBackend MemoryQueueConfig

The bounded in-memory backend, built by newBoundedInMemoryQueue. Non-durable and best-effort -- boot warns.

planMirrorQueue :: AppConfig -> Either [BootError] MirrorQueuePlan

Select the mirror-queue backend from the environment layer, yielding the MirrorQueuePlan the composition root builds the queue from, or the aggregated boot errors that block it.

This is the pure half of the queue's backend choice -- the single place that knows which backends this binary can build:

  • sqs -- resolves to a SqsBackend carrying its SqsConfig (the queue URL and region, with the provider knobs at their defaults); the composition root passes that to Ecluse.Core.Queue.Sqs.newSqsQueue. ECLUSE_QUEUE_URL is optional at the env layer but required here (the jobs need a queue), so a missing one is a fail-loud QueueUrlMissing boot error, and a missing AWS_REGION is a QueueRegionMissing boot error. The sqs arm aggregates the region, queue-URL, and endpoint failures.
  • memory -- resolves to a MemoryBackend carrying its depth cap, built in-process with no cloud queue (ECLUSE_QUEUE_URL and AWS_REGION are not consulted for it). It is an explicit operator choice for a simple, single-node, or air-gapped deployment, never an automatic fallback (which would soften the fail-loud-on-misconfig posture); the composition root emits the memoryQueueBootWarning on selection.
  • pubsub -- the GCP arm is recognised but not built, so it is a fail-loud QueueProviderUnavailable boot error rather than a silent fall-through.

The whole result is a list of errors, so it aggregates with the rest of the boot-time validation.

When an endpoint override is configured (AWS_ENDPOINT_URL_SQS, else AWS_ENDPOINT_URL -- the AWS-SDK-standard variables), it is parsed into the backend's SqsEndpoint so the released image can target a local emulator (ministack) or a VPC endpoint without a test-only code path; a malformed override URL is a fail-loud QueueEndpointMalformed boot error. With no override, the SQS backend uses AWS's default endpoint and credential resolution.
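The backend selection described above can be sketched as a pure function. This is a simplified illustration under stated assumptions: QueuePlan, planQueue, and the String and Int payloads are hypothetical stand-ins for MirrorQueuePlan, planMirrorQueue, SqsConfig, and MemoryQueueConfig, and endpoint-override parsing is left out.

```haskell
import Data.Maybe (isNothing)

-- Hypothetical, simplified stand-ins for the real types.
data QueueBackend = Sqs | Memory | PubSub
  deriving (Show, Eq)

data BootError
  = QueueProviderUnavailable QueueBackend
  | QueueRegionMissing
  | QueueUrlMissing QueueBackend
  deriving (Show, Eq)

data QueuePlan
  = SqsPlan String String -- queue URL, region
  | MemoryPlan Int -- depth cap
  deriving (Show, Eq)

-- Arguments: selected backend, optional queue URL, optional region,
-- and the in-memory depth cap.
planQueue :: QueueBackend -> Maybe String -> Maybe String -> Int -> Either [BootError] QueuePlan
-- The in-memory arm consults neither the URL nor the region.
planQueue Memory _ _ depthCap = Right (MemoryPlan depthCap)
-- Recognised but not built: refuse loudly, never fall through.
planQueue PubSub _ _ _ = Left [QueueProviderUnavailable PubSub]
-- The SQS arm reports every missing input, not just the first.
planQueue Sqs mUrl mRegion _ =
  case (mUrl, mRegion) of
    (Just url, Just region) -> Right (SqsPlan url region)
    _ ->
      Left
        ( [QueueRegionMissing | isNothing mRegion]
            ++ [QueueUrlMissing Sqs | isNothing mUrl]
        )
```

Keeping the decision pure means it can be tested without building a queue; the caller pattern-matches on the plan to make the single constructor call.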

mirrorQueuePlanWarning :: MirrorQueuePlan -> Maybe Text

The loud boot warning a MirrorQueuePlan warrants before its queue is built, or Nothing for a durable backend that needs none. The composition root logs the Just at WarningS on selection, so an operator who chose the in-memory backend is told plainly that the mirror is non-durable -- never a silent surprise.

memoryQueueBootWarning :: Text

The boot warning emitted when the in-memory mirror-queue backend is selected: it states plainly that the mirror is in-memory, non-durable, and best-effort, and that a lost job is re-mirrored on the next demand (no data loss, only deferred mirroring), so the choice is never mistaken for a durable cloud backend.

memoryQueueDropWarning :: Int -> Text

The cap-overflow drop warning for the in-memory backend, carrying the running total of dropped jobs (this report is rate-limited at the queue, so it does not fire per dropped job). Planned one-line follow-up: a drop metric (ecluse.mirror.*, S26 PR2) hooks in alongside this log once that catalogue lands.

Publish-side wiring

data PublishTarget

One ecosystem's resolved publish target: the mirror-target endpoint the mirror worker writes approved artifacts to, paired with the credential provider that mints its bearer token.

This is the publish side of the per-ecosystem composition (the serve side is the mount's PackumentDeps). The worker's single consumer builds a registry-protocol client from these -- the endpoint as its base URL, the provider's token as its bearer -- so the publish client is resolved here at the composition root rather than re-derived per request.

Constructors

PublishTarget 

Fields

  • ptEcosystem :: Ecosystem

    The ecosystem this publish target serves.

  • ptMirrorUrl :: Text

    The mirror-target endpoint approved artifacts are published to.

  • ptCredentials :: CredentialProvider

    The provider minting the mirror-target write token.

planPublishTargets :: CredentialProviders -> Config -> Either [BootError] [PublishTarget]

Resolve each configured mount to its publish target, or the aggregated boot errors. The publish side of planMounts: it validates the same config and resolves each mount's mirror-target endpoint and write credential, so the worker's publish client can be built at the composition root.

An unresolved credential reference is the same fail-loud boot error composeBindings reports for the serve side, so the two surfaces never disagree on what is wired.

Config-derived runtime settings

cacheConfigFor :: AppConfig -> CacheConfig

The metadata-cache tunables drawn from the validated environment layer -- its TTL and entry bound -- so a deployment's cache settings flow from config rather than the built-in defaults (see Ecluse.Core.Server.Cache).

connectionPoolSettings :: Int -> ManagerSettings -> ManagerSettings

Apply an explicit per-host connection bound to an HTTP manager's settings.

The public and private managers call this independently after telemetry instrumentation, so changing the pool size cannot discard the instrumented request and response hooks.

resolveServeAdmission :: Maybe Int -> Int -> (Int, Text)

The effective serve-admission capacity and its boot-log line: the explicit serveMaxInFlight when configured, else __computed from the resolved capability count__ -- max 8 (10 x capabilities).

The multiplier is empirical, not modelled. The saturation model (an admitted metadata materialisation alternates upstream wait W and CPU work P, so keeping C capabilities busy wants about C x (W + P) / P in flight) suggested ~4 per capability at a round-trip W/P of 2-3, but the load bench's measured dose-response kept climbing well past that and levelled only near 10 per capability: a slot is held across every upstream leg plus GC pauses and scheduling delay, so the effective W/P is nearer 9-10. The floor keeps a tiny pod admitting a useful burst should the multiplier ever drop below it. The capability count must be the post-runtime-posture one (see Ecluse.Runtime), so callers resolve this after applyRuntimePosture has run.

The returned line carries the decision's provenance for the standard boot log, alongside the runtime posture lines. This bounds only metadata materialisation (whole packument requests and a tarball miss's public-metadata gate). The private connection pool is not sized from it -- see resolvePrivateConnections: a trusted tarball hit streams outside admission, so demand on the private pool is the inbound hit concurrency, not the admission capacity, and tying the two would undersize that pool under a private-hit fan-out (http-client opens throwaway connections beyond the pool, paying a TLS handshake per overflow request).
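The default capacity and the saturation model it is compared against can be written out as a short sketch. The names admissionCapacity and modelInFlight are hypothetical, and the boot-log line the real function also returns is omitted here.

```haskell
-- Effective capacity: the explicit setting when present,
-- else max 8 (10 x capabilities).
admissionCapacity :: Maybe Int -> Int -> Int
admissionCapacity (Just explicit) _ = explicit
admissionCapacity Nothing capabilities = max 8 (10 * capabilities)

-- Saturation model: keeping c capabilities busy wants about
-- c * (w + p) / p requests in flight, where w is upstream wait
-- and p is CPU work per admitted request.
modelInFlight :: Double -> Double -> Double -> Double
modelInFlight c w p = c * (w + p) / p
```

With p = 1, a W/P of 3 gives modelInFlight 1 3 1 = 4 per capability (the modelled figure), while a W/P of 9 gives modelInFlight 1 9 1 = 10, which is where the measured multiplier of 10 sits.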

resolvePrivateConnections :: Maybe Int -> Int -> (Int, Text)

The effective private-upstream connection-pool size and its boot-log line: the explicit privateConnectionsPerHost when configured, else __computed from the process file-descriptor limit__ -- clamp 64 4096 (nofile / 4).

The private pool caches idle connections to the trusted upstream for __reuse__ across concurrent private-hit tarball streams. Those streams are __IO-bound__ and, unlike metadata materialisation, stream outside serve admission, so their concurrency (and thus the pool's real demand) is the inbound hit fan-out, not the CPU-saturation model resolveServeAdmission uses. That is why this is computed from a different datapoint and is not tied to serveMaxInFlight (the inference in issue #634 was incomplete: the private pool also serves the un-admitted streaming path).

Each pooled connection is one file descriptor, so the file-descriptor limit is the pool's real physical ceiling. The default takes a quarter of the soft RLIMIT_NOFILE as the reuse cache, floored at privateConnectionsFloor so a small-limit host still reuses connections across an install fan-out, and capped at privateConnectionsCap so an enormous-limit host does not retain an absurd idle cache to a single upstream. A larger pool never opens more sockets than the concurrency already demands (http-client opens a connection per in-flight request regardless); it only decides how many to __retain for reuse__ rather than re-handshake, so sizing up is safe. An operator who knows their fan-out can override it outright.

The returned line carries the decision's provenance for the standard boot log.

resolvePublicConnections :: Maybe Int -> Int -> (Int, Text)

The effective public-upstream connection-pool size and its boot-log line: the explicit publicConnectionsPerHost when configured, else __computed from the process file-descriptor limit__ -- clamp 32 1024 (nofile / 8).

The public pool's metadata demand is small by construction (same-key misses are single-flight-coalesced and bounded by admission), but the pool is not metadata-only: the onboarding fail-over's artifact streams and the mirror worker's back-fill fetches ride the same manager, and neither coalesces. During a cold fleet's onboarding burst the concurrent public streams track the inbound fan-out, and managerConnCount is a keep-alive retention cap, not a concurrency cap: overflow opens throwaway connections, each paying a TLS handshake to the public origin per request. So the pool is sized like the private one, from the file-descriptor budget, at half the private share (an eighth of nofile, from the three quarters the private sizing reserves for everything else): the public leg is transient by the traffic model -- the worker retires it artifact by artifact -- so it earns retention for the burst, not the steady state. Sizing up is safe for the same reason as the private pool: it never opens more sockets than the concurrency already demands, only retains more for reuse.

The returned line carries the decision's provenance for the standard boot log.

openFileSoftLimit :: IO Int

The process soft file-descriptor limit (RLIMIT_NOFILE), the datapoint resolvePrivateConnections sizes the private pool against. An infinite or unknown limit falls back to privateConnectionsCap x privateConnectionsFdShare, so the computed pool lands at the cap rather than overflowing.
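The two pool-size defaults and the fallback limit described above work out as follows. This is a sketch under stated assumptions: the constants are read off the quoted formulas (clamp 64 4096 (nofile / 4) and clamp 32 1024 (nofile / 8)), so the real privateConnectionsFloor, privateConnectionsCap, and privateConnectionsFdShare values may differ, and the function names are hypothetical.

```haskell
-- Assumed constants, taken from the formulas quoted in the text.
privateFloor, privateCap, privateFdShare :: Int
privateFloor = 64
privateCap = 4096
privateFdShare = 4

publicFloor, publicCap, publicFdShare :: Int
publicFloor = 32
publicCap = 1024
publicFdShare = 8

-- Clamp a value into the inclusive range [lo, hi].
clamp :: Int -> Int -> Int -> Int
clamp lo hi = max lo . min hi

-- Pool size: the explicit setting when present, else a share of the
-- soft file-descriptor limit, clamped between floor and cap.
privatePool :: Maybe Int -> Int -> Int
privatePool (Just explicit) _ = explicit
privatePool Nothing nofile =
  clamp privateFloor privateCap (nofile `div` privateFdShare)

publicPool :: Maybe Int -> Int -> Int
publicPool (Just explicit) _ = explicit
publicPool Nothing nofile =
  clamp publicFloor publicCap (nofile `div` publicFdShare)

-- Fallback for an infinite or unknown limit: cap x share, so the
-- private pool computes to exactly its cap.
fallbackNofile :: Int
fallbackNofile = privateCap * privateFdShare
```

At a soft limit of 1024 the defaults come to 256 private and 128 public connections per host. At the fallback limit (4096 x 4 = 16384 under these assumed constants) the private pool lands at 4096 and the public pool clamps to 1024.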

mirrorEnqueueBufferDepth :: Int

The depth of the producer-side hand-off buffer the composition root wraps in front of the mirror queue (newEnqueueBuffer). Sized to absorb a cold npm ci's enqueue burst (a lockfile fan-out enqueues one job per public-served tarball) while bounding memory; a job dropped at the cap is re-enqueued on the next demand for its artifact, so overflow costs a deferred mirror, never correctness.

mirrorEnqueueReportInterval :: Int

How many enqueue-buffer drops or delivery failures pass between warning-log reports at the composition root (the first is always reported, then every multiple of this). The buffer's callbacks fire per event so the failure counter stays exact, while a sustained flood logs one line per this many events rather than one per job.
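The reporting rule ("the first is always reported, then every multiple of this") can be sketched as a predicate over the running event count. The name shouldReport is hypothetical, and the count is assumed to be 1-based.

```haskell
-- Decide whether the n-th event (1-based running count) is logged:
-- the first always is, then every multiple of the interval.
-- A non-positive interval reports only the first event.
shouldReport :: Int -> Int -> Bool
shouldReport interval n =
  n == 1 || (interval > 0 && n `mod` interval == 0)
```

With an interval of 100, events 1, 100, 200, and 300 are logged out of the first 300, while the counter itself still advances on every event.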

Internals exported for testing