| Safe Haskell | None |
|---|---|
| Language | GHC2021 |
Ecluse.Runtime
Description
Resolving and applying the process's runtime posture -- how many capabilities Écluse claims and what heap ceiling it runs under -- from first-class configuration with a cgroup-derived fallback, logged at boot with each decision's provenance.
The GHC RTS sizes itself from what the machine looks like: bare -N claims a
capability per visible processor, and the heap is unbounded unless -M says
otherwise. In a container neither default matches the pod: a CPU limit is a
cgroup quota that does not shrink the visible processor count, so the RTS claims a
whole node's worth of capabilities under a two-CPU quota, and the only memory
backstop is the kernel OOM killer. This module closes that gap the way Go's
automaxprocs does, but config-first:
- Explicit configuration wins: cores (ECLUSE_CORES) and maxHeapBytes (ECLUSE_MAX_HEAP_BYTES).
- Omitted values fall back to the cgroup (v2): cpu.max's quota, floored (at least one) and clamped to the visible processors, and memory.max less the nursery budget and slack (deriveMaxHeapBytes). Flooring follows Go's automaxprocs: a capability count above the budget lets a stop-the-world collection outrun the CFS quota and freeze mid-pause, so a fractional entitlement is stranded rather than borrowed against.
- No limit found either way: the posture the RTS already resolved (its baked defaults plus any GHCRTS the operator set) stands, and the log says so.
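The cgroup fallback for capabilities described in the list above can be sketched as a small pure function. This is an illustration only: capabilitiesFromQuota is a hypothetical name, not part of this module's API, and it assumes the quota has already been reduced to a number of cores.

```haskell
-- Hypothetical helper (not part of Ecluse.Runtime): derive a capability
-- count from a cgroup CPU quota expressed in cores.
capabilitiesFromQuota
  :: Int     -- visible processors, as the RTS would count them
  -> Double  -- cgroup quota in cores (cpu.max quota over period)
  -> Int
capabilitiesFromQuota visible quota =
  -- floor the fractional entitlement, clamp to what the machine shows,
  -- and never go below one capability
  max 1 (min visible (floor quota))
```

Under this sketch a 2.9-core quota on a 64-processor node yields two capabilities, and a 0.5-core quota yields one.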
Every decision is logged through the standard boot log with its provenance
(renderRuntimePosture), so an operator reads what was decided or interpreted
straight from the start-up lines.
This resolution is role-agnostic on purpose, and only the resolution: cores and
the heap ceiling derive from the container's limits, which bind every role (proxy,
Pilot, Dredger) alike. Workload-shaped tuning -- the allocation area, sized for the
proxy's serve path -- is deliberately not modelled per role here; a role whose
profile diverges is tuned per-deployment via GHCRTS until its shape earns a
default of its own.
Applying the plan: setNumCapabilities, or one exec-in-place
A capability change is applied in-process (setNumCapabilities). The heap
ceiling has no in-process setter -- -M is fixed when the RTS starts -- so when the
plan requires one, the boot re-executes its own binary once with the resolved
flags appended to GHCRTS (later flags win, verified against GHC 9.10). The exec
replaces the program image in the same process: the PID never exits, so a container
supervisor sees an uninterrupted process, exactly as an exec-ing entrypoint script
behaves. A marker variable (reexecMarker) guards against loops: the re-launched
process sees it, skips any further exec, and only logs (a warning, if the RTS still
diverges from the plan -- an operator's GHCRTS fighting the config, or a flag the
RTS rejected). A failure of the exec call itself is likewise degraded to a warning
and an unenforced posture: tuning never loops the boot and never takes the service
down.
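The environment handling behind the one-shot exec can be sketched as pure code. Everything named here is an assumption for illustration: markerVar stands in for reexecMarker (whose real value is not shown on this page), and appendRtsFlags and reexecEnv are hypothetical helpers. The exec itself is left out; on POSIX it could be performed with something like executeFile from the unix package.

```haskell
-- Hypothetical stand-in for the module's reexecMarker variable name.
markerVar :: String
markerVar = "EXAMPLE_REEXEC_MARKER"

-- Append the resolved flags after whatever GHCRTS already holds, so that
-- the appended flags are the later ones. Splitting on whitespace is a
-- simplification: quoted arguments inside GHCRTS are not preserved.
appendRtsFlags :: [String] -> Maybe String -> String
appendRtsFlags flags existing = unwords (maybe [] words existing ++ flags)

-- The environment for the re-executed image, or Nothing when no exec
-- should happen: either no flags are needed or the marker is already set.
reexecEnv :: [String] -> [(String, String)] -> Maybe [(String, String)]
reexecEnv flags env
  | null flags = Nothing
  | any ((== markerVar) . fst) env = Nothing
  | otherwise =
      Just
        ( (markerVar, "1")
            : ("GHCRTS", appendRtsFlags flags (lookup "GHCRTS" env))
            : filter ((`notElem` [markerVar, "GHCRTS"]) . fst) env
        )
```

Returning Nothing for an already-marked environment is what keeps the sketch loop-free: a second pass can only log.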
The pure resolution (resolveRuntimePlan), the cgroup parsing (parseCpuMax,
parseMemoryMax), and the rendering are separated from the thin IO shell
(applyRuntimePosture) so the precedence and arithmetic are unit-tested without a
cgroup in sight. Sizes are bytes everywhere here; the RTS flag fields count 4 KiB
blocks and are converted at the read boundary (rtsBlockBytes).
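As an illustration of converting at the read boundary, the sketch below reads the live posture from base's GHC.RTS.Flags and converts block counts to bytes. The Posture record, readPosture, and blocksToBytes are hypothetical names and are not the module's RtsPosture, currentRtsPosture, or rtsBlockBytes; the 4 KiB block size is taken from the text above.

```haskell
import Control.Concurrent (getNumCapabilities)
import GHC.RTS.Flags (GCFlags (..), getGCFlags)

-- Block size assumed from the module description: 4 KiB.
blockBytes :: Int
blockBytes = 4096

-- Convert an RTS block count to bytes.
blocksToBytes :: Integral a => a -> Int
blocksToBytes n = fromIntegral n * blockBytes

-- Hypothetical posture record, all sizes in bytes.
data Posture = Posture
  { postureCapabilities :: Int
  , postureMaxHeapBytes :: Maybe Int  -- Nothing when no -M is in force
  , postureAllocAreaBytes :: Int
  }
  deriving (Show, Eq)

-- Read the capability count and GC flags once, converting on the way in.
readPosture :: IO Posture
readPosture = do
  caps <- getNumCapabilities
  gc <- getGCFlags
  pure
    Posture
      { postureCapabilities = caps
      , postureMaxHeapBytes = blocksToBytes <$> maxHeapSize gc
      , postureAllocAreaBytes = blocksToBytes (minAllocAreaSize gc)
      }
```

Keeping the conversion in one function means nothing downstream has to know that the RTS counts in blocks.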
Synopsis
- applyRuntimePosture :: (Text -> IO ()) -> (Text -> IO ()) -> Maybe Int -> Maybe Int -> IO ()
- data RtsPosture = RtsPosture {}
- data CgroupLimits = CgroupLimits {}
- data Provenance
- data RuntimePlan = RuntimePlan {
    - planCapabilities :: (Int, Provenance)
    - planMaxHeapBytes :: (Maybe Int, Provenance)
  }
- resolveRuntimePlan :: Maybe Int -> Maybe Int -> CgroupLimits -> RtsPosture -> RuntimePlan
- deriveMaxHeapBytes :: Int -> Int -> Int -> Int
- requiredRtsFlags :: RtsPosture -> RuntimePlan -> [Text]
- renderRuntimePosture :: RuntimePlan -> RtsPosture -> [Text]
- parseCpuMax :: Text -> Maybe Double
- parseMemoryMax :: Text -> Maybe Int
- parseCgroupSelfPath :: Text -> Maybe Text
- ancestorPaths :: Text -> [Text]
Applying the resolved posture at boot
applyRuntimePosture :: (Text -> IO ()) -> (Text -> IO ()) -> Maybe Int -> Maybe Int -> IO ()
Resolve the runtime plan and apply it, first thing at boot.
Reads the live posture and the cgroup, resolves the plan against the given config values, and then:
- plan already in force: log the posture lines and return;
- only the capability count differs: apply it in-process (setNumCapabilities), log, and return;
- a heap ceiling must be enforced: append the required flags to GHCRTS and exec this binary in place (same PID, same arguments), once, guarded by reexecMarker. The re-launched process resolves the same plan, finds it in force, and logs the posture lines as normal.
When the marker is already set and the posture still diverges (an operator's
GHCRTS contradicting the config, or a flag the RTS rejected), the divergence is
logged as a warning and the process continues with what the RTS gave it -- boot
never loops and never aborts over tuning.
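The branching described above can be read as a small decision function. The sketch below is one possible reading, not the module's implementation: BootAction and decide are hypothetical, the inputs are reduced to plain values, and the case where the marker is set and only the capability count differs is simplified to an in-process change.

```haskell
-- Hypothetical summary of what the boot does next.
data BootAction
  = LogOnly              -- plan already in force
  | SetCapabilities Int  -- in-process change only
  | ReExec               -- a heap ceiling must be enforced via exec
  | WarnDiverged         -- already re-executed once, still not in force
  deriving (Show, Eq)

decide
  :: Bool  -- is the re-exec marker already set?
  -> Int   -- live capability count
  -> Int   -- planned capability count
  -> Bool  -- does the plan need a heap ceiling that is not yet in force?
  -> BootAction
decide markerSet liveCaps plannedCaps heapPending
  | heapPending && markerSet = WarnDiverged
  | heapPending = ReExec
  | liveCaps /= plannedCaps = SetCapabilities plannedCaps
  | otherwise = LogOnly
```

Because the marker is checked before any exec is chosen, the sketch can request at most one re-execution per boot.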
The pure resolution core
data RtsPosture
The RTS posture the process is actually running with, in bytes. Read once at
boot (currentRtsPosture); the plan is resolved against it and the log renders it.
Constructors
| RtsPosture | |
Fields
| |
Instances
| Show RtsPosture | Defined in Ecluse.Runtime. Methods: showsPrec :: Int -> RtsPosture -> ShowS; show :: RtsPosture -> String; showList :: [RtsPosture] -> ShowS |
| Eq RtsPosture | Defined in Ecluse.Runtime |
data CgroupLimits
What the cgroup (v2) grants this process: the CPU quota in cores
(cpu.max, quota over period) and the memory ceiling in bytes (memory.max).
Nothing per axis when the file is absent (not a cgroup-v2 environment) or the
value is the unlimited max sentinel.
Constructors
| CgroupLimits | |
Fields | |
Instances
| Show CgroupLimits | Defined in Ecluse.Runtime. Methods: showsPrec :: Int -> CgroupLimits -> ShowS; show :: CgroupLimits -> String; showList :: [CgroupLimits] -> ShowS |
| Eq CgroupLimits | Defined in Ecluse.Runtime |
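How the two CgroupLimits axes might be read from their cgroup-v2 files can be sketched as follows. This is a minimal sketch under stated assumptions: it works on String rather than Text to stay within base, and parseCpuMaxS and parseMemoryMaxS are hypothetical stand-ins for parseCpuMax and parseMemoryMax, whose exact behavior is not documented on this page.

```haskell
import Text.Read (readMaybe)

-- cpu.max holds "<quota> <period>", with "max" as the unlimited quota.
-- Returns the quota in cores, or Nothing when unlimited or malformed.
parseCpuMaxS :: String -> Maybe Double
parseCpuMaxS body = case words body of
  [quota, period] -> do
    q <- readMaybe quota  -- "max" fails to parse as a number
    p <- readMaybe period
    if q > 0 && p > 0 then Just (q / p) else Nothing
  _ -> Nothing

-- memory.max holds a byte count, or "max" when unlimited.
parseMemoryMaxS :: String -> Maybe Int
parseMemoryMaxS body = case words body of
  [w] -> do
    n <- readMaybe w
    if n > 0 then Just n else Nothing
  _ -> Nothing
```

For example, a cpu.max body of "200000 100000" would parse to a two-core quota, while "max 100000" yields Nothing.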
data Provenance
Where a resolved value came from, for the boot log's provenance clause.
Constructors
| FromConfig | Explicit Écluse configuration ( |
| FromCgroup | Derived from the cgroup limits. |
| FromRts | Left as the RTS resolved it (baked defaults plus any operator |
Instances
| Show Provenance | Defined in Ecluse.Runtime. Methods: showsPrec :: Int -> Provenance -> ShowS; show :: Provenance -> String; showList :: [Provenance] -> ShowS |
| Eq Provenance | Defined in Ecluse.Runtime |
data RuntimePlan
The resolved runtime posture: the capability count to run with and the heap
ceiling to enforce, each with its provenance. A FromRts entry means "leave it
alone": the plan never overrides the live posture unless it has better information.
Constructors
| RuntimePlan | |
Fields
| |
Instances
| Show RuntimePlan | Defined in Ecluse.Runtime. Methods: showsPrec :: Int -> RuntimePlan -> ShowS; show :: RuntimePlan -> String; showList :: [RuntimePlan] -> ShowS |
| Eq RuntimePlan | Defined in Ecluse.Runtime |
resolveRuntimePlan :: Maybe Int -> Maybe Int -> CgroupLimits -> RtsPosture -> RuntimePlan
Resolve the runtime plan from the three layers, strongest first: explicit config, then the cgroup, then the live RTS posture.
Capabilities: an explicit cores wins; else the cgroup CPU quota floored, never
below one (a 0.5-CPU pod still gets one capability), and clamped to the visible
processors; else the RTS's own count stands. Always at least 1.
Heap ceiling: an explicit maxHeapBytes wins; else deriveMaxHeapBytes over the
cgroup memory limit and the planned capability count (the nursery the process
will actually run with); else the RTS posture stands -- notably, an operator's
GHCRTS -M is never overridden by mere derivation, and an absent limit is left
absent rather than fabricated.
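The three-layer precedence can be sketched generically, with the winning layer recorded alongside the value. Source and resolveLayer are hypothetical names used for illustration; the module's own types are Provenance and RuntimePlan, and the cgroup-derived value is assumed to have been computed already.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for the module's Provenance.
data Source = Config | Cgroup | Rts
  deriving (Show, Eq)

-- Strongest first: explicit config, then the cgroup-derived value, then
-- whatever the RTS is already running with.
resolveLayer
  :: Maybe a  -- explicit configuration, if given
  -> Maybe a  -- value derived from the cgroup, if a limit was found
  -> a        -- live RTS value
  -> (a, Source)
resolveLayer fromConfig fromCgroup live =
  fromMaybe (live, Rts) (tag Config fromConfig <|> tag Cgroup fromCgroup)
  where
    tag src = fmap (\v -> (v, src))
```

For the heap axis the value type would itself be a Maybe Int, so an absent live ceiling stays absent when neither stronger layer supplies one.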
deriveMaxHeapBytes :: Int -> Int -> Int -> Int
The heap ceiling derived from a cgroup memory limit: the limit less the nursery budget (capabilities x allocation area -- memory the process spends over and above the heap) less 10% slack for stacks, buffers, and the RTS itself, floored at half the limit so a nursery mis-sized for a tiny pod still yields a sane ceiling rather than a vanishing (or negative) one.
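The arithmetic can be sketched as below. This is an illustration under assumptions the page does not confirm: the argument order (limit, capabilities, allocation area), the slack being 10% of the limit, and integer division for rounding. deriveHeapCeiling is a hypothetical name.

```haskell
-- Hypothetical re-statement of the derivation, all sizes in bytes.
deriveHeapCeiling
  :: Int  -- cgroup memory limit
  -> Int  -- planned capability count
  -> Int  -- allocation area per capability
  -> Int
deriveHeapCeiling limit caps allocArea =
  -- never go below half the limit, even if the nursery budget is oversized
  max (limit `div` 2) (limit - nursery - slack)
  where
    nursery = caps * allocArea  -- spent over and above the heap
    slack = limit `div` 10      -- assumed: 10% of the limit
```

Under these assumptions, a 1 GiB limit with two capabilities and a 4 MiB allocation area gives 1073741824 - 8388608 - 107374182 = 957979034 bytes, while a 64 MiB limit with sixteen capabilities falls back to the half-limit floor of 33554432 bytes.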
requiredRtsFlags :: RtsPosture -> RuntimePlan -> [Text]
The RTS flags the plan requires beyond the live posture, in GHCRTS syntax:
a -N when the capability count must change, a -M when a ceiling must be
enforced that is not already in force. Empty when the process is already running
the plan. A FromRts entry never contributes a flag (it is the live posture).
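The flag computation can be sketched with the plan and posture reduced to plain values. flagsFor is a hypothetical helper, it uses String rather than Text, and it models a FromRts entry as Nothing; the real function works on RtsPosture and RuntimePlan and may compare sizes differently.

```haskell
-- Hypothetical simplification of requiredRtsFlags.
flagsFor
  :: Int        -- live capability count
  -> Maybe Int  -- live heap ceiling in bytes, if any
  -> Maybe Int  -- planned capability count (Nothing = leave the RTS value)
  -> Maybe Int  -- planned heap ceiling in bytes (Nothing = leave as is)
  -> [String]
flagsFor liveCaps liveHeap plannedCaps plannedHeap =
  -- a -N only when the count must change
  [ "-N" ++ show n | Just n <- [plannedCaps], n /= liveCaps ]
    -- a -M only when the ceiling is not already in force (size in bytes)
    ++ [ "-M" ++ show b | Just b <- [plannedHeap], Just b /= liveHeap ]
```

An empty result corresponds to the "already running the plan" case, which is what lets a re-launched process skip any further exec.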
renderRuntimePosture :: RuntimePlan -> RtsPosture -> [Text]
The boot log's posture lines, one decision per line with its provenance, plus the allocation-area line (always RTS-sourced; it is deliberately not config-surfaced). Rendered from the plan, so the lines describe what the process runs with after the plan is applied.
Cgroup v2 parsing
parseCgroupSelfPath :: Text -> Maybe Text
The process's cgroup-v2 path from a /proc/self/cgroup body: the 0::
line's path ("0::/a/b" yields "/a/b"). Nothing when no v2 entry is
present (a pure cgroup-v1 host).
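A minimal sketch of this lookup, written over String rather than Text to stay within base; selfPath is a hypothetical name, and taking the first matching line is an assumption.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (listToMaybe, mapMaybe)

-- Find the first "0::" line in a /proc/self/cgroup body and return the
-- path that follows the prefix; Nothing when no such line exists.
selfPath :: String -> Maybe String
selfPath = listToMaybe . mapMaybe (stripPrefix "0::") . lines
```

On a hybrid host the v1 controller lines (which start with a non-zero hierarchy ID) are simply skipped.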
ancestorPaths :: Text -> [Text]
A cgroup path and its ancestors, leaf first, ending at the root (the empty
suffix): "/a/b" yields ["/a/b", "/a", ""]; the root path "/"
yields just [""].
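The ancestor walk can be sketched as follows, again over String rather than Text; ancestors is a hypothetical name and the treatment of trailing slashes is an assumption.

```haskell
-- A cgroup path and its ancestors, leaf first, ending at the root, which
-- is represented as the empty suffix.
ancestors :: String -> [String]
ancestors = go . trimSlashes
  where
    go "" = [""]
    go p = p : go (parent p)
    -- drop the last path segment together with the slash before it
    parent = reverse . drop 1 . dropWhile (/= '/') . reverse
    -- "/" and "/a/" are treated like "" and "/a"
    trimSlashes = reverse . dropWhile (== '/') . reverse
```

Each suffix in the result could then be appended to the cgroup mount point to locate a directory at that level.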