| Safe Haskell | None |
|---|---|
| Language | GHC2021 |
Ecluse.Core.Security.Host
Description
Outbound-request guards for the proxy's data plane: defending where the proxy fetches.
Écluse builds outbound HTTP requests from two untrusted sources -- client-supplied
package identifiers (the request path) and upstream-supplied artifact
locations (a packument's dist.tarball). This module provides the pure guard layer
that keeps the proxy from being steered by hostile input.
Where the proxy fetches: isAllowedUpstreamHost restricts outbound fetches
to the configured upstream hosts, and isBlockedTarget rejects internal address
ranges (cloud instance metadata, loopback, RFC1918) that the proxy's network
position can otherwise reach. Together they are the SSRF gate: a target must be
both on the allowlist and not an internal address.
Synopsis
- data LoweredHostSet
- lowerCaseHosts :: Set Text -> LoweredHostSet
- isAllowedUpstreamHost :: LoweredHostSet -> Text -> Bool
- isBlockedTarget :: [IPRange] -> Text -> Bool
- isBlockedIP :: [IPRange] -> IP -> Bool
- parseIpLiteral :: Text -> Maybe IpAddr
- parseBlockedRange :: Text -> Maybe IPRange
- hostAddress :: Text -> Text
- splitHostPort :: Text -> Maybe (Text, Text)
- data TarballHostPolicy
- data Origin
- tarballHostAllowed :: Origin -> TarballHostPolicy -> LoweredHostSet -> [IPRange] -> Text -> Text -> Bool
- data TarballHostGate = TarballHostGate {}
- tarballHostGate :: Text -> Text -> Text -> TarballHostGate
- isHex :: Text -> Bool
- isDecimal :: Text -> Bool
Outbound host allowlist
data LoweredHostSet Source #
A set of host strings normalised to lower case, the form the host guards
(isAllowedUpstreamHost and isBlockedTarget) compare against.
The type is opaque, and lowerCaseHosts is its only constructor: a value of
this type therefore carries the proof that every host in it is already
lower-cased, so the guards lower only the incoming host and the case-insensitive
match cannot be bypassed by an un-normalised configuration set.
Instances
| Show LoweredHostSet Source # | |
Defined in Ecluse.Core.Security.Host Methods showsPrec :: Int -> LoweredHostSet -> ShowS # show :: LoweredHostSet -> String # showList :: [LoweredHostSet] -> ShowS # | |
| Eq LoweredHostSet Source # | |
Defined in Ecluse.Core.Security.Host Methods (==) :: LoweredHostSet -> LoweredHostSet -> Bool # (/=) :: LoweredHostSet -> LoweredHostSet -> Bool # | |
lowerCaseHosts :: Set Text -> LoweredHostSet Source #
Normalise a set of configured host strings to the canonical key form the host
guards take, yielding a LoweredHostSet.
A plain DNS name is folded to lower case (hostnames are case-insensitive), so the
guards match an incoming host against the configuration regardless of how either
was spelled. An entry that parses as an IP literal is additionally rendered to its
single canonical literal (see canonicalHostKey), so equivalent spellings of one
address (compressed versus expanded IPv6, differing case) collapse to one key. An
operator who opts in 0:0:0:0:0:0:0:1 therefore matches a literal ::1 rather than
missing it on a textual difference.
isAllowedUpstreamHost :: LoweredHostSet -> Text -> Bool Source #
Whether host is one of the configured upstream hosts.
The first guard on every outbound fetch: the proxy talks to its configured
private/public upstreams and mirror target, and nothing else -- so a target
host derived from a packument's dist.tarball (or anywhere else) is fetched only
if it appears in allowed. The match is exact on the bare host (no port, no
scheme -- extract it with hostAddress first) and case-insensitive, since
DNS hostnames are; an empty host is never allowed. This is the allowlist half
of the SSRF gate; pair it with isBlockedTarget for the internal-range half.
The allowlist is a LoweredHostSet, so it is already normalised and only the
incoming host is folded here -- through the same canonicalHostKey the set was
built with, so an IP-literal entry matches regardless of how either side spells the
address.
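The opaque-set idea above can be sketched as follows. This is a minimal illustration, not the module's implementation: HostSet, lowerHosts, and isAllowed are hypothetical stand-ins for LoweredHostSet, lowerCaseHosts, and isAllowedUpstreamHost, and the IP-literal canonicalisation done by canonicalHostKey is omitted.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical stand-in for LoweredHostSet. In a real module the constructor
-- would stay unexported, so lowerHosts is the only way to build a value.
newtype HostSet = HostSet (Set Text)
  deriving (Eq, Show)

-- Fold every configured host to lower case once, at construction time.
lowerHosts :: Set Text -> HostSet
lowerHosts = HostSet . Set.map T.toLower

-- Only the incoming host is folded here; the set is already normalised.
-- An empty host is never allowed.
isAllowed :: HostSet -> Text -> Bool
isAllowed (HostSet allowed) host =
  not (T.null host) && Set.member (T.toLower host) allowed
```

Because the set can only be built through the lowering function, a caller cannot hand the guard an un-normalised configuration by accident.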
Internal-range block
isBlockedTarget :: [IPRange] -> Text -> Bool Source #
Whether host is an internal address the proxy must not fetch.
A proxy sits in a privileged network position, so an attacker who can steer a
fetch (see the module header) aims it at addresses only the proxy can reach: the
cloud instance-metadata endpoint (169.254.169.254), loopback, or the private
network (RFC1918). This blocks, by parsing host as a literal IP and testing it
against:
- link-local 169.254.0.0/16 (which contains the 169.254.169.254 metadata address) and IPv6 fe80::/10;
- loopback 127.0.0.0/8 and IPv6 ::1;
- unspecified / this-host 0.0.0.0/8 and IPv6 :: -- 0.0.0.0 is not a no-op target: on Linux a connect to it reaches a loopback-bound service, so it is a loopback-equivalent that must be blocked alongside 127.0.0.0/8;
- RFC1918 private 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16;
- CGNAT shared 100.64.0.0/10 (RFC 6598) -- carrier-grade NAT space some cloud fabrics route internally;
- IPv6 unique-local fc00::/7 (RFC 4193) -- the private-network IPv6 analogue, which contains the AWS IMDSv6 metadata endpoint fd00:ec2::254;
- every range in additionalRanges, the operator-configured extension of this fixed set (ECLUSE_ADDITIONAL_BLOCKED_RANGES) -- a deployment's own internal space this module cannot know about in advance.
A host that is not an IP literal (a DNS name) is not blocked here:
name-based targets are constrained by the isAllowedUpstreamHost allowlist
instead, and post-resolution IP filtering belongs to the resolving fetch layer,
not this pure check. Both guards apply -- an allowlisted host that resolves to an
internal literal is still caught when its address is tested here.
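The IPv4 half of the range test can be sketched with plain bit masks. The module delegates range membership to iproute, so the Cidr type and the helpers below are hypothetical and only illustrate the prefix comparison; the IPv6 ranges are left out.

```haskell
import Data.Bits (complement, shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word32, Word8)

-- Pack four octets into one 32-bit address.
ipv4 :: Word8 -> Word8 -> Word8 -> Word8 -> Word32
ipv4 a b c d =
  (fromIntegral a `shiftL` 24)
    .|. (fromIntegral b `shiftL` 16)
    .|. (fromIntegral c `shiftL` 8)
    .|. fromIntegral d

-- Hypothetical CIDR block: network address and prefix length (0..32).
data Cidr = Cidr Word32 Int

-- Mask with the top `len` bits set.
prefixMask :: Int -> Word32
prefixMask len
  | len <= 0 = 0
  | len >= 32 = maxBound
  | otherwise = complement (maxBound `shiftR` len)

-- An address is in a block when both agree on the prefix bits.
inCidr :: Cidr -> Word32 -> Bool
inCidr (Cidr net len) addr = addr .&. m == net .&. m
  where
    m = prefixMask len

-- The fixed IPv4 ranges listed above.
blockedV4 :: [Cidr]
blockedV4 =
  [ Cidr (ipv4 169 254 0 0) 16 -- link-local (instance metadata)
  , Cidr (ipv4 127 0 0 0) 8 -- loopback
  , Cidr (ipv4 0 0 0 0) 8 -- unspecified / this-host
  , Cidr (ipv4 10 0 0 0) 8 -- RFC1918
  , Cidr (ipv4 172 16 0 0) 12 -- RFC1918
  , Cidr (ipv4 192 168 0 0) 16 -- RFC1918
  , Cidr (ipv4 100 64 0 0) 10 -- CGNAT shared (RFC 6598)
  ]

-- Fixed set plus the operator-configured extension.
isBlockedV4 :: [Cidr] -> Word32 -> Bool
isBlockedV4 extra addr = any (`inCidr` addr) (blockedV4 ++ extra)
```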
isBlockedIP :: [IPRange] -> IP -> Bool Source #
Whether an IP falls in a blocked internal range: the fixed blockedRanges
set together with the caller-supplied additionalRanges.
The single source of record for the internal-range decision, used by the literal
block (isBlockedTarget) on the dist.tarball host gate. An
IPv4-mapped IPv6 address (::ffff:a.b.c.d) is first decoded to its embedded IPv4
and tested against the IPv4 ranges: a mapped internal literal (e.g.
::ffff:169.254.169.254) is a recognised SSRF smuggling form, so it must be
caught by the IPv4 block rather than slip through as an unrelated IPv6 address.
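The mapped-address decoding can be sketched as below, assuming an IPv6 address is held as eight 16-bit groups. The function name mappedV4 and that representation are illustrative assumptions, not the module's own.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word16, Word32)

-- If the address is IPv4-mapped (::ffff:a.b.c.d), return the embedded IPv4
-- as a 32-bit value; any other address yields Nothing.
mappedV4 :: [Word16] -> Maybe Word32
mappedV4 [0, 0, 0, 0, 0, 0xffff, hi, lo] =
  Just ((fromIntegral hi `shiftL` 16) .|. fromIntegral lo)
mappedV4 _ = Nothing
```

A caller would test the decoded value against the IPv4 ranges first and fall back to the IPv6 ranges only when the result is Nothing.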
parseIpLiteral :: Text -> Maybe IpAddr Source #
Parse a host as an IP literal, or Nothing for a DNS name. Handles dotted-quad
IPv4 and the IPv6 forms a host realistically carries -- full eight-group form,
::-compressed forms (including ::1), and a trailing embedded IPv4 (the
a.b.c.d in ::ffff:a.b.c.d) -- which is enough to recognise the loopback,
link-local, and IPv4-mapped addresses isBlockedIP blocks. It is deliberately
not a complete IPv6 parser (no zone ids); an unrecognised literal is treated
as a name, which the host allowlist still constrains.
Only range membership is delegated to iproute (isBlockedIP); recognising
the literal stays hand-rolled on purpose. This recogniser is deliberately
lenient on the IPv4 dotted-quad: it accepts the ambiguous octet spellings a
strict IP library rejects and coerces each octet exactly as inet_aton -- and
hence a libc resolver -- does, so the block tests the address that would actually be
dialled. A 0x/0X-prefixed octet is hexadecimal, a leading-zero octet is
octal, and anything else is decimal. A leading-zero octet is therefore not
its decimal digits: 0012.0.0.1 is octal 10.0.0.1 (RFC1918, blocked), whereas
010.0.0.1 is octal 8.0.0.1 and 0127.0.0.1 is octal 87.0.0.1 (both public,
not blocked) -- matching the resolver rather than a decimal misreading. A stricter
parser that rejected these spellings would let an octal/hex spelling of an
internal address skip the block and reach the resolving fetch as a name, silently
narrowing the SSRF gate.
Two boundaries are deliberately not modelled here; such a host is simply treated as a
name, which the host allowlist constrains. First, the short inet_aton forms with
fewer than four parts (a bare 32-bit number 2130706433 / 0x7f000001, or a two-part 127.1)
are not literals here. Second, a malformed octet (an invalid-octal 08, where 8 is not
an octal digit, or an overflowing 0400/256/0x100) is not a literal, exactly as a
resolver rejects it. A malformed IPv6 group that overflows 16 bits (fe80::1ffff) is
likewise not a literal here. Delegating literal parsing to a library would change this
lenient/strict boundary, so it is kept here.
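The octet coercion described above can be sketched as follows. This is a simplified illustration working on String rather than Text; parseOctet, parseQuad, digits, and splitDots are hypothetical names, and rejecting a bare 0x with no digits is an assumption of this sketch.

```haskell
import Data.Char (digitToInt, isDigit, isHexDigit, isOctDigit)
import Data.List (foldl')

-- Read a run of digits in the given base; Nothing if empty, if any character
-- is not a digit of that base, or if the value does not fit one octet.
digits :: Integer -> (Char -> Bool) -> String -> Maybe Int
digits base ok ds
  | null ds || not (all ok ds) = Nothing
  | n > 255 = Nothing
  | otherwise = Just (fromInteger n)
  where
    n = foldl' (\acc c -> acc * base + toInteger (digitToInt c)) 0 ds

-- One octet: 0x/0X prefix is hexadecimal, a leading zero is octal,
-- anything else is decimal.
parseOctet :: String -> Maybe Int
parseOctet s = case s of
  ('0' : x : hex) | x `elem` "xX" -> digits 16 isHexDigit hex
  ('0' : oct@(_ : _)) -> digits 8 isOctDigit oct
  _ -> digits 10 isDigit s

-- Split on every '.', keeping empty parts so "1..2" is rejected later.
splitDots :: String -> [String]
splitDots s = case break (== '.') s of
  (part, []) -> [part]
  (part, _ : rest) -> part : splitDots rest

-- Exactly four parts are required; shorter inet_aton forms are not literals.
parseQuad :: String -> Maybe [Int]
parseQuad s = case splitDots s of
  parts@[_, _, _, _] -> traverse parseOctet parts
  _ -> Nothing
```

Under this sketch, parseQuad "0012.0.0.1" yields the octets of 10.0.0.1, while "08.0.0.1" and "127.1" yield Nothing and would be treated as names.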
parseBlockedRange :: Text -> Maybe IPRange Source #
Parse one operator-configured ECLUSE_ADDITIONAL_BLOCKED_RANGES entry (a
single CIDR, e.g. "203.0.113.0/24" or "2001:db8::/32") into an IPRange, or
Nothing for anything malformed.
A total wrapper over iproute's own Read instance for IPRange: that
instance's underlying parser (parseIPRange) already fails by returning no
parse rather than calling error, so readMaybe over it is safe -- unlike the
partial IsString instance (which blockedRanges relies on for its own compile-time
literals, where a malformed literal would be a build-time error, never runtime
input). This is the only way the config decoder is meant to turn operator text
into an IPRange: a malformed entry must fail closed at boot, never be silently
dropped or accepted as an unblocked range.
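The fail-closed shape can be sketched without iproute. The Range type and the simplified IPv4-only parser below are hypothetical; the point is the combination of readMaybe (total, never calls error) with traverse, so that one malformed entry rejects the whole list instead of being dropped.

```haskell
import Text.Read (readMaybe)

-- Hypothetical simplified range: four octets and a prefix length.
data Range = Range [Int] Int
  deriving (Eq, Show)

-- Split on every '.', keeping empty parts so they fail to parse.
splitDots :: String -> [String]
splitDots s = case break (== '.') s of
  (part, []) -> [part]
  (part, _ : rest) -> part : splitDots rest

-- Total parser: every malformed input ends in Nothing.
parseRange :: String -> Maybe Range
parseRange s = do
  let (addr, rest) = break (== '/') s
  len <- case rest of
    '/' : l -> readMaybe l
    _ -> Nothing
  octets <- traverse readMaybe (splitDots addr)
  if length octets == 4 && all inByte octets && len >= 0 && len <= 32
    then Just (Range octets len)
    else Nothing
  where
    inByte o = o >= 0 && o <= 255

-- Fail closed: the first malformed entry aborts the whole configuration.
parseRanges :: [String] -> Either String [Range]
parseRanges = traverse one
  where
    one s = maybe (Left ("malformed blocked range: " ++ s)) Right (parseRange s)
```

Using mapMaybe in place of traverse would silently drop the bad entry, which is the behaviour the text above rules out.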
hostAddress :: Text -> Text Source #
Extract the bare host from a URI or host[:port] authority.
A convenience for the SSRF gate: an outbound target is usually a full URL or an
authority, but isAllowedUpstreamHost and isBlockedTarget compare the bare
host. This strips a scheme:// prefix, any userinfo@, any :port suffix,
and any /path/?query/#fragment tail, lower-casing the result. It is a
pragmatic extractor for comparison, not a full RFC 3986 parser; a value with
no recognisable host yields the empty string, which both guards treat as
not-allowed. IPv6 literals in brackets ([::1]:443) are returned without the
brackets -- the bracket-aware host[:port] split is splitHostPort, shared with
the SQS endpoint parser so the two cannot drift on an authority edge case; a
malformed authority (an opening bracket with no close) yields the empty string,
the same fail-safe the guards apply to it.
splitHostPort :: Text -> Maybe (Text, Text) Source #
Split a host[:port] authority into its bare host and the raw ":port"
remainder (empty when no port is present), bracket-aware so an IPv6 literal's
inner colons are never mistaken for the port separator.
The single canonical authority split feeding both the data-plane host extractor
(hostAddress) and the SQS endpoint parser (parseEndpointUrl),
so the two re-implementations that tripped on the [::1]:port edge cases cannot drift
again. A […] IPv6 literal is split on its closing bracket -- the host is returned
without the brackets and the remainder is whatever follows (a ":port" or empty) --
so an inner :: is never read as the port separator; a bare authority is split on
its first :. An opening bracket with no close is a malformed authority and
yields Nothing, which hostAddress folds to the empty (not-allowed) host and the
endpoint parser surfaces as a malformed-URL boot error.
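A bracket-aware split along these lines might look like the sketch below. It follows the behaviour described above but is not the module's code; the name splitHostPortSketch is hypothetical.

```haskell
import Data.Text (Text)
import qualified Data.Text as T

-- Split host[:port] into the bare host and the raw remainder.
splitHostPortSketch :: Text -> Maybe (Text, Text)
splitHostPortSketch authority =
  case T.uncons authority of
    -- Bracketed IPv6 literal: split on the closing bracket.
    Just ('[', rest) ->
      let (host, after) = T.break (== ']') rest
       in if T.null after
            then Nothing -- opening bracket with no close
            else Just (host, T.drop 1 after) -- drop the ']' itself
    -- Bare authority: split on the first ':' (empty remainder if none).
    _ -> Just (T.break (== ':') authority)
```

For "[::1]:443" this gives the host "::1" and the remainder ":443"; for "[::1" it gives Nothing.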
Tarball-host policy
data TarballHostPolicy Source #
Whether a tarball may be fetched from a host that differs from the upstream that served the packument.
An upstream's dist.tarball is server-chosen data (see
docs/architecture/security.md → "Why dist.tarball is honoured"), so a
compromised or hostile upstream can name any host as the artifact location.
This policy bounds the axis of that risk the host allowlist leaves open: where the
bytes are fetched. Even an allowlisted-but-different host is a wider fetch surface than
the packument's own source, and the safe reading of the allowlist is "same source unless
told otherwise".
Constructors
| SameHostAsPackument | The secure default: a tarball is fetched only from the same host
that served the packument; a |
| AnyAllowlistedHost | The opt-in: a tarball may be fetched from any allowlisted host (for a registry that legitimately serves artifacts from a separate CDN/files host). This widens the fetch surface to the whole allowlist; it never escapes it or the internal-range block. |
Instances
| Show TarballHostPolicy Source # | |
Defined in Ecluse.Core.Security.Host Methods showsPrec :: Int -> TarballHostPolicy -> ShowS # show :: TarballHostPolicy -> String # showList :: [TarballHostPolicy] -> ShowS # | |
| Eq TarballHostPolicy Source # | |
Defined in Ecluse.Core.Security.Host Methods (==) :: TarballHostPolicy -> TarballHostPolicy -> Bool # (/=) :: TarballHostPolicy -> TarballHostPolicy -> Bool # | |
data Origin Source #
The trust of the origin a dist.tarball is being served from: the
operator-configured private upstream is TrustedOrigin, and the public upstream,
together with every artifact location an attacker could influence, is UntrustedOrigin.
The distinction governs the literal internal-range block alone (the cheap pure
defence-in-depth on the host gate). The trusted private origin is deliberately exempt
from it: a private registry may legitimately live on an internal address, and only an
untrusted target can be steered there. It never relaxes the host allowlist or the
same-host clause, which gate both origins identically, so a trusted origin's
dist.tarball is still constrained to its own allowlisted host.
Constructors
| TrustedOrigin | The operator-configured private upstream: exempt from the literal internal-range block. |
| UntrustedOrigin | The public upstream, and any attacker-influenceable target: subject to the literal internal-range block. |
Instances
tarballHostAllowed Source #
Arguments
| :: Origin | |
| -> TarballHostPolicy | |
| -> LoweredHostSet | The host allowlist (the same one every outbound fetch is gated by). |
| -> [IPRange] | The operator-configured ranges extending the fixed internal-range block (untrusted origin). |
| -> Text | The bare host that served the packument. |
| -> Text | The bare host of the candidate |
| -> Bool |
Whether a dist.tarball host may be fetched, given the origin's trust, the
policy, the host that served the packument, and the configured guards.
This is the policy half of the dist.tarball defence; it never replaces the host
allowlist or the literal internal-range block but composes on top of them, so the
answer is the conjunction of three independent checks and over-blocking is the
fail-safe:
- the tarballHost must be on the host allowlist (allowed), as every outbound target is: a dist.tarball host off the allowlist is refused regardless of policy;
- it must not be an internal-address literal (the fixed range set plus the operator-configured additionalBlockedRanges), the cheap pure defence-in-depth, but a TrustedOrigin is exempt from this clause (see Origin); and
- under SameHostAsPackument (the secure default) it must additionally equal the packumentHost (the host that served the metadata), so a tarball on a different host is refused even when that host is allowlisted. Under AnyAllowlistedHost that last clause is relaxed, leaving only the allowlist and (origin-aware) internal-range checks.
The allowlist and same-host clauses gate both origins identically; only the
internal-range clause is origin-aware, so a TrustedOrigin is never let past its own
allowlisted host or onto a different host than its metadata under the default.
Hosts are compared by their canonical key (case-folded, and for an IP-literal the
single canonical literal; see canonicalHostKey), as the host guards are. An
empty tarballHost is never allowed (the allowlist already refuses it). The
packumentHost is the bare host the metadata was fetched from (extract it with
hostAddress); only its equality to tarballHost matters, so it need not itself
be re-validated here: it was already gated when the packument was fetched.
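The three-clause conjunction can be sketched as below. To stay self-contained, the allowlist and the internal-range check are passed in as plain predicates and hosts are plain strings assumed to be already canonicalised; those simplifications, and the names Policy and allowedSketch, are assumptions of this sketch rather than the module's signature.

```haskell
data Origin = TrustedOrigin | UntrustedOrigin
  deriving (Eq, Show)

-- Hypothetical stand-in for TarballHostPolicy.
data Policy = SameHostAsPackument | AnyAllowlistedHost
  deriving (Eq, Show)

allowedSketch
  :: Origin
  -> Policy
  -> (String -> Bool) -- allowlist membership (hypothetical predicate)
  -> (String -> Bool) -- internal-address literal check (hypothetical predicate)
  -> String -- host that served the packument
  -> String -- candidate tarball host
  -> Bool
allowedSketch origin policy onAllowlist isInternal packumentHost tarballHost =
  -- 1. always on the allowlist, whatever the policy or origin
  onAllowlist tarballHost
    -- 2. not an internal literal, unless the origin is trusted
    && (origin == TrustedOrigin || not (isInternal tarballHost))
    -- 3. same host as the packument, unless the policy opts out
    && (policy == AnyAllowlistedHost || tarballHost == packumentHost)
```

Each clause can only turn the answer to False, so adding a check never widens what is fetched.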
data TarballHostGate Source #
The mount-constant inputs to the per-request tarballHostAllowed gate, extracted
once from a mount's three configured upstream URLs so the serve path parses no URL
and builds no host set per request.
The serve-path tarball gate is on the hot artifact path (every private hit and every
public leg runs it), yet its allowlist and the private/public upstream hosts never
change after boot -- they are fixed by the mount's configuration. Recovering them from
the base URLs on each request rebuilt a LoweredHostSet and re-ran hostAddress several
times per artifact; precomputing them here into a TarballHostGate collapses that to a
few field reads. The only genuinely per-request host is the dynamic public
dist.tarball, still parsed at the call site.
Constructors
| TarballHostGate | |
Fields
| |
Instances
| Show TarballHostGate Source # | |
Defined in Ecluse.Core.Security.Host Methods showsPrec :: Int -> TarballHostGate -> ShowS # show :: TarballHostGate -> String # showList :: [TarballHostGate] -> ShowS # | |
| Eq TarballHostGate Source # | |
Defined in Ecluse.Core.Security.Host Methods (==) :: TarballHostGate -> TarballHostGate -> Bool # (/=) :: TarballHostGate -> TarballHostGate -> Bool # | |
tarballHostGate :: Text -> Text -> Text -> TarballHostGate Source #
Build the TarballHostGate from a mount's private, public, and mirror-target
upstream URLs: the allowlist is the lowered set of their bare hosts, and the private and
public hosts are each extracted once with hostAddress. Called once per mount at the
composition root (and by test fixtures); the result is carried on the serve
dependencies so the per-request gate reads fields rather than re-parsing URLs.