ecluse:ecluse-core
Safe Haskell: None
Language: GHC2021

Ecluse.Core.Security.Host

Description

Outbound-request guards for the proxy's data plane: defending where the proxy fetches.

Écluse builds outbound HTTP requests from two untrusted sources: client-supplied package identifiers (the request path) and upstream-supplied artifact locations (a packument's dist.tarball). This module provides the pure guard layer that keeps the proxy from being steered by hostile input.

Where the proxy fetches: isAllowedUpstreamHost restricts outbound fetches to the configured upstream hosts, and isBlockedTarget rejects internal address ranges (cloud instance metadata, loopback, RFC1918) that the proxy's network position can otherwise reach. Together they are the SSRF gate: a target must be both on the allowlist and not an internal address.

Outbound host allowlist

data LoweredHostSet Source #

A set of host strings normalised to lower case, the form the host guards (isAllowedUpstreamHost and isBlockedTarget) compare against.

The type is opaque, and lowerCaseHosts is its only constructor: a value of this type therefore carries the proof that every host in it is already lower-cased, so the guards lower only the incoming host and the case-insensitive match cannot be bypassed by an un-normalised configuration set.

lowerCaseHosts :: Set Text -> LoweredHostSet Source #

Normalise a set of configured host strings to the canonical key form the host guards take, yielding a LoweredHostSet.

A plain DNS name is folded to lower case (hostnames are case-insensitive), so the guards match an incoming host against the configuration regardless of how either was spelled. An entry that parses as an IP literal is additionally rendered to its single canonical literal (see canonicalHostKey), so equivalent spellings of one address (compressed versus expanded IPv6, differing case) collapse to one key. An operator who allowlists 0:0:0:0:0:0:0:1 therefore matches an incoming literal ::1 rather than missing it on a textual difference.
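The opaque-type argument above can be sketched as follows. This is a minimal illustration, not the module's source: it folds case only and leaves out the IP-literal canonicalisation, memberLowered is a hypothetical helper, and in a real module the export list would expose the type without its data constructor so that lowerCaseHosts stays the only way in.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set
import Data.Text (Text)
import qualified Data.Text as T

-- | Sketch of the opaque set. A real module would export the type name
-- only, never the LoweredHostSet data constructor.
newtype LoweredHostSet = LoweredHostSet (Set Text)

-- | The smart constructor: every member is lower-cased on the way in.
-- (The IP-literal canonicalisation described above is omitted here.)
lowerCaseHosts :: Set Text -> LoweredHostSet
lowerCaseHosts = LoweredHostSet . Set.map T.toLower

-- | Hypothetical membership check: only the incoming host is folded,
-- because the set is known to be lower-cased already.
memberLowered :: LoweredHostSet -> Text -> Bool
memberLowered (LoweredHostSet hosts) host =
  not (T.null host) && Set.member (T.toLower host) hosts
```

Because no caller can build the newtype directly, a mixed-case configuration entry cannot reach the comparison un-normalised.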

isAllowedUpstreamHost :: LoweredHostSet -> Text -> Bool Source #

Whether host is one of the configured upstream hosts.

The first guard on every outbound fetch: the proxy talks to its configured private/public upstreams and mirror target, and nothing else -- so a target host derived from a packument's dist.tarball (or anywhere else) is fetched only if it appears in allowed. The match is exact on the bare host (no port, no scheme -- extract it with hostAddress first) and case-insensitive, since DNS hostnames are; an empty host is never allowed. This is the allowlist half of the SSRF gate; pair it with isBlockedTarget for the internal-range half.

The allowlist is a LoweredHostSet, so it is already normalised and only the incoming host is folded here -- through the same canonicalHostKey the set was built with, so an IP-literal entry matches regardless of how either side spells the address.

Internal-range block

isBlockedTarget :: [IPRange] -> Text -> Bool Source #

Whether host is an internal address the proxy must not fetch.

A proxy sits in a privileged network position, so an attacker who can steer a fetch (see the module header) aims it at addresses only the proxy can reach: the cloud instance-metadata endpoint (169.254.169.254), loopback, or the private network (RFC1918). This blocks, by parsing host as a literal IP and testing it against:

  • link-local 169.254.0.0/16 (which contains the 169.254.169.254 metadata address) and IPv6 fe80::/10;
  • loopback 127.0.0.0/8 and IPv6 ::1;
  • unspecified / this-host 0.0.0.0/8 and IPv6 :: -- 0.0.0.0 is not a no-op target: on Linux a connect to it reaches a loopback-bound service, so it is a loopback-equivalent that must be blocked alongside 127.0.0.0/8;
  • RFC1918 private 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16;
  • CGNAT shared 100.64.0.0/10 (RFC 6598) -- carrier-grade NAT space some cloud fabrics route internally;
  • IPv6 unique-local fc00::/7 (RFC 4193) -- the private-network IPv6 analogue, which contains the AWS IMDSv6 metadata endpoint fd00:ec2::254;
  • every range in additionalRanges, the operator-configured extension of this fixed set (ECLUSE_ADDITIONAL_BLOCKED_RANGES) -- a deployment's own internal space this module cannot know about in advance.

A host that is not an IP literal (a DNS name) is not blocked here: name-based targets are constrained by the isAllowedUpstreamHost allowlist instead, and post-resolution IP filtering belongs to the resolving fetch layer, not this pure check. Both guards apply -- an allowlisted host that resolves to an internal literal is still caught when its address is tested here.
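The IPv4 half of the range list above can be sketched with plain bit masks. This is an illustration under stated assumptions: the module itself delegates range membership to iproute, whereas Cidr4, ipv4, inCidr4, fixedBlocked4 and isBlocked4 below are hypothetical names, and the IPv6 ranges are left out.

```haskell
import Data.Bits (complement, shiftL, (.&.), (.|.))
import Data.Word (Word32, Word8)

-- | Hypothetical representation: (network address, prefix length).
type Cidr4 = (Word32, Int)

-- | Pack four octets into one 32-bit address.
ipv4 :: Word8 -> Word8 -> Word8 -> Word8 -> Word32
ipv4 a b c d =
  (fromIntegral a `shiftL` 24) .|. (fromIntegral b `shiftL` 16)
    .|. (fromIntegral c `shiftL` 8) .|. fromIntegral d

-- | Membership: compare only the bits covered by the prefix.
inCidr4 :: Cidr4 -> Word32 -> Bool
inCidr4 (net, len) addr = (addr .&. mask) == (net .&. mask)
  where
    mask
      | len <= 0 = 0
      | otherwise = complement 0 `shiftL` (32 - len)

-- | The fixed IPv4 ranges listed above.
fixedBlocked4 :: [Cidr4]
fixedBlocked4 =
  [ (ipv4 169 254 0 0, 16) -- link-local, contains 169.254.169.254
  , (ipv4 127 0 0 0, 8)    -- loopback
  , (ipv4 0 0 0 0, 8)      -- unspecified / this-host
  , (ipv4 10 0 0 0, 8)     -- RFC1918
  , (ipv4 172 16 0 0, 12)  -- RFC1918
  , (ipv4 192 168 0 0, 16) -- RFC1918
  , (ipv4 100 64 0 0, 10)  -- CGNAT shared space (RFC 6598)
  ]

-- | Blocked if the address is in the fixed set or an operator extension.
isBlocked4 :: [Cidr4] -> Word32 -> Bool
isBlocked4 additional addr =
  any (`inCidr4` addr) (fixedBlocked4 ++ additional)
```

An operator extension is just another entry in the list, so `isBlocked4 [(ipv4 203 0 113 0, 24)]` also blocks 203.0.113.7.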

isBlockedIP :: [IPRange] -> IP -> Bool Source #

Whether an IP falls in a blocked internal range: the fixed blockedRanges set together with the caller-supplied additionalRanges.

The single source of record for the internal-range decision, used by the literal block (isBlockedTarget) on the dist.tarball host gate. An IPv4-mapped IPv6 address (::ffff:a.b.c.d) is first decoded to its embedded IPv4 and tested against the IPv4 ranges: a mapped internal literal (e.g. ::ffff:169.254.169.254) is a recognised SSRF smuggling form, so it must be caught by the IPv4 block rather than slip through as an unrelated IPv6 address.
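The IPv4-mapped decoding step can be sketched as below, assuming an IPv6 address is held as its eight 16-bit groups. Groups6, embeddedIPv4, Family and classify are hypothetical names for illustration; the module's own representation is not shown in this documentation.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Word (Word16, Word32)

-- | Hypothetical representation: the eight 16-bit groups of an IPv6 address.
type Groups6 = [Word16]

-- | For an IPv4-mapped address (::ffff:a.b.c.d), recover the embedded
-- IPv4 address from the last two groups; Nothing for any other address.
embeddedIPv4 :: Groups6 -> Maybe Word32
embeddedIPv4 [0, 0, 0, 0, 0, 0xffff, hi, lo] =
  Just ((fromIntegral hi `shiftL` 16) .|. fromIntegral lo)
embeddedIPv4 _ = Nothing

-- | Which range table an address should be tested against.
data Family = TestAsV4 Word32 | TestAsV6 Groups6
  deriving (Eq, Show)

-- | Mapped addresses are routed to the IPv4 ranges, so
-- ::ffff:169.254.169.254 is tested as 169.254.169.254.
classify :: Groups6 -> Family
classify groups = maybe (TestAsV6 groups) TestAsV4 (embeddedIPv4 groups)
```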

parseIpLiteral :: Text -> Maybe IpAddr Source #

Parse a host as an IP literal, or Nothing for a DNS name. Handles dotted-quad IPv4 and the IPv6 forms a host realistically carries -- full eight-group form, ::-compressed forms (including ::1), and a trailing embedded IPv4 (the a.b.c.d in ::ffff:a.b.c.d) -- which is enough to recognise the loopback, link-local, and IPv4-mapped addresses isBlockedIP blocks. It is deliberately not a complete IPv6 parser (no zone ids); an unrecognised literal is treated as a name, which the host allowlist still constrains.

Only range membership is delegated to iproute (isBlockedIP); recognising the literal stays hand-rolled on purpose. This recogniser is deliberately lenient on the IPv4 dotted-quad: it accepts the ambiguous octet spellings a strict IP library rejects and coerces each octet exactly as inet_aton -- and hence a libc resolver -- does, so the block tests the address that would actually be dialled. A 0x/0X-prefixed octet is hexadecimal, a leading-zero octet is octal, and anything else is decimal. A leading-zero octet is therefore not read as its decimal digits: 0012.0.0.1 is 10.0.0.1 (octal 12 is decimal 10; RFC1918, blocked), whereas 010.0.0.1 is 8.0.0.1 and 0127.0.0.1 is 87.0.0.1 (both public, not blocked) -- matching the resolver rather than a decimal misreading. A stricter parser that rejected these spellings would let an octal/hex spelling of an internal address skip the block and reach the resolving fetch as a name, silently narrowing the SSRF gate.

Two boundaries are deliberately not modelled here; such a host is simply treated as a name, which the host allowlist constrains. First, the short inet_aton forms with fewer than four parts (a bare 32-bit number 2130706433 / 0x7f000001, or a 127.1) are not literals here. Second, a malformed component is not a literal: an invalid-octal octet 08 (8 is not an octal digit) or an overflowing octet 0400/256/0x100 is rejected exactly as a resolver rejects it, and an IPv6 group that overflows 16 bits (fe80::1ffff) is likewise not a literal here. Delegating literal parsing to a library would change this lenient/strict boundary, so it is kept here.
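The octet rules above can be sketched as follows. This is an illustration on String rather than Text, with hypothetical names (parseOctet, parseQuad, splitOnDot), and it assumes that a bare 0x with no digits is rejected, a case the text does not address.

```haskell
import Data.Char (digitToInt, isDigit, isHexDigit, isOctDigit)

-- | Coerce one octet by its prefix: 0x/0X is hexadecimal, a leading
-- zero is octal, anything else is decimal. Nothing for an empty or
-- invalid digit string or a value above 255.
parseOctet :: String -> Maybe Int
parseOctet s = case s of
  ('0' : x : rest) | x `elem` "xX" -> digits 16 isHexDigit rest
  ('0' : rest@(_ : _)) -> digits 8 isOctDigit rest
  _ -> digits 10 isDigit s
  where
    digits :: Integer -> (Char -> Bool) -> String -> Maybe Int
    digits base ok ds
      | null ds || not (all ok ds) = Nothing
      | value > 255 = Nothing
      | otherwise = Just (fromInteger value)
      where
        -- Integer accumulator, so a very long digit run cannot wrap.
        value = foldl (\acc d -> acc * base + toInteger (digitToInt d)) 0 ds

-- | Exactly four dot-separated parts, each a valid octet; the shorter
-- inet_aton forms (127.1, 2130706433) are not literals here.
parseQuad :: String -> Maybe [Int]
parseQuad host = case splitOnDot host of
  parts@[_, _, _, _] -> traverse parseOctet parts
  _ -> Nothing

-- | Split on every '.', keeping empty parts so "1.2.3." is rejected.
splitOnDot :: String -> [String]
splitOnDot s = case break (== '.') s of
  (part, []) -> [part]
  (part, _ : rest) -> part : splitOnDot rest
```

Under these rules `parseQuad "0012.0.0.1"` yields the octets of 10.0.0.1, while `parseQuad "08.0.0.1"` and `parseQuad "127.1"` yield Nothing and would be handled as names.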

parseBlockedRange :: Text -> Maybe IPRange Source #

Parse one operator-configured ECLUSE_ADDITIONAL_BLOCKED_RANGES entry (a single CIDR, e.g. "203.0.113.0/24" or "2001:db8::/32") into an IPRange, or Nothing for anything malformed.

A total wrapper over iproute's own Read instance for IPRange: that instance's underlying parser (parseIPRange) already fails by returning no parse rather than calling error, so readMaybe over it is safe -- unlike the partial IsString instance (which blockedRanges relies on for its own compile-time literals, where a malformed literal would be a build-time error, never runtime input). This is the only way the config decoder is meant to turn operator text into an IPRange: a malformed entry must fail closed at boot, never be silently dropped or accepted as an unblocked range.
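The fail-closed requirement can be sketched independently of iproute. The helper below is hypothetical (parseAllOrFail is not part of this module) and takes the per-entry parser as an argument. With iproute available, that argument would be readMaybe at type IPRange, as the paragraph above describes; standIn is an Int-typed placeholder so the sketch needs only base. How the configuration value is split into entries is not stated here, so the sketch starts from an already-split list.

```haskell
import Text.Read (readMaybe)

-- | Parse every entry, or stop at the first malformed one. Using
-- traverse (rather than mapMaybe) means a bad entry fails the whole
-- configuration instead of being silently dropped.
parseAllOrFail :: (String -> Maybe a) -> [String] -> Either String [a]
parseAllOrFail parseOne = traverse step
  where
    step entry =
      maybe (Left ("malformed range: " ++ show entry)) Right (parseOne entry)

-- | Placeholder per-entry parser (an assumption for this sketch only).
standIn :: String -> Maybe Int
standIn = readMaybe
```

A caller at boot would turn the Left into a startup error, so a typo in one range can never quietly widen what the proxy may reach.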

hostAddress :: Text -> Text Source #

Extract the bare host from a URI or host[:port] authority.

A convenience for the SSRF gate: an outbound target is usually a full URL or an authority, but isAllowedUpstreamHost and isBlockedTarget compare the bare host. This strips a scheme:// prefix, any userinfo@, any :port suffix, and any /path/?query/#fragment tail, lower-casing the result. It is a pragmatic extractor for comparison, not a full RFC 3986 parser; a value with no recognisable host yields the empty string, which both guards treat as not-allowed. IPv6 literals in brackets ([::1]:443) are returned without the brackets -- the bracket-aware host[:port] split is splitHostPort, shared with the SQS endpoint parser so the two cannot drift on an authority edge case; a malformed authority (an opening bracket with no close) yields the empty string, the same fail-safe the guards apply to it.

splitHostPort :: Text -> Maybe (Text, Text) Source #

Split a host[:port] authority into its bare host and the raw ":port" remainder (empty when no port is present), bracket-aware so an IPv6 literal's inner colons are never mistaken for the port separator.

The single canonical authority split feeding both the data-plane host extractor (hostAddress) and the SQS endpoint parser (parseEndpointUrl), so the two separate implementations that previously disagreed on [::1]:port edge cases cannot drift again. A […] IPv6 literal is split on its closing bracket -- the host is returned without the brackets and the remainder is whatever follows (a ":port" or empty) -- so an inner :: is never read as the port separator; a bare authority is split on its first :. An opening bracket with no close is a malformed authority and yields Nothing, which hostAddress folds to the empty (not-allowed) host and the endpoint parser surfaces as a malformed-URL boot error.
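The split, and an extractor layered on it, can be sketched as below. This is a simplified illustration on String rather than Text; the names carry a Sketch suffix to mark them as stand-ins, and the scheme and userinfo handling is an assumption about reasonable behaviour, not a transcription of hostAddress.

```haskell
import Data.Char (isAlphaNum, toLower)

-- | Bracket-aware split of host[:port] into (host, raw ":port" remainder).
splitHostPortSketch :: String -> Maybe (String, String)
splitHostPortSketch ('[' : rest) =
  case break (== ']') rest of
    (host, ']' : remainder) -> Just (host, remainder)
    _ -> Nothing -- opening bracket with no close: malformed
splitHostPortSketch authority = Just (break (== ':') authority)

-- | Strip scheme, path/query/fragment tail, userinfo, and port, then
-- lower-case. A malformed authority folds to the empty string.
hostAddressSketch :: String -> String
hostAddressSketch input =
  maybe "" (map toLower . fst) (splitHostPortSketch authority)
  where
    noTail = takeWhile (`notElem` "/?#") (dropScheme input)
    authority = dropUserinfo noTail

-- | Drop a leading "scheme://" if one is present.
dropScheme :: String -> String
dropScheme s = case span isSchemeChar s of
  (_ : _, ':' : '/' : '/' : rest) -> rest
  _ -> s
  where
    isSchemeChar c = isAlphaNum c || c `elem` "+-."

-- | Keep only what follows the last '@'.
dropUserinfo :: String -> String
dropUserinfo a = case break (== '@') a of
  (_, '@' : rest) -> dropUserinfo rest
  _ -> a
```

Routing both callers through one split function is what keeps an input such as [::1]:443 from being cut at an inner colon in one place and at the bracket in another.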

Tarball-host policy

data TarballHostPolicy Source #

Whether a tarball may be fetched from a host that differs from the upstream that served the packument.

An upstream's dist.tarball is server-chosen data (see docs/architecture/security.md → "Why dist.tarball is honoured"), so a compromised or hostile upstream can name any host as the artifact location. This policy bounds the part of that risk the host allowlist leaves open: which allowlisted host the bytes are fetched from. Even an allowlisted-but-different host is a wider fetch surface than the packument's own source, and the safe reading of the allowlist is "same source unless told otherwise".

Constructors

SameHostAsPackument

The secure default: a tarball is fetched only from the same host that served the packument; a dist.tarball on any other host is refused, even one otherwise on the allowlist.

AnyAllowlistedHost

The opt-in: a tarball may be fetched from any allowlisted host (for a registry that legitimately serves artifacts from a separate CDN/files host). This widens the fetch surface to the whole allowlist; it never escapes it or the internal-range block.

data Origin Source #

The trust of the origin a dist.tarball is being served from: the operator-configured private upstream is TrustedOrigin, and the public upstream, together with every artifact location an attacker could influence, is UntrustedOrigin.

The distinction governs the literal internal-range block alone (the cheap pure defence-in-depth on the host gate). The trusted private origin is deliberately exempt from it: a private registry may legitimately live on an internal address, and only an untrusted target can be steered there. It never relaxes the host allowlist or the same-host clause, which gate both origins identically, so a trusted origin's dist.tarball is still constrained to its own allowlisted host.

Constructors

TrustedOrigin

The operator-configured private upstream: exempt from the literal internal-range block.

UntrustedOrigin

The public upstream, and any attacker-influenceable target: subject to the literal internal-range block.

Instances

Show Origin Source # 
Defined in Ecluse.Core.Security.Host

Eq Origin Source # 
Defined in Ecluse.Core.Security.Host

Methods

(==) :: Origin -> Origin -> Bool #

(/=) :: Origin -> Origin -> Bool #

tarballHostAllowed Source #

Arguments

:: Origin 
-> TarballHostPolicy 
-> LoweredHostSet

The host allowlist (the same one every outbound fetch is gated by).

-> [IPRange]

The operator-configured ranges extending the fixed internal-range block (consulted only for an UntrustedOrigin).

-> Text

The bare host that served the packument.

-> Text

The bare host of the candidate dist.tarball.

-> Bool 

Whether a dist.tarball host may be fetched, given the origin's trust, the policy, the host that served the packument, and the configured guards.

This is the policy half of the dist.tarball defence; it never replaces the host allowlist or the literal internal-range block but composes on top of them, so the answer is the conjunction of three independent checks and over-blocking is the fail-safe:

  • the tarballHost must be on the host allowlist (allowed), as every outbound target is: a dist.tarball host off the allowlist is refused regardless of policy;
  • it must not be an internal-address literal (the fixed range set plus the operator-configured additionalBlockedRanges), the cheap pure defence-in-depth, but a TrustedOrigin is exempt from this clause (see Origin); and
  • under SameHostAsPackument (the secure default) it must additionally equal the packumentHost (the host that served the metadata), so a tarball on a different host is refused even when that host is allowlisted. Under AnyAllowlistedHost that last clause is relaxed, leaving only the allowlist and (origin-aware) internal-range checks.

The allowlist and same-host clauses gate both origins identically; only the internal-range clause is origin-aware, so a TrustedOrigin is never let past its own allowlisted host or onto a different host than its metadata under the default.

Hosts are compared by their canonical key (case-folded, and for an IP-literal the single canonical literal; see canonicalHostKey), as the host guards are. An empty tarballHost is never allowed (the allowlist already refuses it). The packumentHost is the bare host the metadata was fetched from (extract it with hostAddress); only its equality to tarballHost matters, so it need not itself be re-validated here: it was already gated when the packument was fetched.
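The three-clause conjunction above can be sketched as follows. To stay self-contained, the allowlist and internal-range checks are passed in as plain predicates, and hosts are Strings assumed to be in canonical key form already. The real function instead takes a LoweredHostSet and a list of IPRange, so the signature here is a hypothetical simplification.

```haskell
data Origin = TrustedOrigin | UntrustedOrigin
  deriving (Eq, Show)

data TarballHostPolicy = SameHostAsPackument | AnyAllowlistedHost
  deriving (Eq, Show)

-- | Conjunction of the allowlist, the origin-aware internal-range
-- clause, and the policy-dependent same-host clause.
tarballHostAllowedSketch
  :: (String -> Bool) -- ^ stand-in for the allowlist check
  -> (String -> Bool) -- ^ stand-in for the internal-literal check
  -> Origin
  -> TarballHostPolicy
  -> String           -- ^ host that served the packument
  -> String           -- ^ candidate dist.tarball host
  -> Bool
tarballHostAllowedSketch onAllowlist isInternal origin policy packumentHost tarballHost =
  onAllowlist tarballHost && rangeOk && sameHostOk
  where
    -- Only this clause looks at the origin's trust.
    rangeOk = origin == TrustedOrigin || not (isInternal tarballHost)
    -- Relaxed only under the opt-in policy.
    sameHostOk = case policy of
      SameHostAsPackument -> tarballHost == packumentHost
      AnyAllowlistedHost -> True
```

Each clause can only remove hosts, never add them, which is why over-blocking is the failure mode when the clauses disagree.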

data TarballHostGate Source #

The mount-constant inputs to the per-request tarballHostAllowed gate, extracted once from a mount's three configured upstream URLs so the serve path parses no URL and builds no host set per request.

The serve-path tarball gate is on the hot artifact path (every private hit and every public leg runs it), yet its allowlist and the private/public upstream hosts never change after boot -- they are fixed by the mount's configuration. Recovering them from the base URLs on each request rebuilt a LoweredHostSet and re-ran hostAddress several times per artifact; precomputing them here into a TarballHostGate collapses that to a few field reads. The only genuinely per-request host is the dynamic public dist.tarball, still parsed at the call site.

Constructors

TarballHostGate 

Fields

  • thgAllowlist :: LoweredHostSet

    The lowered allowlist of the mount's configured upstream hosts (public, private, and mirror target) -- the same set every outbound fetch is gated against (security.md invariant 2).

  • thgPrivateHost :: Text

    The bare host of the private upstream, extracted once.

  • thgPublicHost :: Text

    The bare host of the public upstream, extracted once.

tarballHostGate :: Text -> Text -> Text -> TarballHostGate Source #

Build the TarballHostGate from a mount's private, public, and mirror-target upstream URLs: the allowlist is the lowered set of their bare hosts, and the private and public hosts are each extracted once with hostAddress. Called once per mount at the composition root (and by test fixtures); the result is carried on the serve dependencies so the per-request gate reads fields rather than re-parsing URLs.
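The precompute-once shape can be sketched as below. GateSketch and mkGateSketch are hypothetical stand-ins: the host extractor is passed in as an argument, and a plain Set String replaces LoweredHostSet so that the sketch depends only on containers.

```haskell
import qualified Data.Set as Set

-- | Stand-in for the precomputed gate: everything that is fixed at boot.
data GateSketch = GateSketch
  { gsAllowlist :: Set.Set String
  , gsPrivateHost :: String
  , gsPublicHost :: String
  }
  deriving (Eq, Show)

-- | Build the gate once from the three configured URLs. The extractor
-- argument plays the role of hostAddress.
mkGateSketch
  :: (String -> String) -- ^ bare-host extractor (assumed)
  -> String             -- ^ private upstream URL
  -> String             -- ^ public upstream URL
  -> String             -- ^ mirror-target URL
  -> GateSketch
mkGateSketch bareHost privateUrl publicUrl mirrorUrl =
  GateSketch
    { gsAllowlist = Set.fromList [privateHost, publicHost, bareHost mirrorUrl]
    , gsPrivateHost = privateHost
    , gsPublicHost = publicHost
    }
  where
    privateHost = bareHost privateUrl
    publicHost = bareHost publicUrl
```

After construction, the per-request path reads the three fields instead of parsing any of the configured URLs again.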

Internal for testing