ecluse:ecluse-core
Safe Haskell: None
Language: GHC2021

Ecluse.Core.Registry.Npm.Wire

Description

The npm registry wire JSON types and their lenient decoders.

This module is the npm protocol boundary: it models the JSON the registry actually sends and parses it with deliberately forgiving FromJSON instances. It is the raw-wire layer of "parse, don't validate" -- it captures /what the registry said/ as faithfully as the rules engine and the serving path need, and __nothing more__. Projecting these wire types into the ecosystem-agnostic domain model (Ecluse.Core.Package: PackageDetails et al.) is a separate concern; keeping the two apart is what keeps the lenient/faithful split clean.

The shapes here are reverse-engineered from live captures of registry.npmjs.org; the authoritative reference (with real bodies) is docs/research/reverse-engineering/npm.md (§4 full packument, §5 abbreviated, §7 dist, §11 type model, §3 errors).

Lenient on input

The public registry has drifted from its own spec and is inconsistent across endpoints, so every decoder here is forgiving in five specific ways, matching the documented reality:

  • Unknown keys are ignored. Manifests carry arbitrary author keys (gitHead, exports, tool-config blocks like is-odd's verb) and registry bookkeeping (_npmOperationalInternal); a decoder must not choke on them. aeson's record decoders already ignore extra keys, so this falls out of using (.:?)/(.:) rather than enumerating the whole object.
  • String-or-object scalars. license, bugs, repository, and the author/maintainer person fields each arrive as either a bare string or an object, depending on the package's age and tooling. Each corresponding type (License, Bugs, Repository, Person) therefore parses both shapes.
  • The bare-string error body. npm's per-version 404 is a bare JSON string ("version not found: ^3.0.0"), not the documented {error|message} object. ErrorResponse tolerates both.
  • The string-or-boolean deprecated flag. deprecated is conventionally the deprecation message string, but some published versions carry a boolean instead (true = deprecated without a message, false = not deprecated). vmDeprecated reads every form, so a boolean never fails the whole packument decode (a real packument such as react's mixes the string and boolean forms across versions).
  • Advisory dist sub-fields degrade rather than deny. fileCount, unpackedSize, and signatures are advisory -- they decide no rule and no serve -- so a hostile value in one (a fractional/huge/Int-overflowing number, a wrong-typed field, or a malformed/non-array signatures) reads as absent/empty rather than failing the version. One poisoned value therefore cannot deny the whole packument (Dist).
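A minimal sketch of the first point, assuming aeson with OverloadedStrings. MiniManifest, its fields, and sample are hypothetical names for illustration, not types from this module:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), decode, withObject, (.:), (.:?))
import Data.Text (Text)

-- Hypothetical record for illustration; not one of this module's types.
data MiniManifest = MiniManifest
  { mmName :: Text
  , mmGitHead :: Maybe Text
  }
  deriving (Show, Eq)

-- Only the named keys are looked up. Any other key in the object is never
-- inspected, so it cannot fail the decode.
instance FromJSON MiniManifest where
  parseJSON = withObject "MiniManifest" $ \o ->
    MiniManifest <$> o .: "name" <*> o .:? "gitHead"

-- An input carrying an author tool-config block and registry bookkeeping.
sample :: Maybe MiniManifest
sample =
  decode "{\"name\":\"is-odd\",\"verb\":{\"toc\":false},\"_npmOperationalInternal\":{}}"
```

Here sample decodes successfully even though verb and _npmOperationalInternal are never named by the decoder.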

Faithful on the rule-decisive fields

The fields the rules engine and the serving path actually need are captured precisely: the abbreviated-only vmHasInstallScript flag, the vmDeprecated notice, the whole vmScripts map (so the full form's install-script presence can be derived -- the full manifest has no hasInstallScript key), the Dist integrity triple (tarball/shasum/integrity), and the full packument's pkmtTime map (the source of truth for publish age, which the abbreviated form drops).

Only the decode path (FromJSON) is modelled here.

Shared scalars

data Person

A person associated with a package -- an author, maintainer, contributor, or the per-version publisher (_npmUser).

Lenient: npm sends a person as either an object {name, email?, url?} or a single packed string of the conventional form "Name <email> (url)". The packed form is captured verbatim in personName (with personEmail/personUrl left Nothing); this wire layer does not attempt to split it, leaving that to the domain projection if it is ever needed. Distinct from Ecluse.Core.Package's domain Person -- this is the raw wire shape.

Constructors

Person 

Fields

  • personName :: Text

    The person's name. For the packed-string form, the entire string as sent (e.g. "Mikeal Rogers <mikeal@example.com>").

  • personEmail :: Maybe Text

    Their email address, if given as an object field.

  • personUrl :: Maybe Text

    A homepage / profile URL, if given as an object field.
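The two shapes could be decoded along these lines. This is a sketch under assumptions (aeson, OverloadedStrings), not the module's actual instance; only the type and field names come from the documentation above:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), Value (..), decode, (.:), (.:?))
import Data.Aeson.Types (typeMismatch)
import Data.Text (Text)

-- Field names mirror the documentation; the instance body is an assumption.
data Person = Person
  { personName :: Text
  , personEmail :: Maybe Text
  , personUrl :: Maybe Text
  }
  deriving (Show, Eq, Ord)

instance FromJSON Person where
  -- Packed string: keep it verbatim in personName, do not split it.
  parseJSON (String s) = pure (Person s Nothing Nothing)
  -- Object form: name is required, email and url are optional.
  parseJSON (Object o) = Person <$> o .: "name" <*> o .:? "email" <*> o .:? "url"
  -- Any other JSON shape is a genuine type error.
  parseJSON other = typeMismatch "Person (string or object)" other

-- The packed form from the field documentation above.
packed :: Maybe Person
packed = decode "\"Mikeal Rogers <mikeal@example.com>\""
```

With this sketch, packed holds the whole string in personName and leaves both optional fields Nothing.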

Instances

  • FromJSON Person
  • Show Person
  • Eq Person
  • Ord Person

All defined in Ecluse.Core.Registry.Npm.Wire.

data Repository

An SCM location for a package.

Lenient: npm sends repository as either an object {type?, url} or a bare string (a shorthand URL such as "github:user/repo"). Both are captured; the bare-string form fills repoUrl and leaves repoType Nothing.

Constructors

Repository 

Fields

data Bugs

The issue tracker for a package.

Lenient: npm sends bugs as either an object {url?, email?} or a bare string (just the tracker URL). The bare-string form fills bugsUrl.

Constructors

Bugs 

Fields

Instances

  • FromJSON Bugs
  • Show Bugs
  • Eq Bugs
  • Ord Bugs

All defined in Ecluse.Core.Registry.Npm.Wire.

data License

A declared license.

Lenient: modern packages send a bare SPDX string ("MIT"); legacy packages send an object {type, url?}. Both are preserved as a sum so the distinction is not lost: LicenseSpdx for the string, LicenseObject for the legacy object.

Constructors

LicenseSpdx Text

An SPDX expression or identifier, sent as a bare string ("MIT", "Apache-2.0", "(MIT OR Apache-2.0)"). The modern form.

LicenseObject Text (Maybe Text)

The legacy object form {type, url?}: a license name plus an optional URL to the license text.

The dist object

data Dist

The dist object: the artifact descriptor carried by every version manifest (full and abbreviated). It is the gateway to the tarball bytes and the integrity guarantee.

The integrity triple (distTarball, distShasum, distIntegrity) is rule-decisive and serving-decisive -- a client fails the install if the downloaded bytes do not match integrity/shasum, so any mirror or URL rewrite must preserve these byte-for-byte. Prefer distIntegrity (SRI) over the legacy SHA-1 distShasum.

The remaining sub-fields (distFileCount, distUnpackedSize, distSignatures) are advisory -- they inform reporting but decide no rule and no serve -- and so are decoded leniently: a present-but-undecodable number (fractional, huge, or Int-overflowing) reads as absent (Nothing), a malformed signature element is skipped rather than failing the array, and a signatures value that is not even an array reads as empty. A hostile value in one version therefore degrades that field alone, never denying the whole packument.
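One way the advisory leniency described above could look, as a sketch rather than the module's actual code. lenientInt, lenientSignatures, and sampleSignatures are hypothetical helpers; the "sig" and "keyid" key names are the ones the registry uses in its signature objects; toBoundedInteger comes from the scientific package, which aeson already depends on:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), Value (..), decode, withObject, (.:))
import Data.Aeson.Types (parseMaybe)
import Data.Foldable (toList)
import Data.Maybe (mapMaybe)
import Data.Scientific (toBoundedInteger)
import Data.Text (Text)

data Signature = Signature {sigSig :: Text, sigKeyid :: Text}
  deriving (Show, Eq, Ord)

instance FromJSON Signature where
  parseJSON = withObject "Signature" $ \o ->
    Signature <$> o .: "sig" <*> o .: "keyid"

-- Hypothetical helper: a whole number that fits in Int reads as Just;
-- fractional, out-of-range, wrong-typed, or absent values read as Nothing.
lenientInt :: Maybe Value -> Maybe Int
lenientInt (Just (Number n)) = toBoundedInteger n
lenientInt _ = Nothing

-- Hypothetical helper: keep the elements that decode, skip the ones that
-- do not; anything that is not an array reads as empty.
lenientSignatures :: Maybe Value -> [Signature]
lenientSignatures (Just (Array xs)) = mapMaybe (parseMaybe parseJSON) (toList xs)
lenientSignatures _ = []

-- One good element, one wrong-typed element, one element missing keyid.
sampleSignatures :: [Signature]
sampleSignatures =
  lenientSignatures
    (decode "[{\"sig\":\"abc\",\"keyid\":\"k1\"},42,{\"sig\":\"no-keyid\"}]")
```

In this sketch the raw values would come from optional lookups such as o .:? "fileCount", so a poisoned sub-field never reaches a failing parser.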

Constructors

Dist 

Fields

Instances

  • FromJSON Dist
  • Show Dist
  • Eq Dist
  • Ord Dist

All defined in Ecluse.Core.Registry.Npm.Wire.

data Signature

One registry signature over a published artifact: an ECDSA signature and the id of the key that produced it. Verifiable against npm's published public keys (GET /-/npm/v1/keys) -- the basis of npm audit signatures.

Constructors

Signature 

Fields

  • sigSig :: Text

    The base64-encoded signature value.

  • sigKeyid :: Text

    The id of the signing key (e.g. "SHA256:jl3bwswu80…").

Per-version manifest

data VersionManifest

A single version's manifest -- the per-version object that is essentially the package's package.json at publish time plus registry-injected fields. It appears three ways on the wire and this one type decodes all of them: embedded in a full Packument (versions[v]), embedded in an AbbreviatedPackument (a trimmed subset of the same shape), and standalone (GET /{pkg}/{version}).

Only the fields Écluse's rules and serving need are modelled; everything else is ignored (see the module header). The two rule-decisive optionals deserve note:

  • vmHasInstallScript is abbreviated-only -- the registry sets it when the version declares preinstall/install/postinstall scripts. It is the cleanest install-script signal, but it is absent from the full manifest.
  • vmScripts is therefore captured whole so that, when only the full form is available, install-script presence can be derived (scripts has any of preinstall/install/postinstall). That derivation is a domain-projection concern, not this layer's.
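The derivation mentioned in the second point belongs to the domain projection, but it can be sketched briefly. hasInstallScript here is a hypothetical helper, and letting the explicit flag win when it is present is an assumption of this sketch:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)
import Data.Text (Text)

-- Hypothetical helper: use the abbreviated-only flag when the registry sent
-- it; otherwise derive the answer from the scripts map.
hasInstallScript :: Maybe Bool -> Map Text Text -> Bool
hasInstallScript flag scripts = fromMaybe derived flag
  where
    -- True when any of the three install lifecycle scripts is declared.
    derived = any (`Map.member` scripts) ["preinstall", "install", "postinstall"]
```

A call would pass the two manifest fields directly, for example hasInstallScript (vmHasInstallScript m) (vmScripts m) for some manifest m.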

The publish timestamp is not here -- it lives in the packument's pkmtTime map, not the manifest (see §8 of the protocol reference).

Constructors

VersionManifest 

Fields

  • vmName :: Text

    The package name, possibly scoped ("@scope/name"), verbatim.

  • vmVersion :: Text

    The exact version string (e.g. "1.2.3"), kept opaque at this layer.

  • vmDist :: Dist

    The artifact descriptor (always present).

  • vmDeprecated :: Maybe Text

    The deprecation message when the version is deprecated, else Nothing. npm sends deprecated as the message string, or as a boolean (true = deprecated with no message, captured as ""; false = not deprecated); an absent, null, false, or otherwise-shaped value reads as Nothing.

  • vmHasInstallScript :: Maybe Bool

    Whether the version declares install scripts. Present in the abbreviated form only; Nothing in the full form (derive from vmScripts there).

  • vmScripts :: Map Text Text

    The scripts map (lifecycle name to command), empty when absent. The source for deriving install-script presence from the full form.

  • vmLicense :: Maybe License

    The declared license, if any (string or legacy object; see License).

The manifest's dependency maps and maintainer list are __deliberately not parsed__: no rule or serve path consults them, the raw document relays them to the client untouched, and a heavy packument carries thousands of per-version entries of pure parse cost (architect ruling, 2026-07-02 -- including that a malformed entry there may degrade rather than deny). Restore them from history if a dependency-reading rule ever lands.
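The mapping that the vmDeprecated field describes can be written as a small total function over the raw JSON value. readDeprecated is a hypothetical name, and this is a sketch of the documented behaviour rather than the module's code:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (Value (..))
import Data.Text (Text)

-- Hypothetical helper over the raw "deprecated" value, where Nothing means
-- the key was absent.
readDeprecated :: Maybe Value -> Maybe Text
readDeprecated (Just (String msg)) = Just msg -- the message string
readDeprecated (Just (Bool True)) = Just "" -- deprecated, no message
readDeprecated _ = Nothing -- absent, null, false, or any other shape
```

Because every shape has a result, a boolean in one version cannot fail the decode of the packument that contains it.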

Packuments

data Packument

The full packument: GET /{pkg} with Accept: application/json (or no Accept). One document describing the package and every published version.

The field that earns the full form its place in the pipeline is pkmtTime: the map of publish timestamps (created, modified, and one per version), the source of truth for publish age that age-based rules need. The abbreviated form (§5) drops it, keeping only a top-level modified. Package-level description/license/author are hoisted from the latest version for convenience; the authoritative copy is the per-version one in pkmtVersions.

_attachments is intentionally not modelled -- it is populated only on the publish document, not on reads.
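To illustrate why the time map matters to age-based rules, here is a sketch of a publish-age lookup. publishAge and exampleAge are hypothetical, and the sketch assumes the time map has the shape Map Text UTCTime, which this documentation does not confirm:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Text (Text)
import Data.Time (NominalDiffTime, UTCTime (..), diffUTCTime, fromGregorian)

-- Hypothetical helper: the age of a version at a given instant, or Nothing
-- when the time map has no entry for that version.
publishAge :: UTCTime -> Map Text UTCTime -> Text -> Maybe NominalDiffTime
publishAge now times version = diffUTCTime now <$> Map.lookup version times

-- A version published exactly one day before "now".
exampleAge :: Maybe NominalDiffTime
exampleAge = publishAge now times "1.0.0"
  where
    now = UTCTime (fromGregorian 2024 1 2) 0
    times = Map.fromList [("1.0.0", UTCTime (fromGregorian 2024 1 1) 0)]
```

The abbreviated form cannot support this lookup, since it keeps only the single top-level modified timestamp.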

Constructors

Packument 

Fields

data AbbreviatedPackument

The abbreviated packument: GET /{pkg} with Accept: application/vnd.npm.install-v1+json. The install-optimised view and the one the proxy treats as primary.

It carries exactly four top-level fields. Notably the full time map is dropped (only a top-level apkmtModified remains), so publish-age rules need the full Packument. Its apkmtVersions manifests are the trimmed subset of VersionManifest -- the same type, with the install-only fields populated (including the abbreviated-only vmHasInstallScript).

Constructors

AbbreviatedPackument 

Fields

  • apkmtName :: Text

    The package name.

  • apkmtModified :: UTCTime

    Equivalent to the full form's time.modified; the only timestamp the abbreviated form carries.

  • apkmtDistTags :: Map Text Text

    The dist-tags map (tag to version), as in the full form.

  • apkmtVersions :: Map Text VersionManifest

    Every published version that decodes (abbreviated subset of fields), keyed by exact version string. As in the full form, a version whose manifest is malformed in a required field is dropped (see lenientVersionMap).
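The drop-on-failure behaviour of the versions map could be achieved as follows. This is an assumed approach, not the actual lenientVersionMap, whose implementation is not shown here; dropMalformed and example are hypothetical names, and Int stands in for VersionManifest to keep the sketch short:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), Value, decode)
import Data.Aeson.Types (parseMaybe)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Text (Text)

-- Assumed behaviour: decode each entry on its own and keep only the entries
-- that succeed, so one bad version cannot fail the whole map.
dropMalformed :: FromJSON a => Map Text Value -> Map Text a
dropMalformed = Map.mapMaybe (parseMaybe parseJSON)

-- Int stands in for VersionManifest; the second entry fails to decode.
example :: Map Text Int
example =
  maybe Map.empty dropMalformed (decode "{\"1.0.0\": 1, \"1.0.1\": \"oops\"}")
```

In this sketch example keeps only the "1.0.0" entry.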

Errors

data ErrorResponse

An npm error body.

Lenient: the documented shape is an object { message?, error?, ok?: false } and clients "should check for message, then error". But the registry is inconsistent -- its per-version 404 is a bare JSON string ("version not found: ^3.0.0"), not an object. This type tolerates both: the object form keeps its fields in an ErrorBody, and a bare string is captured whole as ErrorString. Read the human-facing reason via errorMessage, which applies npm's "message, then error" precedence across both shapes.

Constructors

ErrorObject ErrorBody

The documented object form { message?, error? }.

ErrorString Text

A bare JSON string body (npm's per-version 404), captured whole.

data ErrorBody

The fields of npm's object-form error body. A product type (not inline constructor fields on ErrorResponse) so its selectors are total -- there is no ErrorString case for them to be partial over.

Constructors

ErrorBody 

Fields

Instances

Instances details
Show ErrorBody Source # 
Instance details

Defined in Ecluse.Core.Registry.Npm.Wire

Eq ErrorBody Source # 
Instance details

Defined in Ecluse.Core.Registry.Npm.Wire

errorMessage :: ErrorResponse -> Maybe Text

The human-facing reason carried by an error body, applying npm's documented precedence: prefer message, then error, and for the bare-string form the string itself. Nothing only when an object form carried neither field. Total.
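A sketch of how the two error shapes and the precedence rule fit together. The selector names ebMessage and ebError are hypothetical (the real ErrorBody fields are not listed here), and the instance is an assumption, not the module's code:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative ((<|>))
import Data.Aeson (FromJSON (..), Value (..), decode, (.:?))
import Data.Aeson.Types (typeMismatch)
import Data.Text (Text)

-- Hypothetical field names; both keys are optional in the object form.
data ErrorBody = ErrorBody
  { ebMessage :: Maybe Text
  , ebError :: Maybe Text
  }
  deriving (Show, Eq)

data ErrorResponse
  = ErrorObject ErrorBody
  | ErrorString Text
  deriving (Show, Eq)

instance FromJSON ErrorResponse where
  -- Bare string body, captured whole.
  parseJSON (String s) = pure (ErrorString s)
  -- Object body: read both optional keys, ignore everything else.
  parseJSON (Object o) =
    ErrorObject <$> (ErrorBody <$> o .:? "message" <*> o .:? "error")
  parseJSON other = typeMismatch "ErrorResponse (string or object)" other

-- "message, then error" for objects; the string itself for bare strings.
errorMessage :: ErrorResponse -> Maybe Text
errorMessage (ErrorString s) = Just s
errorMessage (ErrorObject b) = ebMessage b <|> ebError b
```

For example, errorMessage =<< decode "\"version not found: ^3.0.0\"" yields the string itself, while an empty object yields Nothing.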