Supply Chain Hardening
Every container image Lithosphere publishes to GHCR carries three verifiable artifacts produced by .github/workflows/publish-images.yaml:
Cosign signature — keyless, identity-bound to the GitHub Actions workflow that built the image. Proves the image was produced by this repository, not a typo-squatter.
SLSA build-provenance attestation — in-toto SLSA Provenance v1.0 statement linking the image digest to the specific workflow run, source commit, and build invocation. Proves how the image was produced.
SBOM (SPDX) — Software Bill of Materials enumerating every package in the image. Required for CVE response and license-policy audits.
Together they satisfy SLSA Build Level 2 (signed provenance from a hosted build platform). Level 3 (non-falsifiable provenance) would require a hardened isolation layer beyond GitHub-hosted runners — out of scope for testnet posture.
Verifying a published image
Pick any tag on any image at ghcr.io/kajlabs/lithosphere-*. Three checks, in order of increasing strength:
1. Cosign signature
cosign verify ghcr.io/kajlabs/lithosphere-api:sha-<short> \
  --certificate-identity-regexp 'https://github.com/KaJLabs/Lithosphere/.+' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com
Success output includes the Fulcio cert subject (workflow ref + actor + SHA) and the Rekor transparency log entry. A mismatch means the image was not signed by a GitHub Actions workflow running in this repository. Because the regexp accepts any workflow path under KaJLabs/Lithosphere, this check alone does not detect a modified workflow; the provenance check in step 2 covers that case.
2. SLSA build provenance
gh attestation verify oci://ghcr.io/kajlabs/lithosphere-api:sha-<short> \
  --owner KaJLabs
This validates the in-toto attestation pushed to the registry alongside the image. The output includes the build parameters, source repo + commit, and runner platform. Stricter than the Cosign check because the attestation includes a structured description of how the build happened, not just "this org signed it."
3. SBOM
SBOMs are uploaded as workflow artifacts (90-day retention). For CVE response, download the SBOM for the image and run jq '.packages[] | select(.name == "<dep>")' against the SPDX JSON file to check whether a known-vulnerable package is in your image.
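The jq lookup above can also be scripted when several packages need checking at once. The following is a minimal sketch, assuming the SBOM is SPDX 2.x JSON with a top-level packages array whose entries carry name and versionInfo fields; the function names and the inline sample document are illustrative and are not taken from the repository.

```python
import json
from pathlib import Path


def load_sbom(path: str) -> dict:
    """Read an SPDX JSON document from disk (the path is supplied by the caller)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def find_packages(sbom: dict, name: str) -> list[dict]:
    """Return name/version pairs for SPDX packages whose name matches exactly."""
    return [
        {"name": pkg.get("name"), "version": pkg.get("versionInfo")}
        for pkg in sbom.get("packages", [])
        if pkg.get("name") == name
    ]


# Tiny inline document standing in for a downloaded SBOM artifact (hypothetical data).
example_sbom = {
    "spdxVersion": "SPDX-2.3",
    "packages": [
        {"name": "openssl", "versionInfo": "3.0.13"},
        {"name": "zlib", "versionInfo": "1.3.1"},
    ],
}

print(find_packages(example_sbom, "openssl"))   # one match, with its version
print(find_packages(example_sbom, "left-pad"))  # empty list: package not present
```

An empty result only means the package name does not appear in that SBOM; comparing the reported version against the advisory's affected range is still a manual step.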
What each verification rules out
Threat | Cosign | SLSA provenance | SBOM
Typo-squatted image (lithosphere-apl instead of -api) | ✅ | ✅ | —
Image rebuilt offline + force-pushed to GHCR with a stolen token | ✅ | ✅ | —
Workflow modified to inject malicious code into the build | — | ✅ | —
Same Dockerfile, different source commit | — | ✅ | —
Same source commit, but pulled-in dep CVEd post-build | — | — | ✅
The three are complementary; a serious compliance review checks all three.
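To make the "complementary" claim concrete, the matrix can be encoded as data and queried for gaps. This is an illustrative sketch only: the threat labels are shortened, and the mapping of the three columns to Cosign, SLSA provenance, and SBOM is assumed from the order in which the checks are introduced.

```python
# True means the check rules the threat out.
# Column assignment (cosign, slsa, sbom) is an assumption based on the check order.
COVERAGE = {
    "typo-squatted image": {"cosign": True, "slsa": True, "sbom": False},
    "offline rebuild pushed with stolen token": {"cosign": True, "slsa": True, "sbom": False},
    "workflow modified to inject code": {"cosign": False, "slsa": True, "sbom": False},
    "same Dockerfile, different source commit": {"cosign": False, "slsa": True, "sbom": False},
    "dependency CVE disclosed post-build": {"cosign": False, "slsa": False, "sbom": True},
}


def uncovered(checks_run: set[str]) -> list[str]:
    """List the threats that none of the checks in `checks_run` rules out."""
    return [
        threat
        for threat, row in COVERAGE.items()
        if not any(row[check] for check in checks_run)
    ]


for checks in ({"cosign"}, {"cosign", "slsa"}, {"cosign", "slsa", "sbom"}):
    print(sorted(checks), "->", uncovered(checks))
```

Only the full set of three checks leaves the list empty, which is why a review that stops at the signature misses the workflow-tampering and post-build CVE rows.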
How the gates layer
In the publish-images.yaml pipeline, Trivy runs before signing, so a CVE-laden image never gets an attestation in the first place. The CRITICAL gate is hard-fail; HIGH findings upload to the GitHub Security tab for triage (see license policy for the parallel dependency-side gate).
Deployment-side verification
The Verify GHCR Image Signatures step in deploy-simple.yaml runs Cosign + gh attestation verify against each published image for the deploy's commit SHA. Today the bastion still builds from source, so the check is advisory (continue-on-error: true) — a failed or skipped row doesn't block the deploy. Results land in the deployment summary's ## Supply Chain Verification table next to the ## Build SHA Verification table.
Outcomes the step reports:
Outcome | Meaning
✅ verified | Cosign keyless signature AND SLSA Build L2 attestation both match https://github.com/KaJLabs/Lithosphere/.+ identity.
⚠️ partial (sig-only) | Image is signed but lacks (or has an invalid) SLSA attestation. Treat as suspicious.
⚠️ partial (att-only) | Attestation present but the Cosign signature didn't verify. Usually means a stale registry mirror or a key-rotation in progress.
❌ failed | Neither artifact verified. Investigate before promoting.
— (skipped) | No image for this SHA. Normal when the commit only touched workflows / docs / chain, because publish-images.yaml runs only on Makalu/{api,indexer,explorer}/** path changes.
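The outcome rows follow mechanically from two exit codes plus an "image exists" flag. The sketch below shows one way that mapping could look; it is not the step's actual implementation. The command arguments are copied from the verification commands earlier on this page, while the function names, the plain-text labels, and the way image existence is determined (passed in by the caller here) are assumptions.

```python
import subprocess

IDENTITY_REGEXP = "https://github.com/KaJLabs/Lithosphere/.+"
OIDC_ISSUER = "https://token.actions.githubusercontent.com"


def cosign_cmd(image: str) -> list[str]:
    """Argument list for the keyless Cosign signature check."""
    return [
        "cosign", "verify", image,
        "--certificate-identity-regexp", IDENTITY_REGEXP,
        "--certificate-oidc-issuer", OIDC_ISSUER,
    ]


def attestation_cmd(image: str) -> list[str]:
    """Argument list for the SLSA provenance check via the GitHub CLI."""
    return ["gh", "attestation", "verify", f"oci://{image}", "--owner", "KaJLabs"]


def classify(image_exists: bool, sig_ok: bool, att_ok: bool) -> str:
    """Map the two verification results onto the outcome labels (emoji omitted)."""
    if not image_exists:
        return "skipped"
    if sig_ok and att_ok:
        return "verified"
    if sig_ok:
        return "partial (sig-only)"
    if att_ok:
        return "partial (att-only)"
    return "failed"


def succeeded(cmd: list[str]) -> bool:
    """Run a command and report whether it exited with status 0.

    Raises FileNotFoundError if the tool (cosign or gh) is not installed.
    """
    return subprocess.run(cmd, capture_output=True).returncode == 0


def verify_image(image: str, image_exists: bool) -> str:
    """Run both checks for one image reference and return its outcome label."""
    if not image_exists:
        return classify(False, False, False)
    return classify(True, succeeded(cosign_cmd(image)), succeeded(attestation_cmd(image)))
```

Calling verify_image("ghcr.io/kajlabs/lithosphere-api:sha-<short>", True) with a real tag would return one of the five labels; in advisory mode the caller records the label in the summary and carries on regardless of the result.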
Promotion path to a blocking gate
The day deployments switch from build-from-source to pull-published-image:
1. Remove continue-on-error: true from Verify GHCR Image Signatures.
2. Change the bastion deploy script to docker compose pull, then docker compose up -d --no-build.
The verification step now blocks any deploy whose images don't carry matching Cosign + SLSA artifacts. Tampered, typo-squatted, or offline-rebuilt images can't reach production.
Until then, advisory mode is the right posture: it surfaces the signal in every deploy summary without inheriting publish-images.yaml's schedule constraints.
Rotation & key management
Cosign keyless avoids holding a long-lived signing key — every signature is bound to a short-lived Fulcio certificate issued during the workflow run. The trust anchor is the GitHub Actions OIDC issuer + the KaJLabs/Lithosphere repo identity. No key to rotate, no key to leak.
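With no key involved, verification comes down to two string comparisons: the OIDC issuer and the certificate identity. The sketch below illustrates that idea using Python's re.search as a stand-in for the matcher; how cosign anchors the pattern internally is not covered here and should be treated as an assumption. The sample identities follow the usual GitHub Actions form (workflow path plus ref) and are hypothetical.

```python
import re

IDENTITY_REGEXP = "https://github.com/KaJLabs/Lithosphere/.+"
EXPECTED_ISSUER = "https://token.actions.githubusercontent.com"


def identity_accepted(subject: str, issuer: str) -> bool:
    """Accept only certificates from the expected issuer whose subject matches the repo pattern."""
    return issuer == EXPECTED_ISSUER and re.search(IDENTITY_REGEXP, subject) is not None


# Hypothetical certificate subjects: the real repo, a lookalike repo, and another org.
samples = [
    "https://github.com/KaJLabs/Lithosphere/.github/workflows/publish-images.yaml@refs/heads/main",
    "https://github.com/KaJLabs/Lithosphere-fork/.github/workflows/publish-images.yaml@refs/heads/main",
    "https://github.com/someone-else/Lithosphere/.github/workflows/publish-images.yaml@refs/heads/main",
]

for subject in samples:
    print(identity_accepted(subject, EXPECTED_ISSUER), subject)
```

The trailing slash before .+ is what rejects the lookalike repository name; a pattern ending in Lithosphere.+ would accept it.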
For npm package publishes (SDK), the equivalent identity binding is npm provenance attestations — wired in release.yaml via --provenance. See release-process.md.
Code-level static analysis
Image-layer attestations and dependency-license enforcement don't catch bugs in the source we own — SQL injection, SSRF, path traversal, weak crypto, prototype pollution. That's what codeql.yaml covers: GitHub's CodeQL runs on every push to main, every PR, and weekly via cron. The JS/TS extractor (build-mode none, query suite security-and-quality) indexes Makalu/{api,indexer,explorer,packages,templates,contracts/scripts,tooling} plus repo-level scripts/. Findings post to the Security tab under the codeql-javascript-typescript category, alongside Trivy's trivy-{api,indexer,explorer} entries.
The three layers compose:
Layer | Catches | Workflow
Source SAST | bugs in code we wrote | codeql.yaml
Container scan | OS/library CVEs | publish-images.yaml (Trivy)
Supply chain | image tampering / typo-squat | publish-images.yaml (Cosign + SLSA + SBOM)
Triage workflow for CodeQL findings
CodeQL's first-scan baseline often contains a long tail of style-level notes plus a handful of legitimate-but-false-positive flow alerts (e.g. router.push(`/blocks/${userInput}`) flagged as DOM-XSS because the extractor can't prove the destination route doesn't innerHTML the segment). The expectation is not "zero open alerts"; it's "every open alert has been triaged":
Fix at source when the alert points at a genuine issue. Recent examples (commit landing this section):
  js/log-injection from console.warn with raw user input → sanitizeForLog() helper that strips ASCII control chars;
  js/file-system-race from an existsSync → appendFileSync pair → drop the pre-check (the append creates on demand);
  js/polynomial-redos on /=+$/ → manual trailing-char strip with no regex.
Dismiss with a comment when the alert is a false positive. Use the GitHub Security UI ("Dismiss alert" → "False positive" / "Used in tests" / "Won't fix") and include the reason. Don't leave open alerts indefinitely without dismissal; they create noise that masks real findings.
Track as work when the alert is real but the fix needs design (e.g. SSRF in a controlled-base proxy endpoint — needs an explicit URL allow-list). Create an issue, link the alert, leave the alert open until the issue closes.
Cadence: triage every Monday alongside the weekly CodeQL cron run.
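The three "fix at source" examples share a pattern: replace the flagged construct with something simpler that has no injection, race, or backtracking surface. The sketch below renders each fix in Python to show the shape of the change. The repository's helpers are JavaScript/TypeScript and are not reproduced here, so every name in this block is illustrative.

```python
import os
import tempfile


def sanitize_for_log(value: str) -> str:
    """Drop ASCII control characters (0x00-0x1F and 0x7F) so input cannot forge log lines."""
    return "".join(ch for ch in value if ord(ch) >= 0x20 and ord(ch) != 0x7F)


def strip_trailing_equals(value: str) -> str:
    """Linear-time replacement for the regex /=+$/: walk back from the end of the string."""
    end = len(value)
    while end > 0 and value[end - 1] == "=":
        end -= 1
    return value[:end]


def append_line(path: str, line: str) -> None:
    """Append without an exists() pre-check; mode "a" creates the file on demand."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")


# Demonstration in a throwaway directory (file name is hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "audit.log")
    append_line(log_path, sanitize_for_log("user\r\nFAKE: entry"))
    append_line(log_path, strip_trailing_equals("YWJj=="))
    with open(log_path, encoding="utf-8") as handle:
        print(handle.read())
```

Dropping the existence check removes the window between check and write that the race alert points at, and the manual loop does a bounded amount of work per character, so input length cannot trigger super-linear matching.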
Related
License Policy — dependency-side supply chain
Key Rotation Runbook — for the cases where rotation IS required (RPC keys, deployer private keys, NPM_TOKEN)
Deployment Approvals — the human gate layered on top of the technical gates
Phase 10 work in the project memory tracker — broader security posture