Standard operating procedures for environment promotion and rollback on Lithosphere Makalu.
LOCAL --> TESTNET (Makalu) --> MAINNET
          auto on push         manual via promote.yaml
          deploy-simple.yaml   GitHub Environment approval

Staging is optional; promote directly from testnet to mainnet when staging is not configured.
Testnet (Automatic)
Testnet deploys automatically on every push to main:
CI passes (lint, build, test, gitleaks)
deploy-simple.yaml triggers automatically
SSH deploy via bastion to EC2
Health checks verify API and Explorer
No manual intervention required.
Mainnet (Manual with Approval)
Identify the image tag to promote:
Trigger promotion via GitHub Actions:
Go to Actions > "Promote" workflow
Select:
Image tag: the tag from step 1 (e.g. sha-abc1234 or testnet-20260307-143000)
Approval gate:
A reviewer listed in the mainnet GitHub Environment must approve
Configure reviewers: Settings > Environments > mainnet > Required reviewers
Reviewer verifies the image tag was tested on testnet
Deployment proceeds automatically after approval:
Images retagged with mainnet and mainnet-{timestamp}
SSH deploy to EC2 via bastion
Health checks run post-deploy
Automatic rollback on failure
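Before triggering the workflow it can help to sanity-check the tag and preview the tags the promotion will apply. The sketch below is illustrative only: the function names are hypothetical, the accepted shapes are inferred from the two example tags above, and the mainnet timestamp format is assumed to mirror the testnet one.

```sh
# Accept the two tag shapes shown above: sha-<7 hex chars> or testnet-YYYYMMDD-HHMMSS.
is_promotable_tag() {
  printf '%s\n' "$1" | grep -Eq '^(sha-[0-9a-f]{7}|testnet-[0-9]{8}-[0-9]{6})$'
}

# Print the tags applied on promotion: "mainnet" and "mainnet-<timestamp>".
# The timestamp format is an assumption that mirrors the testnet tag.
mainnet_tags() {
  ts=${1:-$(date -u +%Y%m%d-%H%M%S)}
  printf 'mainnet\nmainnet-%s\n' "$ts"
}
```

For example, `is_promotable_tag "$TAG" && mainnet_tags` rejects a tag such as `latest` before any workflow is dispatched.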
Deployment Windows
Environment    Days    Hours (UTC)
Freeze periods: Dec 20 - Jan 5, and during chain events. Allow at least 48h between mainnet deploys.
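The date-based rules can be checked mechanically before a promotion. This is a minimal sketch with hypothetical function names; it assumes the freeze is inclusive of both end dates, and it cannot detect chain events, which still need a human check.

```sh
# Succeeds when a YYYY-MM-DD date falls inside the Dec 20 - Jan 5 freeze (inclusive).
in_freeze_period() {
  md=$(printf '%s' "$1" | cut -c6-10 | tr -d '-')   # 2026-12-25 -> 1225
  md=${md#0}                                        # 0105 -> 105
  [ "$md" -ge 1220 ] || [ "$md" -le 105 ]
}

# Succeeds when at least 48h separate the last mainnet deploy ($1) from now ($2).
# Both arguments are Unix timestamps in seconds.
min_gap_elapsed() {
  [ $(( $2 - $1 )) -ge $(( 48 * 3600 )) ]
}
```

A pre-promotion guard could then read `in_freeze_period "$(date -u +%F)" && echo "freeze in effect"`.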
Rollback Procedures
Automatic Rollback
Both deploy-simple.yaml and deploy.yaml include automatic rollback:
A rollback snapshot is saved before each deploy (compose file, .env, image list)
If health checks fail after deploy, the rollback job restores the snapshot
The rollback job triggers automatically — no manual intervention needed
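To make the snapshot idea concrete, here is a rough sketch of saving and restoring the three items listed above. It is not the workflows' actual implementation: the function names, the `docker-compose.yml` file name, and the `images.txt` file are assumptions.

```sh
# Save the compose file, .env, and image list into a snapshot directory.
save_snapshot() {
  app_dir=$1; snap_dir=$2
  mkdir -p "$snap_dir" || return 1
  cp "$app_dir/docker-compose.yml" "$app_dir/.env" "$snap_dir/" || return 1
  # Record image names when Docker is available; skipped otherwise.
  if command -v docker >/dev/null 2>&1; then
    (cd "$app_dir" && docker compose config --images) > "$snap_dir/images.txt" 2>/dev/null || true
  fi
}

# Copy the saved compose file and .env back over the live ones.
restore_snapshot() {
  app_dir=$1; snap_dir=$2
  cp "$snap_dir/docker-compose.yml" "$snap_dir/.env" "$app_dir/"
}
```

After a restore, running `docker compose up -d` in the application directory would bring the stack back to the snapshotted configuration.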
Manual Rollback (SSH)
If automatic rollback fails or you need to roll back outside of CI:
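One possible shape for an SSH rollback through the bastion is sketched below. Every host, user, and path is a placeholder, the snapshot layout is assumed, and paths containing spaces are not handled.

```sh
# Hypothetical defaults; override them with the real bastion, host, and paths.
: "${BASTION:=deploy@bastion.example.com}"
: "${TARGET:=deploy@10.0.0.10}"
: "${APP_DIR:=/opt/app}"

# Build the remote command that restores a snapshot and restarts the stack.
remote_rollback_script() {
  printf 'cd %s && cp %s/docker-compose.yml %s/.env . && docker compose up -d\n' \
    "$APP_DIR" "$1" "$1"
}

# Run the rollback on the target, hopping through the bastion with ProxyJump (-J).
manual_rollback() {
  ssh -J "$BASTION" "$TARGET" "$(remote_rollback_script "$1")"
}
```

Printing `remote_rollback_script /path/to/snapshot` first is a cheap way to review the exact command before running `manual_rollback`.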
Manual Rollback (Git-based)
Roll back to a specific commit:
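A Git-based rollback can be done by reverting everything after a known-good commit, which keeps history intact. This is a sketch of one approach, not necessarily the project's own procedure; the function name is hypothetical and merge commits in the range would need extra handling.

```sh
# Revert every commit after the known-good one, newest first, leaving the
# working tree identical to that commit. Requires a clean working tree.
revert_to_commit() {
  good=$1
  git cat-file -e "${good}^{commit}" 2>/dev/null || {
    echo "unknown commit: $good" >&2
    return 1
  }
  git revert --no-edit "${good}..HEAD"
}
```

Pushing the resulting revert commits to main (`git push origin main`) then redeploys testnet, since testnet deploys on every push to main.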
Mainnet Rollback Requirements
For mainnet rollbacks:
Notify the on-call engineer before starting
Execute the rollback (any method above)
Verify all health endpoints respond
File a post-incident review within 24h
Emergency Hotfix Fast-Path
For critical production issues:
Create a fix branch from main
Push and merge the fix PR (expedited review)
deploy-simple.yaml auto-deploys to testnet
Immediately trigger promote.yaml with:
skip_approval: true (emergency flag)
Image tag from the testnet deploy
Document the emergency in a post-incident review.
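The promotion step can also be dispatched from a terminal with the GitHub CLI. In this sketch `skip_approval` is the flag named above, while the `image_tag` input name and the function name are assumptions; check the inputs declared in promote.yaml before relying on it.

```sh
# Dispatch promote.yaml for an emergency hotfix.
# image_tag is a hypothetical input name; skip_approval is the flag named above.
emergency_promote() {
  tag=$1
  [ -n "$tag" ] || { echo "usage: emergency_promote <image-tag>" >&2; return 1; }
  gh workflow run promote.yaml --ref main -f image_tag="$tag" -f skip_approval=true
}
```

Usage: `emergency_promote sha-abc1234`, where the tag comes from the testnet deploy.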
Blue/Green Deploy (Optional)
For zero-downtime deploys, use the blue/green script:
The script:
Saves a rollback snapshot
Builds the new version as an isolated Docker Compose project
Health-checks the new version
Cuts over traffic only if healthy
Keeps the old version available for instant rollback
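The cutover logic in those steps can be sketched as follows. This is not the project's script: the project names, ports, `/health` path, and the file-based traffic switch are all placeholders for whatever the real stack uses, and the snapshot step is omitted for brevity.

```sh
# Return the idle color given the live one.
other_color() { case "$1" in blue) echo green ;; *) echo blue ;; esac; }

# Hypothetical port per color, so both stacks can run side by side.
port_for() { case "$1" in blue) echo 8081 ;; *) echo 8082 ;; esac; }

# Hooks: replace these with project-specific commands.
start_stack()    { docker compose -p "app-$1" up -d --build; }
stop_stack()     { docker compose -p "app-$1" down; }
check_health()   { curl -fsS --max-time 5 -o /dev/null "http://localhost:$(port_for "$1")/health"; }
switch_traffic() { echo "$1" > "${LIVE_FILE:-./live-color}"; }  # stand-in for a proxy reload

# Start the idle color, cut over only if it is healthy, and leave the old
# color running so it remains available for instant rollback.
blue_green_deploy() {
  live=$1
  new=$(other_color "$live")
  start_stack "$new" || return 1
  if check_health "$new"; then
    switch_traffic "$new"
  else
    stop_stack "$new"   # never cut over to an unhealthy stack
    return 1
  fi
}
```

Rolling back is then a matter of calling `switch_traffic` with the previous color, since that stack was never stopped.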
Health Check Endpoints
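A minimal sketch of polling a health endpoint until it responds, useful after a deploy or rollback. The function names, retry defaults, and any URL passed in are assumptions; substitute the real API and Explorer endpoints.

```sh
# Probe one URL; fails on HTTP 4xx/5xx, connection errors, or a 5 second timeout.
probe() { curl -fsS --max-time 5 -o /dev/null "$1"; }

# Retry the probe up to $2 times, sleeping $3 seconds after each failure.
wait_healthy() {
  url=$1; attempts=${2:-10}; delay=${3:-3}; i=1
  while [ "$i" -le "$attempts" ]; do
    probe "$url" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  echo "unhealthy after $attempts attempts: $url" >&2
  return 1
}
```

For example, `wait_healthy "https://api.example.com/health" 10 3` (a placeholder URL) gives a service roughly 30 seconds to come up before reporting failure.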
GitHub Environment Setup
To enable approval gates for mainnet:
Go to repo Settings > Environments
Create environment: mainnet
Add required reviewers (1-2 team leads)
Optionally add deployment branch rules (only main)
The promote.yaml workflow uses this environment for the approval gate
Repeat these steps for a staging environment if one is needed.
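The same setup can be scripted through the GitHub REST API with `gh api`. Treat this as a sketch under assumptions: the function names are hypothetical, reviewers are given as numeric GitHub user IDs, and the payload sets only required reviewers, so review the REST API documentation for environments before running it against a real repository.

```sh
# Build the JSON body listing required reviewers.
# Each argument is a numeric GitHub user ID.
reviewers_payload() {
  sep=""; out='{"reviewers":['
  for id in "$@"; do
    out="$out$sep{\"type\":\"User\",\"id\":$id}"
    sep=","
  done
  printf '%s]}\n' "$out"
}

# Create or update the mainnet environment on OWNER/REPO ($1) with those reviewers.
set_mainnet_reviewers() {
  repo=$1; shift
  reviewers_payload "$@" | gh api --method PUT "repos/$repo/environments/mainnet" --input -
}
```

Usage: `set_mainnet_reviewers my-org/my-repo 101 202`, where the repository and IDs are placeholders.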
Troubleshooting
Deploy stuck / SSH timeout
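When a deploy hangs on SSH, testing each hop separately shows whether the bastion or the target is the problem. A minimal sketch, with a hypothetical function name and caller-supplied hosts:

```sh
# Test the bastion first, then the target through it, with a 10 second
# connect timeout and no password prompts (BatchMode).
# Returns 1 if the bastion is unreachable, 2 if only the target is.
check_ssh_path() {
  bastion=$1; target=$2
  ssh -o BatchMode=yes -o ConnectTimeout=10 "$bastion" true || {
    echo "bastion unreachable: $bastion" >&2
    return 1
  }
  ssh -o BatchMode=yes -o ConnectTimeout=10 -J "$bastion" "$target" true || {
    echo "target unreachable via bastion: $target" >&2
    return 2
  }
}
```

Adding `-v` to either `ssh` call prints the connection negotiation, which usually points at a key, security group, or DNS issue.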
Containers not starting
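A first look at container state and recent logs usually narrows this down. The sketch below assumes it is run from the directory holding the compose file on the target host; the function name is hypothetical.

```sh
# Show every container in the project (including exited ones), then the
# last N log lines per service (default 100).
diagnose_containers() {
  docker compose ps --all
  docker compose logs --tail "${1:-100}" --no-color
}
```

An exited container with a non-zero status in the `ps` output, followed by its last log lines, is typically enough to tell a bad image tag from a missing .env value.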