> For the complete documentation index, see [llms.txt](https://whitepaper.litho.ai/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://whitepaper.litho.ai/docs/validators/upgrade-procedures.md).

# Upgrade Procedures

> **Work in Progress** -- This page is under active development. Detailed step-by-step instructions for each upgrade path will be added as they are finalized.

***

## Topics to Cover

### Upgrade Methods

* **Cosmovisor-based Upgrades** -- Automated binary upgrades triggered by on-chain governance proposals. Cosmovisor monitors the chain for upgrade plans and automatically swaps the binary at the specified block height.
* **Binary Replacement Upgrades** -- Manual upgrade procedure involving stopping the node, replacing the `lithod` binary, and restarting. Suitable for non-governance upgrades or emergency patches.
* **Governance Proposal Upgrades** -- End-to-end workflow for proposing, voting on, and executing chain upgrades through the on-chain governance module.
* **Rollback Procedures** -- Steps to revert to a previous binary version and chain state in the event of a failed upgrade.

***

## Environment Promotion

All changes follow a strict promotion path through environments before reaching mainnet:

```
Devnet  -->  Staging  -->  Mainnet
```

Each environment serves as a validation gate. Changes must pass all tests, health checks, and approval requirements at each stage before promotion to the next.

***

## Mainnet Deployment Gate

The mainnet deployment process follows a controlled, auditable pipeline:

1. **Git Tag** -- A release is tagged in the source repository (e.g., `v1.2.0`).
2. **CI Build** -- The tagged commit triggers a CI pipeline that builds, tests, and signs the release artifacts.
3. **Approval** -- A designated CODEOWNER must explicitly approve the deployment.
4. **ArgoCD Sync** -- Upon approval, ArgoCD synchronizes the new version to the mainnet environment.

***

## Rollback Procedures

### Application Rollback (current flow)

> Note: This page previously documented an ArgoCD-based rollback. The ArgoCD/Kubernetes deployment was never wired into Makalu — production runs on EC2 + Docker Compose. Updated 2026-05-12 to match reality.

* **Automatic** -- `.github/workflows/deploy-simple.yaml` has a `rollback` job that runs on `failure()` and restores the previous `docker-compose.yaml` + `.env` snapshot taken at deploy start. No operator intervention needed when CI itself detects deploy failure.
* **Manual** -- Operators can re-run a previous green deploy by triggering `deploy-simple.yaml` via `workflow_dispatch` and pinning the `tag` input to the prior successful image SHA. The post-deploy SHA-verify step confirms the rollback landed.

### Database Rollback

* **Point-in-Time Recovery (PITR)** -- Database state can be restored to any point in time within the retention window. This is critical for recovering from data corruption or unintended state changes introduced by a faulty upgrade.

***

## Deployment Windows

Deployments are restricted to defined maintenance windows to minimize risk and ensure adequate staffing for monitoring:

| Environment | Allowed Days        | Allowed Hours (UTC) |
| ----------- | ------------------- | ------------------- |
| **Devnet**  | Any day             | Any time            |
| **Staging** | Monday -- Thursday  | 09:00 -- 17:00      |
| **Mainnet** | Tuesday -- Thursday | 14:00 -- 16:00      |

***

## Freeze Periods

The following freeze periods restrict mainnet deployments to reduce risk during high-impact windows:

| Rule                 | Detail                                                                                                                                       |
| -------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| **Major events**     | No mainnet deployments during major network events, governance votes, or external market events that could amplify the impact of any issues. |
| **Minimum interval** | A minimum of 48 hours must pass between successive mainnet deployments to allow adequate observation of each change.                         |
| **Holiday freeze**   | No mainnet deployments between December 20 and January 5 (inclusive) to account for reduced staffing during the holiday period.              |


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://whitepaper.litho.ai/docs/validators/upgrade-procedures.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
