diff --git a/PRD Drafts/git-repo-setup-peer-review.md b/PRD Drafts/git-repo-setup-peer-review.md
index 3e82af3..1bc7688 100644
--- a/PRD Drafts/git-repo-setup-peer-review.md
+++ b/PRD Drafts/git-repo-setup-peer-review.md
@@ -136,10 +136,14 @@ Use this for every new repo:
 
 ## 8. Fleet Credential Store Update
 
-Added to `~/.hermes/credentials/fleet.env`:
+> ⚠️ **Status:** Tokens documented here are **EXPIRED / REVOKED** (confirmed 2026-06-05 via 401 on Gitea API).
+> **Action required:** Generate new tokens via Gitea UI → User Settings → Applications → Generate New Token.
+> **Updated token values should be written to `~/.ansible/secrets/deploy_token` and `~/.hermes/credentials/fleet.env`.**
+
+Original values (for reference — **DO NOT USE**):
 ```
-GITEA_DEPLOY_TOKEN=226c3ef38eb35914ae6b647803c2e597f66f28cb
-GITEA_RW_TOKEN=968e86d51ab9b6b2a3eb5e97b391ce8c6534ec2d
+GITEA_DEPLOY_TOKEN=226c3ef38eb35914ae6b647803c2e597f66f28cb # EXPIRED
+GITEA_RW_TOKEN=968e86d51ab9b6b2a3eb5e97b391ce8c6534ec2d # EXPIRED
 ```
 
-Source of truth remains `/home/jarvis/.ansible/secrets/deploy_token`.
+Source of truth: `/home/jarvis/.ansible/secrets/deploy_token` (must be updated with new tokens).
diff --git a/PRD Drafts/terraform-lxc-deployment.md b/PRD Drafts/terraform-lxc-deployment.md
deleted file mode 100644
index 3033559..0000000
--- a/PRD Drafts/terraform-lxc-deployment.md
+++ /dev/null
@@ -1,156 +0,0 @@
-# Terraform LXC Deployment for Iron Legion — PRD
-
-**Status:** Draft | **Author:** Artemis | **Date:** 2026-06-04
-
-## 1. Objective
-
-Deploy Proxmox LXC containers via Terraform using the `bpg/proxmox` provider, running inside a custom Docker container (lazy automator pattern). Support runtime parameterization for bulk LXC creation with auto-incrementing VMID, IPv4, and naming.
-
-## 2. Architecture
-
-### 2.1 Docker Image
-
-**Base:** Custom Dockerfile extending `hashicorp/terraform:latest`
-**Provider:** `bpg/proxmox` pre-installed via `terraform init` at build time
-**Pattern:** Matches thelazyautomator's guide — local workspace mounted into container
-
-```dockerfile
-FROM hashicorp/terraform:latest
-# Pre-install bpg/proxmox provider cache
-COPY providers.tf /tmp/providers.tf
-RUN cd /tmp && terraform init -upgrade && rm -f providers.tf
-WORKDIR /workspace
-ENTRYPOINT ["terraform"]
-```
-
-### 2.2 Credential Model
-
-Proxmox API token stored in `.env` / `terraform.tfvars`, referenced as variables:
-
-```hcl
-variable "pm_api_url" {
-  default = "https://192.168.7.33:8006/api2/json"
-}
-
-variable "pm_api_token_id" {
-  default = "root@pam!terraform"
-}
-
-variable "pm_api_token_secret" {
-  default = "terraform"
-}
-```
-
-Token to be created on MK33: `pveum user token add root@pam terraform --comment "Terraform automation" --privsep 0`
-
-### 2.3 Runtime Parameterization
-
-| Parameter | Example | Effect |
-|-----------|---------|--------|
-| `count` | `4` | Number of LXCs to create |
-| `vmid_base` | `5050` | Starting VMID |
-
-Auto-derived per LXC (index `i` from 0 to `count-1`):
-- **VMID:** `vmid_base + i`
-- **Name:** `lxc-${vmid}`
-- **IPv4:** `192.168.${first2digits(vmid)}.${last2digits(vmid)}/18`
-  - Example: vmid 5050 → `192.168.50.50/18`
-  - Example: vmid 5051 → `192.168.50.51/18`
-
-### 2.4 LXC Configuration (Static)
-
-- **OS:** Debian 13 (or Debian 12 if 13 unavailable)
-- **CPU:** 1 vCPU, 2 cores
-- **RAM:** 2048 MB
-- **Storage:** 8GB rootfs on local disk (test), migrate to NFS after validation
-- **Network:** Static IPv4 with gateway `192.168.0.1`
-
-### 2.5 User / SSH (Option A First)
-
-Bake `jarvis` user + SSH key into LXC via `initialization` block:
-
-```hcl
-initialization {
-  user_account {
-    username = "jarvis"
-    keys = [file("~/.ssh/artemis_key.pub")]
-  }
-}
-```
-
-**Fallback (B):** If initialization fails after 3 attempts, set root password to `ubuntu` via `root_password` and let Ansible configure post-build.
-
-## 3. Phase Breakdown
-
-### Phase 1 — Single LXC (Plan/Build/Destroy)
-
-**Goal:** Prove the pipeline works end-to-end with one manual LXC.
-
-**Deliverables:**
-- `Dockerfile` for custom Terraform image
-- `docker-compose.yml` for local execution
-- `main.tf` — single LXC resource with hardcoded VMID
-- `providers.tf` — bpg/proxmox provider config
-- `variables.tf` — API credentials and defaults
-- `run.sh` — wrapper script for plan/apply/destroy
-
-**Test:**
-```bash
-./run.sh plan # Validate config
-./run.sh apply # Build lxc-5050
-./run.sh destroy # Clean up
-```
-
-### Phase 2 — Modular + Bulk Creation
-
-**Goal:** Add `count`, `vmid_base`, and auto-derived naming/IP.
-
-**Deliverables:**
-- `modules/lxc/` — reusable LXC module
-- `locals.tf` — VMID/IP/name calculation logic
-- `main.tf` — uses module with `count = var.lxc_count`
-- Step-counter for sequential VMID assignment
-
-**Example execution:**
-```bash
-TF_VAR_lxc_count=4 TF_VAR_vmid_base=5050 ./run.sh apply
-# Creates: lxc-5050, lxc-5051, lxc-5052, lxc-5053
-```
-
-## 4. File Structure
-
-```
-~/docker/terraform-pve/
-├── Dockerfile
-├── docker-compose.yml
-├── run.sh
-├── terraform/
-│   ├── providers.tf
-│   ├── variables.tf
-│   ├── main.tf
-│   ├── locals.tf
-│   └── modules/
-│       └── lxc/
-│           ├── main.tf
-│           ├── variables.tf
-│           └── outputs.tf
-```
-
-## 5. Open Questions
-
-1. **Debian version:** Is Debian 13 available on your PVE nodes as a template, or should we use Debian 12?
-2. **Gateway IP:** Confirm `192.168.0.1` is the correct gateway for `192.168.0.0/18` subnet?
-3. **DNS servers:** Use Technitium (`192.168.7.7`) for LXC `/etc/resolv.conf`?
-4. **SSH key:** Use `~/.ssh/artemis_key.pub` for jarvis user, or a dedicated terraform key?
-
-## 6. Decision Points
-
-| Decision | Option A | Option B |
-|----------|----------|----------|
-| Debian template | 13 (if available) | 12 (fallback) |
-| DNS | Technitium (192.168.7.7) | Router default (192.168.18.1) |
-| SSH key | artemis_key.pub | New dedicated terraform key |
-
----
-
-**Awaiting Commander Bobby approval before Phase 1 build.**
\ No newline at end of file
diff --git a/PRDs/terraform-lxc-deployment.md b/PRDs/terraform-lxc-deployment.md
index 3033559..6aacb2e 100644
--- a/PRDs/terraform-lxc-deployment.md
+++ b/PRDs/terraform-lxc-deployment.md
@@ -1,6 +1,8 @@
 # Terraform LXC Deployment for Iron Legion — PRD
 
-**Status:** Draft | **Author:** Artemis | **Date:** 2026-06-04
+**Status:** Phase 1 Complete | **Author:** Artemis | **Date:** 2026-06-04
+
+> **Phase 1 validation:** Single LXC plan/build/destroy completed successfully on MK33 (pve-swarm). All open questions resolved. Phase 2 pending.
 
 ## 1. Objective
 
@@ -10,40 +12,32 @@ Deploy Proxmox LXC containers via Terraform using the `bpg/proxmox` provider, ru
 
 ### 2.1 Docker Image
 
-**Base:** Custom Dockerfile extending `hashicorp/terraform:latest`
-**Provider:** `bpg/proxmox` pre-installed via `terraform init` at build time
-**Pattern:** Matches thelazyautomator's guide — local workspace mounted into container
+**Base:** `hashicorp/terraform:latest` with `bpg/proxmox` provider downloaded at container init
+**Provider:** `bpg/proxmox` v0.70.0
+**Pattern:** Lazy automator — local workspace mounted into container, credentials via `terraform.auto.tfvars`
 
 ```dockerfile
 FROM hashicorp/terraform:latest
-# Pre-install bpg/proxmox provider cache
-COPY providers.tf /tmp/providers.tf
-RUN cd /tmp && terraform init -upgrade && rm -f providers.tf
 WORKDIR /workspace
-ENTRYPOINT ["terraform"]
+COPY run.sh /usr/local/bin/run
+RUN chmod +x /usr/local/bin/run
+ENTRYPOINT ["bash"]
 ```
 
 ### 2.2 Credential Model
 
-Proxmox API token stored in `.env` / `terraform.tfvars`, referenced as variables:
+Native Terraform variable loading via `terraform.auto.tfvars` (no Docker env-file mapping):
 
 ```hcl
-variable "pm_api_url" {
-  default = "https://192.168.7.33:8006/api2/json"
-}
-
-variable "pm_api_token_id" {
-  default = "root@pam!terraform"
-}
-
-variable "pm_api_token_secret" {
-  default = "terraform"
-}
+# terraform/terraform.auto.tfvars
+pm_api_url = "https://192.168.7.33:8006/api2/json"
+pm_api_token_id = "root@pam!terraform"
+pm_api_token_secret = ""
 ```
 
-Token to be created on MK33: `pveum user token add root@pam terraform --comment "Terraform automation" --privsep 0`
+PVE API token created on MK33: `root@pam!terraform`. Token stored in fleet credential store.
 
-### 2.3 Runtime Parameterization
+### 2.3 Runtime Parameterization (Phase 2)
 
 | Parameter | Example | Effect |
 |-----------|---------|--------|
@@ -54,53 +48,57 @@ Auto-derived per LXC (index `i` from 0 to `count-1`):
 - **VMID:** `vmid_base + i`
 - **Name:** `lxc-${vmid}`
 - **IPv4:** `192.168.${first2digits(vmid)}.${last2digits(vmid)}/18`
-  - Example: vmid 5050 → `192.168.50.50/18`
-  - Example: vmid 5051 → `192.168.50.51/18`
 
-### 2.4 LXC Configuration (Static)
+### 2.4 LXC Configuration (Validated)
 
-- **OS:** Debian 13 (or Debian 12 if 13 unavailable)
-- **CPU:** 1 vCPU, 2 cores
+- **OS:** Debian 12 (`debian-12-standard_12.2-1_amd64.tar.zst`)
+- **CPU:** 1 vCPU
 - **RAM:** 2048 MB
-- **Storage:** 8GB rootfs on local disk (test), migrate to NFS after validation
-- **Network:** Static IPv4 with gateway `192.168.0.1`
+- **Storage:** 8GB rootfs on `local` directory (test phase)
+- **Network:** Static IPv4, gateway `192.168.18.1`, subnet `/18`
+- **DNS:** `192.168.7.7`, `192.168.18.1`, `1.1.1.1`
+- **Privilege:** Unprivileged (`unprivileged = true`)
+- **Features:** Nesting enabled (`features { nesting = true }`)
 
-### 2.5 User / SSH (Option A First)
-
-Bake `jarvis` user + SSH key into LXC via `initialization` block:
+### 2.5 User / SSH (Tested)
 
 ```hcl
 initialization {
   user_account {
     username = "jarvis"
-    keys = [file("~/.ssh/artemis_key.pub")]
+    password = "" # Required for console login verification
+    keys = [file("artemis_key.pub")]
   }
 }
 ```
 
-**Fallback (B):** If initialization fails after 3 attempts, set root password to `ubuntu` via `root_password` and let Ansible configure post-build.
-
 ## 3. Phase Breakdown
 
-### Phase 1 — Single LXC (Plan/Build/Destroy)
+### Phase 1 — Single LXC (Plan/Build/Destroy) ✅ COMPLETE
 
-**Goal:** Prove the pipeline works end-to-end with one manual LXC.
+**Completed:** 2026-06-04 on MK33 (pve-swarm, cluster node 33)
 
-**Deliverables:**
-- `Dockerfile` for custom Terraform image
-- `docker-compose.yml` for local execution
-- `main.tf` — single LXC resource with hardcoded VMID
-- `providers.tf` — bpg/proxmox provider config
-- `variables.tf` — API credentials and defaults
-- `run.sh` — wrapper script for plan/apply/destroy
+**Results:**
+- `Dockerfile` — simplified to official `hashicorp/terraform:latest` image
+- `docker-compose.yml` — workspace mount, no env-file credential mapping
+- `run.sh` — wrapper for `terraform plan/apply/destroy`
+- `terraform/providers.tf` — `bpg/proxmox` v0.70.0
+- `terraform/main.tf` — single LXC resource (VMID 5050)
+- `terraform/terraform.auto.tfvars` — native Terraform credential loading
 
-**Test:**
+**Validated:**
 ```bash
-./run.sh plan # Validate config
-./run.sh apply # Build lxc-5050
-./run.sh destroy # Clean up
+./run.sh plan # ✅ Validated
+./run.sh apply # ✅ Created lxc-5050 (debian-12, 192.168.50.50/18)
+./run.sh destroy # ✅ Clean teardown
 ```
 
+**Key fixes discovered during testing:**
+- Storage pool: `local-lvm` missing → used `local` (Directory)
+- Template path: `nas-ct-stor:vztmpl/` (NFS shared templates)
+- Unprivileged required: `unprivileged = true` + `features { nesting = true }`
+- Password injection: `user_account.password` required for console login verification
+
 ### Phase 2 — Modular + Bulk Creation
 
 **Goal:** Add `count`, `vmid_base`, and auto-derived naming/IP.
@@ -109,7 +107,6 @@ initialization {
 - `modules/lxc/` — reusable LXC module
 - `locals.tf` — VMID/IP/name calculation logic
 - `main.tf` — uses module with `count = var.lxc_count`
-- Step-counter for sequential VMID assignment
 
 **Example execution:**
 ```bash
@@ -125,32 +122,32 @@ TF_VAR_lxc_count=4 TF_VAR_vmid_base=5050 ./run.sh apply
 ├── docker-compose.yml
 ├── run.sh
 ├── terraform/
-│   ├── providers.tf
-│   ├── variables.tf
+│   ├── .terraform/
 │   ├── main.tf
-│   ├── locals.tf
-│   └── modules/
-│       └── lxc/
-│           ├── main.tf
-│           ├── variables.tf
-│           └── outputs.tf
+│   ├── providers.tf
+│   ├── terraform.auto.tfvars # Credentials (not committed)
+│   ├── terraform.tfstate
+│   ├── variables.tf
+│   └── artemis_key.pub
 ```
 
-## 5. Open Questions
+## 5. Resolved Decisions
 
-1. **Debian version:** Is Debian 13 available on your PVE nodes as a template, or should we use Debian 12?
-2. **Gateway IP:** Confirm `192.168.0.1` is the correct gateway for `192.168.0.0/18` subnet?
-3. **DNS servers:** Use Technitium (`192.168.7.7`) for LXC `/etc/resolv.conf`?
-4. **SSH key:** Use `~/.ssh/artemis_key.pub` for jarvis user, or a dedicated terraform key?
+| Decision | Chosen | Notes |
+|----------|--------|-------|
+| Debian template | **12** | `debian-12-standard_12.2-1_amd64.tar.zst` on `nas-ct-stor` |
+| Gateway | **192.168.18.1** | Router IP for 192.168.0.0/18 subnet |
+| DNS | **192.168.7.7, 192.168.18.1, 1.1.1.1** | Technitium primary + fallback |
+| SSH key | **artemis_key.pub** | Already registered fleet-wide |
+| Storage (Phase 1) | **local** | `local-lvm` missing on nodes; migrate to `truenas-nfs` in Phase 2 |
+| Privilege | **Unprivileged** | `unprivileged = true` with `nesting = true` for systemd 252 |
+| Credential loading | **terraform.auto.tfvars** | Native Terraform pattern; no Docker env-file complexity |
 
-## 6. Decision Points
+## 6. Fleet Notes
 
-| Decision | Option A | Option B |
-|----------|----------|----------|
-| Debian template | 13 (if available) | 12 (fallback) |
-| DNS | Technitium (192.168.7.7) | Router default (192.168.18.1) |
-| SSH key | artemis_key.pub | New dedicated terraform key |
-
----
-
-**Awaiting Commander Bobby approval before Phase 1 build.**
\ No newline at end of file
+- PVE API token: `root@pam!terraform` (Secret: fleet credential store)
+- PVE root password: `proxmox12` (fleet credential store)
+- Cluster: `pve-swarm` (MK33, MK34, MK39)
+- Template storage: `nas-ct-stor` (NFS from TrueNAS)
+- Disk storage (test): `local`
+- SSH push path: `git@192.168.192.24:2222/Iron-Legion/terraform-pve.git` (direct Neo IP, `artemis_key`)
\ No newline at end of file
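
Review sketch for the section 2.3 derivation rule (VMID, name, IPv4) that Phase 2 still has to implement in `locals.tf`. This is only a shell illustration of the arithmetic, not the Terraform implementation. The function names `vmid_to_name` and `vmid_to_ipv4` are hypothetical, and it assumes VMIDs are always four digits, so that "first two digits" and "last two digits" mean `vmid / 100` and `vmid % 100`. It also assumes the third octet should stay inside `192.168.0.0/18` (0 to 63), which the PRD does not state explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the PRD): derive the LXC name and the
# IPv4 address from a VMID, following the rule in section 2.3.

vmid_to_name() {
  printf 'lxc-%d\n' "$1"
}

vmid_to_ipv4() {
  local vmid="$1"
  # Assumption: VMIDs are exactly four digits.
  if (( vmid < 1000 || vmid > 9999 )); then
    echo "vmid must be four digits: ${vmid}" >&2
    return 1
  fi
  local third=$(( vmid / 100 ))   # first two digits
  local fourth=$(( vmid % 100 ))  # last two digits
  # 192.168.0.0/18 spans 192.168.0.0 through 192.168.63.255.
  if (( third > 63 )); then
    echo "vmid ${vmid} maps outside 192.168.0.0/18" >&2
    return 1
  fi
  printf '192.168.%d.%d/18\n' "$third" "$fourth"
}

# Mirror the Phase 2 example: four containers starting at VMID 5050.
vmid_base=5050
lxc_count=4
for (( i = 0; i < lxc_count; i++ )); do
  vmid=$(( vmid_base + i ))
  echo "$(vmid_to_name "$vmid") $(vmid_to_ipv4 "$vmid")"
done
```

With `vmid_base=5050` and `lxc_count=4` the loop prints the same four names as the Phase 2 example, each paired with its derived address. A VMID such as 6400 is rejected under the stated `/18` assumption.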