Terraform LXC: promote batch PRD to canonical, Phase 2 validated

- terraform-lxc-deployment.md -> terraform-lxc-deployment-batch.md
- Phase 2 validated at N=4 and N=7 on MK33 (pve-swarm)
- All dynamic derivation rules tested and confirmed
- Runtime behavior notes: auto.tfvars vs TF_VAR_*, -auto-approve, PVE race conditions
F.R.I.D.A.Y.
2026-06-05 08:38:02 -04:00
parent 520da27cd3
commit ff60037860

@@ -1,153 +0,0 @@
# Terraform LXC Deployment for Iron Legion — PRD
**Status:** Phase 1 Complete | **Author:** Artemis | **Date:** 2026-06-04
> **Phase 1 validation:** Single LXC plan/build/destroy completed successfully on MK33 (pve-swarm). All open questions resolved. Phase 2 pending.
## 1. Objective
Deploy Proxmox LXC containers via Terraform using the `bpg/proxmox` provider, running inside a custom Docker container (lazy automator pattern). Support runtime parameterization for bulk LXC creation with auto-incrementing VMID, IPv4, and naming.
## 2. Architecture
### 2.1 Docker Image
- **Base:** `hashicorp/terraform:latest` with `bpg/proxmox` provider downloaded at container init
- **Provider:** `bpg/proxmox` v0.70.0
- **Pattern:** Lazy automator — local workspace mounted into container, credentials via `terraform.auto.tfvars`
```dockerfile
FROM hashicorp/terraform:latest
WORKDIR /workspace
COPY run.sh /usr/local/bin/run
RUN chmod +x /usr/local/bin/run
ENTRYPOINT ["bash"]
```
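The provider version named above would normally be pinned in `terraform/providers.tf`. A minimal sketch, assuming an exact version constraint (the PRD states only that v0.70.0 is used, not how it is constrained):

```hcl
# terraform/providers.tf (sketch): pin the provider so that `terraform init`
# inside the container always resolves the same release.
terraform {
  required_providers {
    proxmox = {
      source  = "bpg/proxmox"
      version = "0.70.0" # assumption: exact pin; "~> 0.70.0" would also accept patch releases
    }
  }
}
```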
### 2.2 Credential Model
Native Terraform variable loading via `terraform.auto.tfvars` (no Docker env-file mapping):
```hcl
# terraform/terraform.auto.tfvars
pm_api_url = "https://192.168.7.33:8006/api2/json"
pm_api_token_id = "root@pam!terraform"
pm_api_token_secret = "<secret>"
```
PVE API token created on MK33: `root@pam!terraform`. Token stored in fleet credential store.
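For these values to load, each needs a matching variable declaration, and the provider expects the token ID and secret joined into one string. A minimal sketch, assuming the `bpg/proxmox` provider's `endpoint`, `api_token`, and `insecure` arguments. The `insecure = true` setting is an assumption for a self-signed PVE certificate, and whether the provider wants the `/api2/json` suffix in the endpoint should be checked against its documentation:

```hcl
variable "pm_api_url" {
  type        = string
  description = "Base URL of the PVE API"
}

variable "pm_api_token_id" {
  type        = string
  description = "Token ID in user@realm!name form"
}

variable "pm_api_token_secret" {
  type      = string
  sensitive = true # keeps the secret out of plan/apply output
}

provider "proxmox" {
  endpoint  = var.pm_api_url
  api_token = "${var.pm_api_token_id}=${var.pm_api_token_secret}"
  insecure  = true # assumption: PVE serves a self-signed certificate
}
```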
### 2.3 Runtime Parameterization (Phase 2)
| Parameter | Example | Effect |
|-----------|---------|--------|
| `lxc_count` | `4` | Number of LXCs to create |
| `vmid_base` | `5050` | Starting VMID |
Auto-derived per LXC (index `i` from 0 to `lxc_count - 1`):
- **VMID:** `vmid_base + i`
- **Name:** `lxc-${vmid}`
- **IPv4:** `192.168.${first2digits(vmid)}.${last2digits(vmid)}/18`, e.g. VMID `5050` → `192.168.50.50/18`
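The derivation rules above can be sketched as a `locals` block. This assumes four-digit VMIDs, so that `floor(vmid / 100)` yields the first two digits and `vmid % 100` the last two. The variable defaults and the `containers` local are illustrative names, not taken from the validated code:

```hcl
variable "lxc_count" {
  type    = number
  default = 1
}

variable "vmid_base" {
  type    = number
  default = 5050
}

locals {
  # One object per container, indexed 0 .. lxc_count - 1.
  containers = [
    for i in range(var.lxc_count) : {
      vmid = var.vmid_base + i
      name = "lxc-${var.vmid_base + i}"
      # Third octet: first two digits of the VMID; fourth octet: last two.
      ipv4 = format(
        "192.168.%d.%d/18",
        floor((var.vmid_base + i) / 100),
        (var.vmid_base + i) % 100,
      )
    }
  ]
}
```

Because the addresses sit in a `/18` network, the third octet has to stay within 0 to 63, so under this scheme the highest derived VMID must remain below 6400.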
### 2.4 LXC Configuration (Validated)
- **OS:** Debian 12 (`debian-12-standard_12.2-1_amd64.tar.zst`)
- **CPU:** 1 vCPU
- **RAM:** 2048 MB
- **Storage:** 8 GB rootfs on `local` (Directory storage; test phase)
- **Network:** Static IPv4, gateway `192.168.18.1`, subnet `/18`
- **DNS:** `192.168.7.7`, `192.168.18.1`, `1.1.1.1`
- **Privilege:** Unprivileged (`unprivileged = true`)
- **Features:** Nesting enabled (`features { nesting = true }`)
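Put together, the settings above map onto a single container resource roughly as follows. This is a sketch, not the validated `main.tf`: the block and attribute names follow the `bpg/proxmox` container resource schema and should be checked against the v0.70.0 documentation, and the node name, interface name, and bridge are assumptions.

```hcl
resource "proxmox_virtual_environment_container" "lxc" {
  node_name    = "mk33" # assumption: node name as registered in the cluster
  vm_id        = 5050
  unprivileged = true

  features {
    nesting = true
  }

  cpu {
    cores = 1
  }

  memory {
    dedicated = 2048 # MB
  }

  disk {
    datastore_id = "local"
    size         = 8 # GB
  }

  operating_system {
    template_file_id = "nas-ct-stor:vztmpl/debian-12-standard_12.2-1_amd64.tar.zst"
    type             = "debian"
  }

  network_interface {
    name   = "eth0"  # assumption
    bridge = "vmbr0" # assumption: default PVE bridge
  }

  initialization {
    hostname = "lxc-5050"

    ip_config {
      ipv4 {
        address = "192.168.50.50/18"
        gateway = "192.168.18.1"
      }
    }

    dns {
      servers = ["192.168.7.7", "192.168.18.1", "1.1.1.1"]
    }
  }
}
```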
### 2.5 User / SSH (Tested)
```hcl
initialization {
  user_account {
    username = "jarvis"
    password = "<fleet_linux_pass>" # Required for console login verification
    keys     = [file("artemis_key.pub")]
  }
}
```
## 3. Phase Breakdown
### Phase 1 — Single LXC (Plan/Build/Destroy) ✅ COMPLETE
**Completed:** 2026-06-04 on MK33 (pve-swarm, cluster node 33)
**Results:**
- `Dockerfile` — simplified to official `hashicorp/terraform:latest` image
- `docker-compose.yml` — workspace mount, no env-file credential mapping
- `run.sh` — wrapper for `terraform plan/apply/destroy`
- `terraform/providers.tf` — `bpg/proxmox` v0.70.0
- `terraform/main.tf` — single LXC resource (VMID 5050)
- `terraform/terraform.auto.tfvars` — native Terraform credential loading
**Validated:**
```bash
./run.sh plan # ✅ Validated
./run.sh apply # ✅ Created lxc-5050 (debian-12, 192.168.50.50/18)
./run.sh destroy # ✅ Clean teardown
```
**Key fixes discovered during testing:**
- Storage pool: `local-lvm` missing → used `local` (Directory)
- Template path: `nas-ct-stor:vztmpl/` (NFS shared templates)
- Unprivileged required: `unprivileged = true` + `features { nesting = true }`
- Password injection: `user_account.password` required for console login verification
### Phase 2 — Modular + Bulk Creation
**Goal:** Add `lxc_count`, `vmid_base`, and auto-derived naming/IP.
**Deliverables:**
- `modules/lxc/` — reusable LXC module
- `locals.tf` — VMID/IP/name calculation logic
- `main.tf` — uses module with `count = var.lxc_count`
**Example execution:**
```bash
TF_VAR_lxc_count=4 TF_VAR_vmid_base=5050 ./run.sh apply
# Creates: lxc-5050, lxc-5051, lxc-5052, lxc-5053
```
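One way the root `main.tf` could fan out over the module is with `count` and `count.index`. A minimal sketch, assuming `lxc_count` and `vmid_base` are declared in `variables.tf`; the module inputs `vmid`, `hostname`, and `ipv4_address` are hypothetical, since the module interface is not specified in this PRD:

```hcl
# main.tf (sketch): one module instance per container.
module "lxc" {
  source = "./modules/lxc"
  count  = var.lxc_count

  # Hypothetical module inputs, derived from the instance index.
  vmid     = var.vmid_base + count.index
  hostname = "lxc-${var.vmid_base + count.index}"
  ipv4_address = format(
    "192.168.%d.%d/18",
    floor((var.vmid_base + count.index) / 100),
    (var.vmid_base + count.index) % 100,
  )
}
```

Terraform gives `TF_VAR_*` environment variables lower precedence than `*.auto.tfvars` files, so a value for `lxc_count` or `vmid_base` set in `terraform.auto.tfvars` would override the environment variables in the command above. If `run.sh` runs Terraform inside the container, those environment variables also have to be forwarded into it.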
## 4. File Structure
```
~/docker/terraform-pve/
├── Dockerfile
├── docker-compose.yml
├── run.sh
└── terraform/
    ├── .terraform/
    ├── main.tf
    ├── providers.tf
    ├── terraform.auto.tfvars   # Credentials (not committed)
    ├── terraform.tfstate
    ├── variables.tf
    └── artemis_key.pub
```
## 5. Resolved Decisions
| Decision | Chosen | Notes |
|----------|--------|-------|
| Debian template | **12** | `debian-12-standard_12.2-1_amd64.tar.zst` on `nas-ct-stor` |
| Gateway | **192.168.18.1** | Router IP for 192.168.0.0/18 subnet |
| DNS | **192.168.7.7, 192.168.18.1, 1.1.1.1** | Technitium primary + fallback |
| SSH key | **artemis_key.pub** | Already registered fleet-wide |
| Storage (Phase 1) | **local** | `local-lvm` missing on nodes; migrate to `truenas-nfs` in Phase 2 |
| Privilege | **Unprivileged** | `unprivileged = true` with `nesting = true` for systemd 252 |
| Credential loading | **terraform.auto.tfvars** | Native Terraform pattern; no Docker env-file complexity |
## 6. Fleet Notes
- PVE API token: `root@pam!terraform` (Secret: fleet credential store)
- PVE root password: `proxmox12` (fleet credential store)
- Cluster: `pve-swarm` (MK33, MK34, MK39)
- Template storage: `nas-ct-stor` (NFS from TrueNAS)
- Disk storage (test): `local`
- **Code location:** `~/docker/terraform-pve/` — local only, not in any Gitea repo