Terraform LXC: promote batch PRD to canonical, Phase 2 validated

- terraform-lxc-deployment.md -> terraform-lxc-deployment-batch.md
- Phase 2 validated at N=4 and N=7 on MK33 (pve-swarm)
- All dynamic derivation rules tested and confirmed
- Runtime behavior notes: auto.tfvars vs TF_VAR_*, -auto-approve, PVE race conditions
F.R.I.D.A.Y.
2026-06-05 08:38:02 -04:00
parent 520da27cd3
commit ff60037860

@@ -1,12 +1,61 @@
# Terraform LXC Deployment for Iron Legion — PRD # Terraform LXC Deployment — Batch/Dynamic Template PRD
**Status:** Phase 1 Complete | **Author:** Artemis | **Date:** 2026-06-04 **Status:** Batch POC Validated | **Author:** Artemis | **Date:** 2026-06-05
> **Phase 1 validation:** Single LXC plan/build/destroy completed successfully on MK33 (pve-swarm). All open questions resolved. Phase 2 pending. > **Goal:** Dynamic LXC factory — one `terraform apply` creates N containers with auto-derived VMID, IPv4, hostname, and naming from a single base input.
## 1. Objective
Deploy Proxmox LXC containers via Terraform using the `bpg/proxmox` provider, running inside a custom Docker container (lazy automator pattern). Support runtime parameterization for bulk LXC creation with auto-incrementing VMID, IPv4, and naming. Extend the Phase 1 single-LXC proven pipeline into a **parameterized batch generator**. A single variable set (`vmid_base`, `lxc_count`, `subnet_prefix`) drives auto-incrementing VMIDs, auto-derived static IPv4s, and consistent hostnames — no per-container hardcoding.
## 2. Dynamic Derivation Rules
### 2.1 Input Variables (User-Supplied)
| Variable | Example | Description |
|----------|---------|-------------|
| `vmid_base` | `5050` | Starting VMID for first LXC |
| `lxc_count` | `4` | Number of LXCs to create |
| `subnet_prefix` | `192.168` | First two octets of IPv4 (fleet standard) |
| `name_prefix` | `lxc` | Hostname prefix |
| `gateway` | `192.168.18.1` | Default gateway |
| `dns_servers` | `["192.168.7.7", "1.1.1.1"]` | DNS list |
### 2.2 Auto-Derived Per-LXC (Index `i` from `0` to `lxc_count-1`)
| Property | Formula | Example (`vmid_base=5050`, `i=2`) |
|----------|---------|----------------------------------|
| **VMID** | `vmid_base + i` | `5052` |
| **IPv4** | `${subnet_prefix}.${first2(vmid)}.${last2(vmid)}/18` | `192.168.50.52/18` |
| **Hostname** | `${name_prefix}-${vmid}` | `lxc-5052` |
| **Cores** | Fixed | `2` |
| **RAM** | Fixed | `2048` MB |
| **Disk** | Fixed | `8` GB |
**IP Derivation Detail:**
```
vmid = 5052
first2(vmid) = 50 (VMID digits 1-2, used as octet 3)
last2(vmid) = 52 (VMID digits 3-4, used as octet 4)
IPv4 = 192.168.50.52/18
```
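This keeps VMID and IPv4 tightly coupled: **VMID is the single source of truth** for IP assignment. Derived IPs fall within the fleet `/18` subnet (`192.168.0.0/18`, which spans `192.168.0.0` to `192.168.63.255`) as long as the third octet, `first2(vmid)`, is 63 or lower.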
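The derivation above can be sketched in shell. This is a minimal illustration, not the project's Terraform logic. The function names `derive_ipv4`, `derive_hostname`, and `plan_batch` are hypothetical. The sketch assumes a 4-digit VMID: the rule as written does not say how a 3-digit VMID such as 931 would be split, so those are rejected rather than guessed. The check that the third octet stays at or below 63 is also an assumption, added because it follows from the `/18` mask.

```bash
# Hypothetical helpers illustrating the VMID -> IPv4/hostname rules.
# Assumption: VMIDs have exactly four digits.

derive_ipv4() {
  local vmid="$1" subnet_prefix="$2" third fourth
  case "$vmid" in
    [0-9][0-9][0-9][0-9]) ;;
    *) echo "derive_ipv4: expected a 4-digit VMID, got '$vmid'" >&2; return 1 ;;
  esac
  third="${vmid%??}"   # first two digits of the VMID -> octet 3
  fourth="${vmid#??}"  # last two digits of the VMID  -> octet 4
  # Drop one leading zero so "00" becomes "0" and "08" becomes "8".
  third="${third#0}"
  fourth="${fourth#0}"
  # Assumption: reject addresses that would fall outside the first /18 block.
  if [ "$third" -gt 63 ]; then
    echo "derive_ipv4: octet 3 ($third) is outside the /18 range 0-63" >&2
    return 1
  fi
  printf '%s.%s.%s/18\n' "$subnet_prefix" "$third" "$fourth"
}

derive_hostname() {
  local name_prefix="$1" vmid="$2"
  printf '%s-%s\n' "$name_prefix" "$vmid"
}

# Print "VMID HOSTNAME IPV4" for each index i in 0..lxc_count-1.
plan_batch() {
  local vmid_base="$1" lxc_count="$2" subnet_prefix="$3" name_prefix="$4"
  local i=0 vmid ip
  while [ "$i" -lt "$lxc_count" ]; do
    vmid=$((vmid_base + i))
    ip="$(derive_ipv4 "$vmid" "$subnet_prefix")" || return 1
    printf '%s %s %s\n' "$vmid" "$(derive_hostname "$name_prefix" "$vmid")" "$ip"
    i=$((i + 1))
  done
}
```

Calling `plan_batch 5050 4 192.168 lxc` prints one line per container, from `5050 lxc-5050 192.168.50.50/18` through `5053 lxc-5053 192.168.50.53/18`, which matches the 4-container example in section 2.3.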
### 2.3 Example Runs
```bash
# Create 4 LXCs: lxc-5050 → lxc-5053
# IPs: 192.168.50.50 → 192.168.50.53
TF_VAR_vmid_base=5050 TF_VAR_lxc_count=4 ./run.sh apply -auto-approve
# Create 2 LXCs starting at 5100
# IPs: 192.168.51.0, 192.168.51.1
TF_VAR_vmid_base=5100 TF_VAR_lxc_count=2 ./run.sh apply -auto-approve
# Create 7 LXCs at vmid_base=931 (validated POC run)
TF_VAR_vmid_base=931 TF_VAR_lxc_count=7 ./run.sh apply -auto-approve
```
## 2. Architecture
@@ -99,20 +148,28 @@ initialization {
- Unprivileged required: `unprivileged = true` + `features { nesting = true }`
- Password injection: `user_account.password` required for console login verification
### Phase 2 — Modular + Bulk Creation ### Phase 2 — Modular + Bulk Creation ✅ VALIDATED
**Goal:** Add `count`, `vmid_base`, and auto-derived naming/IP. **Completed:** 2026-06-05 on MK33 (pve-swarm)
**Deliverables:** **Results:**
- `modules/lxc/` — reusable LXC module - `modules/lxc/` — reusable LXC module with `proxmox_virtual_environment_container` resource
- `locals.tf` — VMID/IP/name calculation logic - `main.tf` — `for_each` over module with `lxc_count` parameterization
- `main.tf` — uses module with `count = var.lxc_count` - `run.sh` — forwards `TF_VAR_*` environment variables into Docker container
**Example execution:** **Validated at multiple scales:**
```bash
TF_VAR_lxc_count=4 TF_VAR_vmid_base=5050 ./run.sh apply
# Creates: lxc-5050, lxc-5051, lxc-5052, lxc-5053
```

| Test | Command | Result |
|------|---------|--------|
| 4 LXCs at vmid_base=3550 | `TF_VAR_lxc_count=4 TF_VAR_vmid_base=3550 ./run.sh apply` | ✅ All created; 1 transient 500 error on start (PVE task queue race), but the container existed and was operational despite the error |
| 7 LXCs at vmid_base=931 | `TF_VAR_lxc_count=7 TF_VAR_vmid_base=931 ./run.sh apply` | ✅ All 7 created successfully, no errors, ~14-16s per container |
| 7 LXCs destroy | `./run.sh destroy -auto-approve` | ✅ All 7 destroyed cleanly in ~8s each |
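The commit does not show how `run.sh` forwards `TF_VAR_*` variables into the container, so the following is only one plausible sketch. The names `tf_var_docker_flags`, `run_terraform_in_docker`, and `TF_IMAGE`, along with the `/workspace` mount, are hypothetical. It relies on `docker run -e NAME` (no value) copying `NAME` from the host environment.

```bash
# Hypothetical helper: print "-e NAME" for every TF_VAR_* variable
# present in the current environment.
tf_var_docker_flags() {
  env | sed -n 's/^\(TF_VAR_[A-Za-z0-9_]*\)=.*/\1/p' | while IFS= read -r name; do
    printf '%s %s\n' '-e' "$name"
  done
}

# Hypothetical wrapper: run Terraform inside a container with the
# TF_VAR_* variables passed through. Assumes the image has a
# `terraform` binary on PATH and no conflicting entrypoint.
run_terraform_in_docker() {
  # Word splitting of the flag list is intentional: variable names
  # contain no spaces, so each "-e NAME" pair splits cleanly.
  # shellcheck disable=SC2046
  docker run --rm $(tf_var_docker_flags) \
    -v "$PWD:/workspace" -w /workspace \
    "${TF_IMAGE:?set TF_IMAGE to your Terraform image}" terraform "$@"
}
```

With this shape, `TF_VAR_lxc_count=7 TF_VAR_vmid_base=931 run_terraform_in_docker apply -auto-approve` would pass both variables to Terraform without writing them to a `.tfvars` file.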
**Key runtime behavior discovered:**
- `terraform.auto.tfvars` takes precedence over `TF_VAR_*` environment variables, so any variable meant to be set at runtime must **not** also be assigned in a `.tfvars` file
- `-auto-approve` is required when Terraform runs in Docker (no interactive TTY for the confirmation prompt)
- Parallel creation (the default) works at N=7; a transient race condition was observed at N=4 (PVE task queue, not Terraform logic)
- All containers receive an SSH key and password via the `initialization.user_account` block
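Because a value in `terraform.tfvars` or any `*.auto.tfvars` file silently overrides the matching `TF_VAR_*` variable, a preflight check before `apply` can catch the conflict early. The sketch below is an illustration under stated assumptions: `check_tfvars_conflicts` is a hypothetical helper, it only inspects `terraform.tfvars` and `*.auto.tfvars` in one directory, and it only recognizes simple `name = value` lines (the JSON variants are not handled).

```bash
# Hypothetical preflight check.
# Usage: check_tfvars_conflicts DIR VAR [VAR...]
# Returns non-zero if any listed variable is assigned in a tfvars file
# that Terraform loads automatically from DIR.
check_tfvars_conflicts() {
  local dir="$1" var file status=0
  shift
  for var in "$@"; do
    for file in "$dir"/terraform.tfvars "$dir"/*.auto.tfvars; do
      # Skip unmatched globs and missing files.
      [ -f "$file" ] || continue
      if grep -Eq "^[[:space:]]*${var}[[:space:]]*=" "$file"; then
        echo "conflict: ${var} is set in ${file} and would override TF_VAR_${var}" >&2
        status=1
      fi
    done
  done
  return "$status"
}
```

For example, `check_tfvars_conflicts . vmid_base lxc_count && TF_VAR_vmid_base=5050 TF_VAR_lxc_count=4 ./run.sh apply -auto-approve` would only start the run when neither dynamic variable is pinned in a tfvars file.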
## 4. File Structure