Terraform LXC Deployment — Batch/Dynamic Template PRD
Status: Batch POC Validated | Author: Artemis | Date: 2026-06-05
Goal: Dynamic LXC factory: one `terraform apply` creates N containers with auto-derived VMID, IPv4, hostname, and naming from a single base input.
1. Objective
Extend the proven Phase 1 single-LXC pipeline into a parameterized batch generator. A single variable set (`vmid_base`, `lxc_count`, `subnet_prefix`) drives auto-incrementing VMIDs, auto-derived static IPv4 addresses, and consistent hostnames, with no per-container hardcoding.
2. Dynamic Derivation Rules
2.1 Input Variables (User-Supplied)
| Variable | Example | Description |
|---|---|---|
| `vmid_base` | `5050` | Starting VMID for first LXC |
| `lxc_count` | `4` | Number of LXCs to create |
| `subnet_prefix` | `192.168` | First two octets of IPv4 (fleet standard) |
| `name_prefix` | `lxc` | Hostname prefix |
| `gateway` | `192.168.18.1` | Default gateway |
| `dns_servers` | `["192.168.7.7", "1.1.1.1"]` | DNS list |
2.2 Auto-Derived Per-LXC (Index i from 0 to lxc_count-1)
| Property | Formula | Example (vmid_base=5050, i=2) |
|---|---|---|
| VMID | `vmid_base + i` | `5052` |
| IPv4 | `subnet_prefix.${first2(vmid)}.${last2(vmid)}/18` | `192.168.50.52/18` |
| Hostname | `${name_prefix}-${vmid}` | `lxc-5052` |
| Cores | Fixed | 2 |
| RAM | Fixed | 2048 MB |
| Disk | Fixed | 8 GB |
IP Derivation Detail:
vmid = 5052
first2(vmid) = 50 (first two digits of the VMID, used as the third octet)
last2(vmid) = 52 (last two digits of the VMID, used as the fourth octet)
IPv4 = 192.168.50.52/18
This keeps VMID and IPv4 tightly coupled: the VMID is the single source of truth for IP assignment. The fleet /18 subnet (192.168.0.0/18) spans third octets 0 to 63, so every derived IP falls inside it as long as the VMID is at most 6399.
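The derivation above can be checked by hand with a few lines of shell. This is a minimal sketch, not the project's Terraform code: the function name `derive_ipv4` is hypothetical, and it assumes that `first2(vmid)` and `last2(vmid)` are equivalent to integer division and remainder by 100, which matches the four-digit examples in this document.

```shell
#!/bin/sh
# Sketch: derive the static IPv4 for one container from its VMID.
# Assumption: first2(vmid) = vmid / 100 and last2(vmid) = vmid % 100.
derive_ipv4() {
    vmid=$1
    subnet_prefix=${2:-192.168}   # first two octets, as in section 2.1
    octet3=$((vmid / 100))
    octet4=$((vmid % 100))
    # 192.168.0.0/18 spans third octets 0 through 63.
    if [ "$octet3" -gt 63 ]; then
        echo "VMID $vmid maps outside the /18 (third octet $octet3)" >&2
        return 1
    fi
    echo "${subnet_prefix}.${octet3}.${octet4}/18"
}

derive_ipv4 5052   # prints 192.168.50.52/18
```

The range check makes the /18 constraint explicit: a VMID of 6400 or higher is rejected instead of silently producing an address outside the fleet subnet.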
2.3 Example Runs
# Create 4 LXCs: lxc-5050 → lxc-5053
# IPs: 192.168.50.50 → 192.168.50.53
TF_VAR_vmid_base=5050 TF_VAR_lxc_count=4 ./run.sh apply -auto-approve
# Create 2 LXCs starting at 5100
# IPs: 192.168.51.0, 192.168.51.1
TF_VAR_vmid_base=5100 TF_VAR_lxc_count=2 ./run.sh apply -auto-approve
# Create 7 LXCs at vmid_base=931 (validated POC run)
TF_VAR_vmid_base=931 TF_VAR_lxc_count=7 ./run.sh apply -auto-approve
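Before running an apply, it can help to list what a given `vmid_base` and `lxc_count` would produce. The following is a rough sketch only: `preview_batch` is a hypothetical helper, it hardcodes the `192.168` prefix, and it assumes the same divide-by-100 derivation as section 2.2. It does not call Terraform or Proxmox.

```shell
#!/bin/sh
# Sketch: list the hostname, VMID, and IPv4 a batch would create.
# Assumption: third octet = vmid / 100, fourth octet = vmid % 100.
preview_batch() {
    vmid_base=$1
    lxc_count=$2
    name_prefix=${3:-lxc}
    i=0
    while [ "$i" -lt "$lxc_count" ]; do
        vmid=$((vmid_base + i))
        # One tab-separated row per container: hostname, VMID, address.
        printf '%s-%s\t%s\t192.168.%d.%d/18\n' \
            "$name_prefix" "$vmid" "$vmid" $((vmid / 100)) $((vmid % 100))
        i=$((i + 1))
    done
}

preview_batch 5050 4
```

A batch that crosses a hundred boundary rolls over into the next third octet: under these assumptions, `preview_batch 5098 4` lists 192.168.50.98, 192.168.50.99, 192.168.51.0, and 192.168.51.1.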
2. Architecture
2.1 Docker Image
Base: hashicorp/terraform:latest with bpg/proxmox provider downloaded at container init
Provider: bpg/proxmox v0.70.0
Pattern: Lazy automator — local workspace mounted into container, credentials via terraform.auto.tfvars
FROM hashicorp/terraform:latest
WORKDIR /workspace
COPY run.sh /usr/local/bin/run
RUN chmod +x /usr/local/bin/run
ENTRYPOINT ["bash"]
2.2 Credential Model
Native Terraform variable loading via terraform.auto.tfvars (no Docker env-file mapping):
# terraform/terraform.auto.tfvars
pm_api_url = "https://192.168.7.33:8006/api2/json"
pm_api_token_id = "root@pam!terraform"
pm_api_token_secret = "<secret>"
PVE API token created on MK33: root@pam!terraform. Token stored in fleet credential store.
2.3 Runtime Parameterization (Phase 2)
| Parameter | Example | Effect |
|---|---|---|
| `lxc_count` | `4` | Number of LXCs to create |
| `vmid_base` | `5050` | Starting VMID |

Auto-derived per LXC (index i from 0 to lxc_count-1):
- VMID: `vmid_base + i`
- Name: `lxc-${vmid}`
- IPv4: `192.168.${first2digits(vmid)}.${last2digits(vmid)}/18`
2.4 LXC Configuration (Validated)
- OS: Debian 12 (`debian-12-standard_12.2-1_amd64.tar.zst`)
- CPU: 1 vCPU
- RAM: 2048 MB
- Storage: 8GB rootfs on `local` directory (test phase)
- Network: Static IPv4, gateway `192.168.18.1`, subnet `/18`
- DNS: `192.168.7.7`, `192.168.18.1`, `1.1.1.1`
- Privilege: Unprivileged (`unprivileged = true`)
- Features: Nesting enabled (`features { nesting = true }`)
2.5 User / SSH (Tested)
initialization {
user_account {
username = "jarvis"
password = "<fleet_linux_pass>" # Required for console login verification
keys = [file("artemis_key.pub")]
}
}
3. Phase Breakdown
Phase 1 — Single LXC (Plan/Build/Destroy) ✅ COMPLETE
Completed: 2026-06-04 on MK33 (pve-swarm, cluster node 33)
Results:
- `Dockerfile`: simplified to official `hashicorp/terraform:latest` image
- `docker-compose.yml`: workspace mount, no env-file credential mapping
- `run.sh`: wrapper for `terraform plan/apply/destroy`
- `terraform/providers.tf`: `bpg/proxmox` v0.70.0
- `terraform/main.tf`: single LXC resource (VMID 5050)
- `terraform/terraform.auto.tfvars`: native Terraform credential loading
Validated:
./run.sh plan # ✅ Validated
./run.sh apply # ✅ Created lxc-5050 (debian-12, 192.168.50.50/18)
./run.sh destroy # ✅ Clean teardown
Key fixes discovered during testing:
- Storage pool: `local-lvm` missing → used `local` (Directory)
- Template path: `nas-ct-stor:vztmpl/` (NFS shared templates)
- Unprivileged required: `unprivileged = true` + `features { nesting = true }`
- Password injection: `user_account.password` required for console login verification
Phase 2 — Modular + Bulk Creation ✅ VALIDATED
Completed: 2026-06-05 on MK33 (pve-swarm)
Results:
- `modules/lxc/`: reusable LXC module with `proxmox_virtual_environment_container` resource
- `main.tf`: `for_each` over module with `lxc_count` parameterization
- `run.sh`: forwards `TF_VAR_*` environment variables into Docker container
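The actual `run.sh` is not reproduced in this document, so the following is only a sketch of how the `TF_VAR_*` forwarding step might look. The helper name `tf_var_flags` and the image name in the usage comment are hypothetical. It relies on the `docker run` behavior that `-e NAME` without a value copies that variable from the calling environment into the container.

```shell
#!/bin/sh
# Sketch of the forwarding step; not the project's real run.sh.
# Emits one "-e NAME" flag per exported TF_VAR_* variable, so values
# never appear on the command line.
tf_var_flags() {
    env | sed -n 's/^\(TF_VAR_[A-Za-z0-9_]*\)=.*/-e \1/p' | tr '\n' ' '
}

# Hypothetical invocation (image name "terraform-pve" is an assumption):
#   docker run --rm $(tf_var_flags) -v "$PWD/terraform:/workspace" terraform-pve ...
```

Passing names rather than `NAME=value` pairs keeps the flag list short, and it avoids quoting problems for list-valued variables such as `dns_servers`.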
Validated at multiple scales:
| Test | Command | Result |
|---|---|---|
| 4 LXCs at vmid_base=3550 | `TF_VAR_lxc_count=4 TF_VAR_vmid_base=3550 ./run.sh apply` | ✅ All created; 1 transient 500 error on start (PVE task queue race); the container existed and was operational despite the error |
| 7 LXCs at vmid_base=931 | `TF_VAR_lxc_count=7 TF_VAR_vmid_base=931 ./run.sh apply` | ✅ All 7 created successfully, no errors, ~14–16s per container |
| 7 LXCs destroy | `./run.sh destroy -auto-approve` | ✅ All 7 destroyed cleanly in ~8s each |
Key runtime behavior discovered:
- `terraform.auto.tfvars` outranks `TF_VAR_*` environment variables, so dynamic variables must not be set in `.tfvars`
- `-auto-approve` required on Dockerized terraform (no interactive TTY for confirmation)
- Parallel creation (default) works at N=7; transient race condition observed at N=4 (PVE task queue, not terraform logic)
- All containers receive SSH key + password via `initialization.user_account` block
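Because a value in `terraform.auto.tfvars` silently wins over the matching `TF_VAR_*` variable, a small pre-flight check can catch the mistake before an apply. This is a sketch under assumptions: `check_tfvars` is a hypothetical helper, and the variable names passed to it are the dynamic inputs from section 2.1.

```shell
#!/bin/sh
# Sketch: fail fast when a tfvars file pins a variable that is meant
# to arrive via TF_VAR_*, since the file value would take precedence.
check_tfvars() {
    tfvars_file=$1
    shift
    status=0
    for name in "$@"; do
        # Match "name = ..." at the start of a line, ignoring indentation.
        if grep -Eq "^[[:space:]]*${name}[[:space:]]*=" "$tfvars_file"; then
            echo "$name is set in $tfvars_file and would override TF_VAR_$name" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage: check_tfvars terraform/terraform.auto.tfvars vmid_base lxc_count
```

The function returns non-zero if any listed variable is pinned in the file, so it can gate the apply step in a wrapper script.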
4. File Structure
~/docker/terraform-pve/
├── Dockerfile
├── docker-compose.yml
├── run.sh
├── terraform/
│ ├── .terraform/
│ ├── main.tf
│ ├── providers.tf
│ ├── terraform.auto.tfvars # Credentials (not committed)
│ ├── terraform.tfstate
│ ├── variables.tf
│ └── artemis_key.pub
5. Resolved Decisions
| Decision | Chosen | Notes |
|---|---|---|
| Debian template | 12 | debian-12-standard_12.2-1_amd64.tar.zst on nas-ct-stor |
| Gateway | 192.168.18.1 | Router IP for 192.168.0.0/18 subnet |
| DNS | 192.168.7.7, 192.168.18.1, 1.1.1.1 | Technitium primary + fallback |
| SSH key | artemis_key.pub | Already registered fleet-wide |
| Storage (Phase 1) | local | local-lvm missing on nodes; migrate to truenas-nfs in Phase 2 |
| Privilege | Unprivileged | unprivileged = true with nesting = true for systemd 252 |
| Credential loading | terraform.auto.tfvars | Native Terraform pattern; no Docker env-file complexity |
6. Fleet Notes
- PVE API token: `root@pam!terraform` (Secret: fleet credential store)
- PVE root password: `proxmox12` (fleet credential store)
- Cluster: `pve-swarm` (MK33, MK34, MK39)
- Template storage: `nas-ct-stor` (NFS from TrueNAS)
- Disk storage (test): `local`
- Code location: `~/docker/terraform-pve/` (local only, not in any Gitea repo)