documentation/PRD Drafts/terraform-proxmox-lxc-automation.md
F.R.I.D.A.Y. 4377ffaffa Add PRD: Terraform LXC Automation for Proxmox VE 9.2
New directories:
- PRD Drafts/      — Active PRDs pending review
- PRD archived/    — Approved/archived PRDs

Adds terraform-proxmox-lxc-automation.md:
- Provider: bpg/proxmox (actively maintained, 11M+ downloads)
- Scope: LXC creation, networking, storage, auth patterns
- Includes complete sample project tree with working HCL
- Covers API token, cloud-init, DHCP/static IP, mount points
- State backend + CI/CD integration guidance

Author: F.R.I.D.A.Y.
Date: 2026-06-01
2026-06-01 14:48:14 -04:00

# PRD: Terraform LXC Automation for Proxmox VE 9.2
**Status:** Draft — Pending Commander Bobby Review
**Author:** F.R.I.D.A.Y.
**Date:** 2026-06-01
**Provider:** `bpg/proxmox` (actively maintained, 11M+ downloads)
**Target:** Proxmox VE 9.2 / Debian Trixie
---
## 1. Purpose & Scope
This PRD defines the architecture, configuration patterns, and operational workflow for automating LXC container lifecycle management on Proxmox VE 9.2 clusters using Terraform and the actively maintained `bpg/proxmox` provider.
**In scope:**
- Terraform provider configuration and authentication
- LXC resource definitions (`proxmox_virtual_environment_container`)
- Cloud-init / template-based provisioning
- Network configuration (static IP, DHCP, bridge)
- Storage allocation (rootfs on any PVE backend)
- State management and CI/CD integration patterns
**Out of scope:**
- VM (QEMU/KVM) provisioning
- PVE cluster topology changes
- Backup/restore automation (separate PRD)
---
## 2. Success Criteria
| # | Criterion | How Verified |
|---|-----------|-------------|
| 1 | A single `terraform apply` creates a working LXC with SSH access | `ssh root@<lxc-ip>` succeeds |
| 2 | LXCs are provisioned from official cloud-image templates | Template downloaded via `proxmox_virtual_environment_download_file` |
| 3 | Network is configurable per-LXC (DHCP or static CIDR) | `ip addr` inside container matches TF config |
| 4 | Rootfs lives on user-selected storage (not hardcoded to `local-lvm`) | `pvesm status` shows volume on target datastore |
| 5 | State is stored remotely (S3-compatible or Terraform Cloud) | `terraform state list` works from any machine |
| 6 | Destroy and recreate is idempotent | `terraform destroy && terraform apply` yields identical result |
---
## 3. Provider Selection
### Why `bpg/proxmox` (not `telmate/proxmox`)
| Provider | Maintenance | Downloads | LXC Support | Notes |
|----------|-------------|-----------|-------------|-------|
| `bpg/proxmox` | ✅ Active (v0.108.0, June 2026) | 11.8M+ | Full | Community-tier, comprehensive docs, supports PVE 9.x |
| `telmate/proxmox` | ❌ Stale (last release ~2023) | Legacy | Partial | Deprecated; lacks PVE 9.x features |
**Decision:** Use `bpg/proxmox` exclusively. The `telmate` provider is unmaintained and incompatible with PVE 9.2 API changes.
**Provider block (minimum):**
```hcl
terraform {
  required_providers {
    proxmox = {
      source  = "bpg/proxmox"
      version = "~> 0.108"
    }
  }
}

provider "proxmox" {
  endpoint = "https://192.168.7.7:8006/"
  username = "root@pam"
  password = var.proxmox_password # or PROXMOX_VE_PASSWORD env var
  insecure = true                 # self-signed TLS
}
```
---
## 4. Authentication Matrix
| Method | Use Case | Config | Security |
|--------|----------|--------|----------|
| **API Token** | Production, CI/CD | `api_token = "root@pam!mytoken=abc123…"` | Highest — revocable, fine-grained |
| **Username/Password** | Development, one-offs | `username = "root@pam"`, `password = "…"` | Medium — password in env |
| **Auth Ticket** | TOTP-enabled accounts | Pre-authenticate, pass ticket | High — short-lived |
**Recommendation for Iron Legion:**
- **Development:** Use `PROXMOX_VE_PASSWORD` environment variable
- **CI/CD (future):** Create a PVE API token with a built-in role such as `PVEVMAdmin`, or a custom role, and store it in CI secrets
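
For the CI/CD case, token authentication could look like the sketch below. The variable name `proxmox_api_token` is an assumption introduced for illustration. The `api_token` argument and the `PROXMOX_VE_API_TOKEN` environment variable are the ones already used elsewhere in this PRD, and `var.proxmox_endpoint` is the variable from `variables.tf`.

```hcl
# Hypothetical variable name. Supply it from CI secrets as
# TF_VAR_proxmox_api_token, or drop the argument and export
# PROXMOX_VE_API_TOKEN instead.
variable "proxmox_api_token" {
  description = "PVE API token in the form user@realm!tokenid=secret"
  type        = string
  sensitive   = true
}

provider "proxmox" {
  endpoint  = var.proxmox_endpoint
  api_token = var.proxmox_api_token
  insecure  = true # self-signed TLS, as in the minimum provider block
}
```

Marking the variable `sensitive` keeps the token out of plan output, although it is still stored in state.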
---
## 5. Sample Project Structure
```
terraform-proxmox-lxc/
├── README.md
├── main.tf                          # Provider config
├── variables.tf                     # Input variables
├── terraform.tfvars.example         # Sample values (copy to terraform.tfvars, which is gitignored)
├── outputs.tf                       # Useful outputs (IPs, IDs)
├── versions.tf                      # Required providers, TF version, state backend
├── modules/
│   └── lxc/
│       ├── main.tf                  # proxmox_virtual_environment_container resource
│       ├── variables.tf             # Module inputs
│       └── outputs.tf               # Module outputs
├── environments/
│   ├── dev/
│   │   ├── main.tf                  # Calls modules with dev vars
│   │   └── terraform.tfvars
│   └── prod/
│       ├── main.tf
│       └── terraform.tfvars
└── templates/
    └── ubuntu-25.04-cloudimg.yaml   # Cloud-init user-data (optional)
```
### Key Files
#### `versions.tf`
```hcl
terraform {
  # use_path_style and the endpoints map need Terraform 1.6 or later
  required_version = ">= 1.6.0"

  required_providers {
    proxmox = {
      source  = "bpg/proxmox"
      version = "~> 0.108"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.6"
    }
    tls = {
      source  = "hashicorp/tls"
      version = "~> 4.0"
    }
  }

  # Remote state: S3-compatible (Minio, Garage, AWS S3)
  backend "s3" {
    bucket = "iron-legion-terraform"
    key    = "proxmox-lxc/terraform.tfstate"
    region = "us-east-1"

    # Replaces the deprecated top-level `endpoint` argument
    endpoints = {
      s3 = "https://s3.nb.bobbysh.me"
    }
    use_path_style = true

    # Skip AWS-specific validations for self-hosted S3
    skip_credentials_validation = true
    skip_metadata_api_check     = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
  }
}
```
#### `variables.tf`
```hcl
variable "proxmox_endpoint" {
  description = "PVE API URL"
  type        = string
  default     = "https://192.168.7.7:8006/"
}

variable "proxmox_node" {
  description = "Target PVE node name"
  type        = string
  default     = "mk7"
}

variable "ssh_public_key" {
  description = "SSH public key for root access"
  type        = string
}

variable "lxc_configs" {
  description = "Map of LXC configurations"
  type = map(object({
    vm_id        = number
    hostname     = string
    cores        = optional(number, 2)
    memory       = optional(number, 2048)
    disk_size    = optional(number, 8)
    datastore_id = optional(string, "local-lvm")
    ip_address   = optional(string, "dhcp")
    gateway      = optional(string, null)
    template_url = optional(string, "https://mirrors.servercentral.com/ubuntu-cloud-images/releases/25.04/release/ubuntu-25.04-server-cloudimg-amd64-root.tar.xz")
    features = optional(object({
      nesting = optional(bool, true)
      fuse    = optional(bool, false)
      keyctl  = optional(bool, false)
    }), {})
  }))
}
```
#### `modules/lxc/main.tf`
```hcl
resource "proxmox_virtual_environment_download_file" "lxc_template" {
  for_each     = var.lxc_configs
  content_type = "vztmpl"
  datastore_id = "local"
  node_name    = var.proxmox_node
  url          = each.value.template_url
  file_name    = "${each.key}-template.tar.xz"
  overwrite    = false
}

resource "proxmox_virtual_environment_container" "lxc" {
  for_each     = var.lxc_configs
  node_name    = var.proxmox_node
  vm_id        = each.value.vm_id
  description  = "Managed by Terraform — ${each.key}"
  unprivileged = true

  features {
    nesting = each.value.features.nesting
    fuse    = each.value.features.fuse
    keyctl  = each.value.features.keyctl
  }

  cpu {
    cores = each.value.cores
    units = 1024
  }

  memory {
    dedicated = each.value.memory
    swap      = 0
  }

  disk {
    datastore_id = each.value.datastore_id
    size         = each.value.disk_size
  }

  initialization {
    hostname = each.value.hostname

    ip_config {
      ipv4 {
        address = each.value.ip_address
        gateway = each.value.gateway
      }
    }

    user_account {
      keys     = [var.ssh_public_key]
      password = random_password.lxc_root[each.key].result
    }
  }

  network_interface {
    name   = "veth0"
    bridge = "vmbr0"
  }

  operating_system {
    template_file_id = proxmox_virtual_environment_download_file.lxc_template[each.key].id
    type             = "ubuntu"
  }

  startup {
    order      = "3"
    up_delay   = "60"
    down_delay = "60"
  }

  depends_on = [proxmox_virtual_environment_download_file.lxc_template]
}

resource "random_password" "lxc_root" {
  for_each         = var.lxc_configs
  length           = 16
  special          = true
  override_special = "_%@"
}
```
#### `modules/lxc/variables.tf`
```hcl
variable "proxmox_node" {
  type = string
}

variable "ssh_public_key" {
  type = string
}

variable "lxc_configs" {
  type = map(object({
    vm_id        = number
    hostname     = string
    cores        = optional(number, 2)
    memory       = optional(number, 2048)
    disk_size    = optional(number, 8)
    datastore_id = optional(string, "local-lvm")
    ip_address   = optional(string, "dhcp")
    gateway      = optional(string, null)
    # Same default as the root variables.tf; without it, callers that omit
    # template_url would pass null to the download resource.
    template_url = optional(string, "https://mirrors.servercentral.com/ubuntu-cloud-images/releases/25.04/release/ubuntu-25.04-server-cloudimg-amd64-root.tar.xz")
    features = optional(object({
      nesting = optional(bool, true)
      fuse    = optional(bool, false)
      keyctl  = optional(bool, false)
    }), {})
  }))
}
```
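
Terraform resolves provider sources per module, and a child module that uses `proxmox_*` resources without declaring a source is assumed to want `hashicorp/proxmox`. A minimal sketch of a provider declaration for the module follows. The file name `modules/lxc/versions.tf` is an assumption, since the project tree does not list it.

```hcl
# modules/lxc/versions.tf (hypothetical file, not in the project tree)
terraform {
  required_providers {
    proxmox = {
      source = "bpg/proxmox"
    }
    random = {
      source = "hashicorp/random"
    }
  }
}
```

Version constraints are left to the root `versions.tf` here so the pin lives in one place.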
#### `modules/lxc/outputs.tf`
```hcl
output "lxc_ids" {
  description = "Map of LXC names to VM IDs"
  value       = { for k, v in proxmox_virtual_environment_container.lxc : k => v.vm_id }
}

output "lxc_ips" {
  description = "Map of LXC names to IPv4 addresses"
  value       = { for k, v in proxmox_virtual_environment_container.lxc : k => v.ipv4 }
}

output "lxc_passwords" {
  description = "Map of LXC names to root passwords (sensitive)"
  value       = { for k, v in random_password.lxc_root : k => v.result }
  sensitive   = true
}
```
#### `environments/dev/main.tf`
```hcl
module "dev_lxcs" {
  source         = "../../modules/lxc"
  proxmox_node   = "mk7"
  ssh_public_key = file("~/.ssh/id_ed25519.pub")

  lxc_configs = {
    "dev-nextcloud" = {
      vm_id        = 2100
      hostname     = "dev-nextcloud"
      cores        = 4
      memory       = 4096
      disk_size    = 16
      datastore_id = "local-zfs"
      ip_address   = "192.168.7.100/24"
      gateway      = "192.168.7.1"
    }
    "dev-vaultwarden" = {
      vm_id        = 2101
      hostname     = "dev-vaultwarden"
      cores        = 2
      memory       = 2048
      disk_size    = 8
      datastore_id = "local-zfs"
      ip_address   = "192.168.7.101/24"
      gateway      = "192.168.7.1"
    }
  }
}
```
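
Module outputs are not visible from the CLI unless the calling configuration re-exports them. A minimal sketch, assuming a hypothetical `environments/dev/outputs.tf` next to the file above:

```hcl
# environments/dev/outputs.tf (hypothetical file)
output "lxc_ips" {
  description = "IPv4 addresses reported for each dev container"
  value       = module.dev_lxcs.lxc_ips
}

output "lxc_passwords" {
  description = "Generated root passwords"
  value       = module.dev_lxcs.lxc_passwords
  sensitive   = true # required because the module output is sensitive
}
```

With this in place, `terraform output lxc_ips` lists the addresses needed for the SSH check in the success criteria.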
---
## 6. Resource Reference — `proxmox_virtual_environment_container`
### Critical Arguments
| Block | Key | Required | Default | Description |
|-------|-----|----------|---------|-------------|
| — | `node_name` | ✅ | — | PVE node to create on |
| — | `vm_id` | ✅ | — | Unique numeric ID (100–999999999) |
| — | `unprivileged` | ❌ | `false` | Run as unprivileged container (the module in section 5 sets `true` explicitly) |
| `features` | `nesting` | ❌ | `false` | Enable nested containers (needed for Docker-in-LXC) |
| `features` | `fuse` | ❌ | `false` | Enable FUSE mounts |
| `cpu` | `cores` | ❌ | `1` | vCPU cores |
| `memory` | `dedicated` | ❌ | `512` | RAM in MB |
| `disk` | `datastore_id` | ❌ | `local` | Storage pool for rootfs |
| `disk` | `size` | ❌ | `4` | Rootfs size in GB |
| `initialization` | `hostname` | ✅ | — | DNS-compatible hostname |
| `initialization.ip_config.ipv4` | `address` | ✅ | — | CIDR or `dhcp` |
| `initialization.ip_config.ipv4` | `gateway` | ❌ | — | Required for static IP |
| `initialization.user_account` | `keys` | ❌ | — | SSH authorized_keys |
| `network_interface` | `name` | ✅ | — | `veth0` |
| `network_interface` | `bridge` | ❌ | `vmbr0` | Bridge to attach |
| `operating_system` | `template_file_id` | ✅ | — | Downloaded template or `local:vztmpl/…` |
| `operating_system` | `type` | ❌ | `unmanaged` | `ubuntu`, `debian`, `alpine`, etc. |
### Important Notes
- **Template download** uses `proxmox_virtual_environment_download_file` — caches template per-node, avoids re-download
- **Cloud-init** is not attached as a separate drive for LXC: the `initialization` block sets the hostname, IP configuration, and SSH keys directly in the container config
- **Nesting = true** is required for any LXC running Docker or systemd-nspawn
- **Datastore** is backend-agnostic: `local-lvm`, `local-zfs`, `tank-zfs`, `ceph-rbd`, NFS, etc. all work
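
The module in section 5 names each download `${each.key}-template.tar.xz`, so it stores one copy of the template per container. If a single shared copy per node is preferred, a variant along these lines might work. The `template_url` module input, the `shared_template` resource name, and the file name are all assumptions made for this sketch.

```hcl
# Hypothetical variant of modules/lxc/main.tf: one template download shared
# by every container instead of one download per map entry.
variable "template_url" {
  description = "URL of the LXC root filesystem archive (assumed module input)"
  type        = string
}

resource "proxmox_virtual_environment_download_file" "shared_template" {
  content_type = "vztmpl"
  datastore_id = "local"
  node_name    = var.proxmox_node
  url          = var.template_url
  file_name    = "shared-lxc-template.tar.xz" # illustrative name
  overwrite    = false
}

resource "proxmox_virtual_environment_container" "lxc" {
  for_each  = var.lxc_configs
  node_name = var.proxmox_node
  vm_id     = each.value.vm_id

  operating_system {
    template_file_id = proxmox_virtual_environment_download_file.shared_template.id
    type             = "ubuntu"
  }

  # cpu, memory, disk, initialization, and network_interface blocks
  # stay as in modules/lxc/main.tf
}
```

The trade-off is that every container in the map then boots from the same image, so the per-container `template_url` attribute would no longer apply.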
---
## 7. Data Sources
Use data sources to query existing infrastructure without managing it:
```hcl
data "proxmox_virtual_environment_datastores" "available" {
  node_name = "mk7"
}

data "proxmox_virtual_environment_nodes" "cluster" {}

data "proxmox_virtual_environment_container" "existing" {
  node_name = "mk7"
  vm_id     = 2001
}
```
**Common use cases:**
- Validate a datastore exists before creating a disk
- Read an existing LXC's IP to populate a DNS record (Technitium)
- List nodes for multi-node placement logic
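
As an illustration of the multi-node placement case, the sketch below spreads containers across nodes round-robin. It assumes the `names` attribute of the nodes data source and reuses the `lxc_configs` variable from section 5. The local names are hypothetical, and the logic does not check whether a node is online.

```hcl
data "proxmox_virtual_environment_nodes" "cluster" {}

locals {
  # Sort both lists so the assignment is stable between runs.
  node_names = sort(data.proxmox_virtual_environment_nodes.cluster.names)
  lxc_keys   = sort(keys(var.lxc_configs))

  # Round-robin: element() wraps around, so container i lands on
  # node (i mod number_of_nodes).
  lxc_node = {
    for i, key in local.lxc_keys :
    key => element(local.node_names, i)
  }
}
```

A container resource would then use `node_name = local.lxc_node[each.key]`. Because the mapping depends on list positions, adding a container or a node can shift existing assignments and trigger moves, so an explicit per-container node attribute may be the safer choice for production.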
---
## 8. State Management
### Recommended: S3-Compatible Backend
Iron Legion already runs self-hosted services. A Garage or Minio instance on Neo/MK7 can serve as the Terraform state backend:
```hcl
terraform {
  backend "s3" {
    bucket = "iron-legion-terraform"
    key    = "proxmox-lxc/dev.tfstate"
    region = "us-east-1"

    # Replaces the deprecated top-level `endpoint` argument (Terraform 1.6+)
    endpoints = {
      s3 = "https://s3.nb.bobbysh.me"
    }
    use_path_style = true

    skip_credentials_validation = true
    skip_metadata_api_check     = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
  }
}
```
### State Locking (Critical for Team Use)
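Concurrent applies can corrupt state, so runs must be locked. The S3 backend can lock through a DynamoDB-compatible table (`dynamodb_table`) or, on Terraform 1.10 and later, through its native lockfile (`use_lockfile = true`). If the S3-compatible store supports neither, wrap `terraform apply` in a CI pipeline that serializes runs.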
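
A minimal sketch of native S3 locking, assuming Terraform 1.10 or later and an S3-compatible store that honors conditional writes. Whether the Garage or Minio instance does so is not established in this PRD and should be verified first.

```hcl
terraform {
  backend "s3" {
    bucket = "iron-legion-terraform"
    key    = "proxmox-lxc/dev.tfstate"
    region = "us-east-1"

    # Writes a lock object next to the state object for the duration of
    # each run. Assumption: the endpoint supports conditional writes.
    use_lockfile = true

    # endpoint, use_path_style, and skip_* settings as in the backend
    # block above
  }
}
```

If a second run starts while the lock is held, it fails with a lock error instead of overwriting state.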
---
## 9. Operational Workflow
### Day 0 — Bootstrap
```bash
# 1. Clone repo
git clone ssh://git@100.99.123.16:2222/Iron-Legion/terraform-proxmox-lxc.git
cd terraform-proxmox-lxc/environments/dev
# 2. Set credentials
export PROXMOX_VE_PASSWORD="your-pve-password"
# OR for API token:
export PROXMOX_VE_API_TOKEN="root@pam!mytoken=abc123"
# 3. Initialize
terraform init
# 4. Plan
terraform plan -out=tfplan
# 5. Apply
terraform apply tfplan
```
### Day N — Add a Container
1. Add entry to `lxc_configs` map in `environments/dev/main.tf`
2. `terraform plan` — review VM ID collision, IP conflict, storage capacity
3. `terraform apply`
4. Verify: `ssh root@<new-ip>`
### Day N — Destroy a Container
1. Remove entry from `lxc_configs` map
2. `terraform apply` — resource destroyed
3. Or: `terraform destroy -target='module.dev_lxcs.proxmox_virtual_environment_container.lxc["dev-nextcloud"]'`
---
## 10. Risks & Mitigations
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| VM ID collision | Medium | High | Maintain a fleet-wide VM ID registry; use `proxmox_virtual_environment_vms` data source to check |
| IP overlap with DHCP pool | Medium | High | Reserve static IPs in Technitium DNS; use `dns` data source to verify |
| Template download fails (slow mirror) | Low | Medium | Pre-seed templates on PVE nodes; use `pvesm` to verify before `apply` |
| State file corruption | Low | Critical | S3 versioning + periodic `terraform state pull` backups |
| Privilege escalation via privileged LXC | Low | High | Default `unprivileged = true`; explicit override required |
| Provider breaking change | Medium | Medium | Pin provider version `~> 0.108`; test upgrades in dev environment first |
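
Some of these checks can run before any API call is made. The sketch below adds two `validation` blocks to the `lxc_configs` variable: one rejects duplicate VM IDs within the map, the other rejects static addresses without a gateway. It only catches mistakes inside the Terraform configuration, so it complements the fleet-wide registry and does not replace it.

```hcl
variable "lxc_configs" {
  type = map(object({
    vm_id      = number
    hostname   = string
    ip_address = optional(string, "dhcp")
    gateway    = optional(string, null)
    # remaining attributes as in variables.tf
  }))

  validation {
    # distinct() drops repeats, so a shorter list means a duplicate vm_id.
    condition     = length(distinct([for c in values(var.lxc_configs) : c.vm_id])) == length(var.lxc_configs)
    error_message = "Each entry in lxc_configs must use a unique vm_id."
  }

  validation {
    # Every entry is either DHCP or carries an explicit gateway.
    condition = alltrue([
      for c in values(var.lxc_configs) :
      c.ip_address == "dhcp" || c.gateway != null
    ])
    error_message = "Entries with a static ip_address must also set gateway."
  }
}
```

Both failures surface at `terraform plan`, before any container is touched.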
---
## 11. Open Questions
1. **Do we pre-create cloud-image templates on each PVE node, or let Terraform download per-node?**
- Per-node: slower first deploy, but self-contained
- Pre-seeded: faster, requires manual `pvesm` or Ansible step
2. **Should LXCs register themselves in Technitium DNS via Terraform, or rely on DHCP + DNS integration?**
- Terraform can call a `dns_a_record` module (if Technitium provider exists)
- Or: use PVE's built-in DHCP + DNSMASQ if configured
3. **CI/CD pipeline: GitHub Actions runner, or local Gitea Actions on Neo?**
- Gitea Actions keeps secrets in-network
- GitHub Actions requires Tailscale funnel or external exposure
4. **Do we want a dedicated LXC "Terraform runner" inside the cluster, or run from Artemis/operator workstation?**
- In-cluster runner: always has LAN access to PVE API
- External: requires Tailscale or VPN for API reachability
---
## 12. Appendix
### A. Provider Documentation Links
- **Registry:** https://registry.terraform.io/providers/bpg/proxmox/latest
- **GitHub:** https://github.com/bpg/terraform-provider-proxmox
- **LXC Resource Docs:** https://registry.terraform.io/providers/bpg/proxmox/latest/docs/resources/virtual_environment_container
- **Download File Resource:** https://registry.terraform.io/providers/bpg/proxmox/latest/docs/resources/virtual_environment_download_file
### B. Useful PVE CLI Commands (for verification)
```bash
# List containers on a node
pct list
# List templates
pvesm list local --content vztmpl
# Check datastore usage
pvesm status
# Enter a container
pct enter <vm_id>
```
### C. Terraform Commands Reference
```bash
terraform init # Download providers, configure backend
terraform validate # Syntax check
terraform plan # Preview changes
terraform apply # Execute changes
terraform destroy # Tear down everything
terraform state list # Show managed resources
terraform state show <addr> # Show one resource's attributes
terraform output # Display output values
terraform fmt -recursive # Format all .tf files
```
---
*End of PRD. Ready for Commander Bobby review and approval.*