PRD: Terraform LXC Automation for Proxmox VE 9.2
Status: Draft — Pending Commander Bobby Review
Author: F.R.I.D.A.Y.
Date: 2026-06-01
Provider: bpg/proxmox (actively maintained, 11M+ downloads)
Target: Proxmox VE 9.2 / Debian Trixie
1. Purpose & Scope
This PRD defines the architecture, configuration patterns, and operational workflow for automating LXC container lifecycle management on Proxmox VE 9.2 clusters using Terraform and the actively maintained bpg/proxmox provider.
In scope:
- Terraform provider configuration and authentication
- LXC resource definitions (`proxmox_virtual_environment_container`)
- Cloud-init / template-based provisioning
- Network configuration (static IP, DHCP, bridge)
- Storage allocation (rootfs on any PVE backend)
- State management and CI/CD integration patterns
Out of scope:
- VM (QEMU/KVM) provisioning
- PVE cluster topology changes
- Backup/restore automation (separate PRD)
2. Success Criteria
| # | Criterion | How Verified |
|---|---|---|
| 1 | A single `terraform apply` creates a working LXC with SSH access | `ssh root@<lxc-ip>` succeeds |
| 2 | LXCs are provisioned from official cloud-image templates | Template downloaded via `proxmox_virtual_environment_download_file` |
| 3 | Network is configurable per-LXC (DHCP or static CIDR) | `ip addr` inside container matches TF config |
| 4 | Rootfs lives on user-selected storage (not hardcoded to `local-lvm`) | `pvesm list <datastore>` shows the rootfs volume on the target datastore |
| 5 | State is stored remotely (S3-compatible or Terraform Cloud) | `terraform state list` works from any machine |
| 6 | Destroy and recreate is idempotent | `terraform destroy && terraform apply` yields identical result |
3. Provider Selection
Why bpg/proxmox (not telmate/proxmox)
| Provider | Maintenance | Downloads | LXC Support | Notes |
|---|---|---|---|---|
| `bpg/proxmox` | ✅ Active (v0.108.0, June 2026) | 11.8M+ | Full | Community-tier, comprehensive docs, supports PVE 9.x |
| `telmate/proxmox` | ❌ Stale (last release ~2023) | Legacy | Partial | Deprecated; lacks PVE 9.x features |
Decision: Use bpg/proxmox exclusively. The telmate provider is unmaintained and incompatible with PVE 9.2 API changes.
Provider block (minimum):
terraform {
  required_providers {
    proxmox = {
      source  = "bpg/proxmox"
      version = "~> 0.108"
    }
  }
}

provider "proxmox" {
  endpoint = "https://192.168.7.33:8006/"
  username = "root@pam"
  password = var.proxmox_password # or PROXMOX_VE_PASSWORD env var
  insecure = true                 # self-signed TLS
}
4. Authentication Matrix
| Method | Use Case | Config | Security |
|---|---|---|---|
| API Token | Production, CI/CD | `api_token = "root@pam!mytoken=abc123…"` | Highest: revocable, fine-grained |
| Username/Password | Development, one-offs | `username = "root@pam"`, `password = "…"` | Medium: password in env |
| Auth Ticket | TOTP-enabled accounts | Pre-authenticate, pass ticket | High: short-lived |
Recommendation for Iron Legion:
- Development: Use the `PROXMOX_VE_PASSWORD` environment variable
- CI/CD (future): Create a PVE API token with a built-in role such as `PVEVMAdmin` or a custom role, and store it in CI secrets
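For the API-token row of the matrix, a provider block might look like the sketch below. It reuses `var.proxmox_endpoint` from this PRD's variables.tf; the variable name `proxmox_api_token` is an assumption introduced here for illustration, and the value can equally be supplied through the `PROXMOX_VE_API_TOKEN` environment variable instead of a Terraform variable.

```hcl
# Hypothetical variable: the name is illustrative, not part of the PRD.
variable "proxmox_api_token" {
  description = "PVE API token in the form user@realm!tokenid=secret"
  type        = string
  sensitive   = true # keeps the token out of plan/apply output
}

provider "proxmox" {
  endpoint  = var.proxmox_endpoint
  api_token = var.proxmox_api_token
  insecure  = true # self-signed TLS, as in the minimum provider block
}
```

Marking the variable `sensitive` only hides it from CLI output; the token is still written to state, which is one more reason to keep state in a protected remote backend.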
5. Sample Project Structure
terraform-proxmox-lxc/
├── README.md
├── main.tf                      # Provider + backend config
├── variables.tf                 # Input variables
├── terraform.tfvars.example     # Sample values (real terraform.tfvars files are gitignored)
├── outputs.tf                   # Useful outputs (IPs, IDs)
├── versions.tf                  # Required providers + TF version
├── modules/
│   └── lxc/
│       ├── main.tf              # proxmox_virtual_environment_container resource
│       ├── variables.tf         # Module inputs
│       └── outputs.tf           # Module outputs
├── environments/
│   ├── dev/
│   │   ├── main.tf              # Calls modules with dev vars
│   │   └── terraform.tfvars
│   └── prod/
│       ├── main.tf
│       └── terraform.tfvars
└── templates/
    └── ubuntu-25.04-cloudimg.yaml   # Cloud-init user-data (optional)
Key Files
versions.tf
terraform {
  # use_path_style and skip_requesting_account_id require the S3 backend from Terraform 1.6+
  required_version = ">= 1.6.0"

  required_providers {
    proxmox = {
      source  = "bpg/proxmox"
      version = "~> 0.108"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.6"
    }
    tls = {
      source  = "hashicorp/tls"
      version = "~> 4.0"
    }
  }

  # Remote state: S3-compatible (Minio, Garage, AWS S3)
  backend "s3" {
    bucket = "iron-legion-terraform"
    key    = "proxmox-lxc/terraform.tfstate"
    region = "us-east-1"

    # Terraform 1.6+ uses the endpoints map; the top-level endpoint argument is deprecated
    endpoints = {
      s3 = "https://s3.nb.bobbysh.me"
    }
    use_path_style = true

    # Skip AWS-specific validations for self-hosted S3
    skip_credentials_validation = true
    skip_metadata_api_check     = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
  }
}
variables.tf
variable "proxmox_endpoint" {
  description = "PVE API URL"
  type        = string
  default     = "https://192.168.7.33:8006/"
}

variable "proxmox_node" {
  description = "Target PVE node name"
  type        = string
  default     = "mk33"
}

variable "ssh_public_key" {
  description = "SSH public key for root access"
  type        = string
}

variable "lxc_configs" {
  description = "Map of LXC configurations"
  type = map(object({
    vm_id        = number
    hostname     = string
    cores        = optional(number, 2)
    memory       = optional(number, 2048)
    disk_size    = optional(number, 8)
    datastore_id = optional(string, "local-lvm")
    ip_address   = optional(string, "dhcp")
    gateway      = optional(string, null)
    template_url = optional(string, "https://mirrors.servercentral.com/ubuntu-cloud-images/releases/25.04/release/ubuntu-25.04-server-cloudimg-amd64-root.tar.xz")
    features = optional(object({
      nesting = optional(bool, true)
      fuse    = optional(bool, false)
      keyctl  = optional(bool, false)
    }), {})
  }))
}
modules/lxc/main.tf
# Download one container template per LXC entry into the node's "local" datastore.
resource "proxmox_virtual_environment_download_file" "lxc_template" {
  for_each     = var.lxc_configs
  content_type = "vztmpl"
  datastore_id = "local"
  node_name    = var.proxmox_node
  url          = each.value.template_url
  file_name    = "${each.key}-template.tar.xz"
  overwrite    = false
}

# One unprivileged container per entry in lxc_configs.
resource "proxmox_virtual_environment_container" "lxc" {
  for_each     = var.lxc_configs
  node_name    = var.proxmox_node
  vm_id        = each.value.vm_id
  description  = "Managed by Terraform: ${each.key}"
  unprivileged = true

  features {
    nesting = each.value.features.nesting
    fuse    = each.value.features.fuse
    keyctl  = each.value.features.keyctl
  }

  cpu {
    cores = each.value.cores
    units = 1024
  }

  memory {
    dedicated = each.value.memory # MB
    swap      = 0
  }

  disk {
    datastore_id = each.value.datastore_id
    size         = each.value.disk_size # GB
  }

  initialization {
    hostname = each.value.hostname

    ip_config {
      ipv4 {
        address = each.value.ip_address # CIDR or "dhcp"
        gateway = each.value.gateway
      }
    }

    user_account {
      keys     = [var.ssh_public_key]
      password = random_password.lxc_root[each.key].result
    }
  }

  network_interface {
    name   = "veth0"
    bridge = "vmbr0"
  }

  operating_system {
    template_file_id = proxmox_virtual_environment_download_file.lxc_template[each.key].id
    type             = "ubuntu"
  }

  startup {
    order      = 3
    up_delay   = 60
    down_delay = 60
  }

  # Already implied by the template_file_id reference above; kept for readability.
  depends_on = [proxmox_virtual_environment_download_file.lxc_template]
}

# Random root password per container (exposed through a sensitive output).
resource "random_password" "lxc_root" {
  for_each         = var.lxc_configs
  length           = 16
  special          = true
  override_special = "_%@"
}
modules/lxc/variables.tf
variable "proxmox_node" {
  type = string
}

variable "ssh_public_key" {
  type = string
}

variable "lxc_configs" {
  type = map(object({
    vm_id        = number
    hostname     = string
    cores        = optional(number, 2)
    memory       = optional(number, 2048)
    disk_size    = optional(number, 8)
    datastore_id = optional(string, "local-lvm")
    ip_address   = optional(string, "dhcp")
    gateway      = optional(string, null)
    # Default is needed here too: environments call the module directly, and a null URL fails the download
    template_url = optional(string, "https://mirrors.servercentral.com/ubuntu-cloud-images/releases/25.04/release/ubuntu-25.04-server-cloudimg-amd64-root.tar.xz")
    features = optional(object({
      nesting = optional(bool, true)
      fuse    = optional(bool, false)
      keyctl  = optional(bool, false)
    }), {})
  }))
}
modules/lxc/outputs.tf
output "lxc_ids" {
  description = "Map of LXC names to VM IDs"
  value       = { for k, v in proxmox_virtual_environment_container.lxc : k => v.vm_id }
}

output "lxc_ips" {
  description = "Map of LXC names to IPv4 addresses"
  value       = { for k, v in proxmox_virtual_environment_container.lxc : k => v.ipv4 }
}

output "lxc_passwords" {
  description = "Map of LXC names to root passwords (sensitive)"
  value       = { for k, v in random_password.lxc_root : k => v.result }
  sensitive   = true
}
environments/dev/main.tf
module "dev_lxcs" {
  source         = "../../modules/lxc"
  proxmox_node   = "mk33"
  ssh_public_key = file("~/.ssh/id_ed25519.pub")

  lxc_configs = {
    "dev-nextcloud" = {
      vm_id        = 2100
      hostname     = "dev-nextcloud"
      cores        = 4
      memory       = 4096
      disk_size    = 16
      datastore_id = "local-zfs"
      ip_address   = "192.168.7.100/24"
      gateway      = "192.168.7.1"
    }
    "dev-vaultwarden" = {
      vm_id        = 2101
      hostname     = "dev-vaultwarden"
      cores        = 2
      memory       = 2048
      disk_size    = 8
      datastore_id = "local-zfs"
      ip_address   = "192.168.7.101/24"
      gateway      = "192.168.7.1"
    }
  }
}
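Module outputs are not visible to `terraform output` unless the calling configuration re-exports them. A minimal sketch of that wiring follows, assuming a file such as `environments/dev/outputs.tf` (a hypothetical path not listed in the project tree) next to the `dev_lxcs` module call:

```hcl
# Hypothetical environments/dev/outputs.tf: re-export the lxc module's outputs.
output "lxc_ids" {
  description = "Map of LXC names to VM IDs"
  value       = module.dev_lxcs.lxc_ids
}

output "lxc_ips" {
  description = "Map of LXC names to IPv4 addresses"
  value       = module.dev_lxcs.lxc_ips
}

output "lxc_passwords" {
  description = "Map of LXC names to root passwords"
  value       = module.dev_lxcs.lxc_passwords
  sensitive   = true # required, because the module output is marked sensitive
}
```

With this in place, `terraform output lxc_ips` prints the address map, and `terraform output -json lxc_passwords` reveals the sensitive values on demand.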
6. Resource Reference — proxmox_virtual_environment_container
Critical Arguments
| Block | Key | Required | Default | Description |
|---|---|---|---|---|
| - | `node_name` | ✅ | - | PVE node to create on |
| - | `vm_id` | ✅ | - | Unique numeric ID (100–999999999) |
| - | `unprivileged` | ❌ | `false` | Run as unprivileged container (the module sets `true`) |
| `features` | `nesting` | ❌ | `false` | Enable nested containers (needed for Docker-in-LXC) |
| `features` | `fuse` | ❌ | `false` | Enable FUSE mounts |
| `cpu` | `cores` | ❌ | `1` | vCPU cores |
| `memory` | `dedicated` | ❌ | `512` | RAM in MB |
| `disk` | `datastore_id` | ❌ | `local` | Storage pool for rootfs |
| `disk` | `size` | ❌ | `4` | Rootfs size in GB |
| `initialization` | `hostname` | ✅ | - | DNS-compatible hostname |
| `initialization.ip_config.ipv4` | `address` | ✅ | - | CIDR or `dhcp` |
| `initialization.ip_config.ipv4` | `gateway` | ❌ | - | Required for static IP |
| `initialization.user_account` | `keys` | ❌ | - | SSH authorized_keys |
| `network_interface` | `name` | ✅ | - | Interface name (e.g., `veth0`) |
| `network_interface` | `bridge` | ❌ | `vmbr0` | Bridge to attach |
| `operating_system` | `template_file_id` | ✅ | - | Downloaded template or `local:vztmpl/…` |
| `operating_system` | `type` | ❌ | `unmanaged` | `ubuntu`, `debian`, `alpine`, etc. |
Important Notes
- Template download uses `proxmox_virtual_environment_download_file`, which caches the template per node and avoids re-download
- Cloud-init is embedded in the `initialization` block; no separate cloud-init drive is needed for LXC
- Nesting = true is required for any LXC running Docker or systemd-nspawn
- Datastore is backend-agnostic: `local-lvm`, `local-zfs`, `tank-zfs`, `ceph-rbd`, NFS, etc. all work
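The address and gateway rules in the table above (CIDR or `dhcp`, gateway required for static IP) can be enforced at plan time with variable validation. The sketch below shows only the two relevant fields of `lxc_configs`; in practice the `validation` blocks would be added to the full variable definition in the module. The error messages are illustrative.

```hcl
variable "lxc_configs" {
  # Trimmed for illustration: only the fields being validated are shown.
  type = map(object({
    ip_address = optional(string, "dhcp")
    gateway    = optional(string, null)
  }))

  # Each address must be the literal "dhcp" or parse as CIDR notation.
  validation {
    condition = alltrue([
      for c in values(var.lxc_configs) :
      c.ip_address == "dhcp" || can(cidrhost(c.ip_address, 0))
    ])
    error_message = "ip_address must be \"dhcp\" or a CIDR such as 192.168.7.100/24."
  }

  # A static address without a gateway is rejected.
  validation {
    condition = alltrue([
      for c in values(var.lxc_configs) :
      c.ip_address == "dhcp" || c.gateway != null
    ])
    error_message = "gateway must be set whenever ip_address is a static CIDR."
  }
}
```

`cidrhost` accepts IPv6 prefixes as well, so this check verifies CIDR syntax only; it does not confirm that the address belongs to the expected subnet.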
7. Data Sources
Use data sources to query existing infrastructure without managing it:
data "proxmox_virtual_environment_datastores" "available" {
  node_name = var.proxmox_node
}

data "proxmox_virtual_environment_nodes" "cluster" {}

data "proxmox_virtual_environment_container" "existing" {
  node_name = var.proxmox_node # or specify target node explicitly
  vm_id     = 2001
}
Common use cases:
- Validate a datastore exists before creating a disk
- Read an existing LXC’s IP to populate a DNS record (Technitium)
- List nodes for multi-node placement logic
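As one way to put the node listing to work, a Terraform `check` block (available from Terraform 1.5) can warn when the target node is not part of the cluster. This sketch assumes the `nodes` data source exposes a `names` list, which should be confirmed against the provider documentation for the pinned version; the check name is illustrative.

```hcl
# Assumes the data "proxmox_virtual_environment_nodes" "cluster" block shown
# above lives in the same module, and that it exports a `names` list.
check "target_node_exists" {
  assert {
    condition = contains(
      data.proxmox_virtual_environment_nodes.cluster.names,
      var.proxmox_node
    )
    error_message = "Node ${var.proxmox_node} was not found in the cluster node list."
  }
}
```

A failed `check` produces a warning rather than blocking the run; if a hard stop is preferred, the same condition can be moved into a `precondition` inside the container resource's `lifecycle` block.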
8. State Management
Recommended: S3-Compatible Backend
Iron Legion already runs self-hosted services. A Garage or Minio instance on a fleet storage node (e.g., Neo) can serve as the Terraform state backend:
terraform {
  backend "s3" {
    bucket = "iron-legion-terraform"
    key    = "proxmox-lxc/dev.tfstate"
    region = "us-east-1"

    # Terraform 1.6+ uses the endpoints map; the top-level endpoint argument is deprecated
    endpoints = {
      s3 = "https://s3.nb.bobbysh.me"
    }
    use_path_style = true

    skip_credentials_validation = true
    skip_metadata_api_check     = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
  }
}
State Locking (Critical for Team Use)
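Concurrent applies must not write the same state. On Terraform 1.10 or later, the S3 backend can lock natively with `use_lockfile = true`, provided the S3-compatible store supports conditional writes; older setups use a DynamoDB-compatible table via `dynamodb_table`. If the backend supports neither, wrap `terraform apply` in a CI pipeline that serializes runs.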
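A sketch of native S3 locking follows, reusing the bucket and endpoint from this PRD. It assumes Terraform 1.10 or later, and it assumes the chosen store (Garage or Minio) implements the conditional writes that the lock file relies on, which should be tested before depending on it.

```hcl
terraform {
  required_version = ">= 1.10.0" # use_lockfile is not available in earlier releases

  backend "s3" {
    bucket = "iron-legion-terraform"
    key    = "proxmox-lxc/dev.tfstate"
    region = "us-east-1"

    endpoints = {
      s3 = "https://s3.nb.bobbysh.me"
    }
    use_path_style = true

    # Writes a .tflock object next to the state file while a run holds the lock.
    use_lockfile = true

    skip_credentials_validation = true
    skip_metadata_api_check     = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
  }
}
```

A quick way to test support is to start `terraform plan` in two shells at once: the second run should fail to acquire the state lock while the first is in progress.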
Optional: Atlantis Web UI for Terraform PR Automation
What Atlantis Is
Atlantis is a self-hosted web application that listens for webhook events from Git repositories and runs terraform plan / terraform apply automatically inside PR/MR workflows. It posts plan output back to the PR as comments, enforces approval gates, and locks workspaces to prevent concurrent applies.
Can Atlantis Manage LXC Resources via bpg/proxmox?
Yes. Atlantis is a Terraform orchestration layer, not a provider. It supports any Terraform provider including bpg/proxmox. The workflow is:
- Developer opens a PR adding/modifying `.tf` files defining LXC containers
- Atlantis receives the webhook and runs `terraform plan` in an isolated directory
- Plan output is posted as a PR comment; the team reviews it before approval
- After approval (or an `atlantis apply` comment), Atlantis runs `terraform apply`
Atlantis Docker Compose (Self-Hosted)
services:
  atlantis:
    image: ghcr.io/runatlantis/atlantis:latest
    ports:
      - "4141:4141"
    volumes:
      - ${HOME}/.ssh:/home/atlantis/.ssh:ro            # Git SSH key
      - /var/run/docker.sock:/var/run/docker.sock:ro   # if using Docker TF provider
      - atlantis-data:/home/atlantis/.atlantis
    environment:
      ATLANTIS_GH_USER: "iron-legion-bot"              # or ATLANTIS_GITLAB_USER / ATLANTIS_GITEA_USER
      ATLANTIS_GH_TOKEN: "${ATLANTIS_GH_TOKEN}"        # personal access token
      ATLANTIS_REPO_ALLOWLIST: "github.com/Iron-Legion/*"
      ATLANTIS_GH_WEBHOOK_SECRET: "${WEBHOOK_SECRET}"
      # For Gitea:
      # ATLANTIS_GITEA_USER: "iron-legion-bot"
      # ATLANTIS_GITEA_TOKEN: "${GITEA_TOKEN}"
      # ATLANTIS_GITEA_WEBHOOK_SECRET: "${WEBHOOK_SECRET}"
    command: server
    restart: unless-stopped

  # Optional: Redis for distributed locking in multi-replica setups
  # redis:
  #   image: redis:8-alpine
  #   volumes:
  #     - redis-data:/data
  #   restart: always

volumes:
  atlantis-data:
    driver: local
Key Features
- Plan Comments: Every PR gets an auto-generated `terraform plan` comment
- Apply Locking: One apply at a time per workspace; concurrent PRs queue
- Policy Checks: Integrate OPA (Open Policy Agent) or custom scripts to block non-compliant changes
- Custom Workflows: Define per-repo or per-directory workflows (e.g., plan-only for dev, auto-apply for staging)
- Self-Hosted SCM: Native webhook support for GitHub, GitLab, Bitbucket, and Gitea
Resource Footprint
- Atlantis container: ~100–200 MB RAM, minimal CPU
- Optional Redis: ~20 MB RAM
- Total: fits comfortably on any Iron Legion node (MK7, MK33–42, Neo)
Gitea Integration Notes
- Atlantis supports Gitea via the `--gitea-user`, `--gitea-token`, and `--gitea-webhook-secret` flags
- The Atlantis endpoint must be reachable from Gitea (Tailscale funnel, reverse proxy, or LAN if Gitea is in-network)
- Webhook URL: `http://atlantis-host:4141/events`
9. Operational Workflow
Day 0 — Bootstrap
# 1. Clone repo
git clone ssh://git@100.99.123.16:2222/Iron-Legion/terraform-proxmox-lxc.git
cd terraform-proxmox-lxc/environments/dev
# 2. Set credentials
export PROXMOX_VE_PASSWORD="your-pve-password"
# OR for API token:
export PROXMOX_VE_API_TOKEN="root@pam!mytoken=abc123"
# 3. Initialize
terraform init
# 4. Plan
terraform plan -out=tfplan
# 5. Apply
terraform apply tfplan
Day N — Add a Container
- Add an entry to the `lxc_configs` map in `environments/dev/main.tf`
- `terraform plan`: review for VM ID collision, IP conflict, and storage capacity
- `terraform apply`
- Verify: `ssh root@<new-ip>`
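As an illustration of the first step, a new entry can be as small as the two required fields. The name `dev-example` and ID `2102` below are placeholders, not allocations from the fleet registry.

```hcl
# Fragment of the module call in environments/dev/main.tf.
lxc_configs = {
  # ...existing entries stay unchanged...

  # Hypothetical new container: only the required fields are set.
  "dev-example" = {
    vm_id    = 2102
    hostname = "dev-example"
  }
}
```

Cores, memory, disk size, datastore, and DHCP addressing fall back to the defaults declared in the module's variables.tf, so the plan should show one new template download, one new container, and one new random password.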
Day N — Destroy a Container
- Remove the entry from the `lxc_configs` map
- `terraform apply`: the resource is destroyed
- Or: `terraform destroy -target='module.dev_lxcs.proxmox_virtual_environment_container.lxc["dev-nextcloud"]'`
10. Risks & Mitigations
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| VM ID collision | Medium | High | Maintain a fleet-wide VM ID registry; use proxmox_virtual_environment_vms data source to check |
| IP overlap with DHCP pool | Medium | High | Reserve static IPs in Technitium DNS; use dns data source to verify |
| Template download fails (slow mirror) | Low | Medium | Pre-seed templates on PVE nodes; use pvesm to verify before apply |
| State file corruption | Low | Critical | S3 versioning + periodic terraform state pull backups |
| Privilege escalation via privileged LXC | Low | High | The module sets unprivileged = true; running privileged requires an explicit code change |
| Provider breaking change | Medium | Medium | Pin provider version ~> 0.108; test upgrades in dev environment first |
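Part of the VM ID collision risk can be caught before any API call by validating that IDs are unique within a single `lxc_configs` map. This is a sketch showing only the fields involved; it does not detect collisions with guests created outside this configuration, so the fleet-wide registry is still needed.

```hcl
variable "lxc_configs" {
  # Trimmed for illustration: only the fields relevant to the check are shown.
  type = map(object({
    vm_id    = number
    hostname = string
  }))

  # The number of distinct vm_id values must equal the number of entries.
  validation {
    condition = (
      length(distinct([for c in values(var.lxc_configs) : c.vm_id]))
      == length(var.lxc_configs)
    )
    error_message = "Each entry in lxc_configs must use a distinct vm_id."
  }
}
```

With this in place, a copy-pasted entry that keeps its source's `vm_id` fails at `terraform plan` instead of partway through an apply.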
11. Open Questions
- Do we pre-create cloud-image templates on each PVE node, or let Terraform download per-node?
  - Per-node: slower first deploy, but self-contained
  - Pre-seeded: faster, requires a manual `pvesm` or Ansible step
- Should LXCs register themselves in Technitium DNS via Terraform, or rely on DHCP + DNS integration?
  - Terraform can call a `dns_a_record` module (if a Technitium provider exists)
  - Or: use PVE's built-in DHCP + DNSMASQ if configured
- CI/CD pipeline: GitHub Actions runner, or local Gitea Actions on the fleet SCM host?
  - Gitea Actions keeps secrets in-network
  - GitHub Actions requires Tailscale funnel or external exposure
- Do we want a dedicated LXC "Terraform runner" inside the cluster, or run from Artemis/operator workstation?
  - In-cluster runner: always has LAN access to PVE API
  - External: requires Tailscale or VPN for API reachability
12. Appendix
A. Provider Documentation Links
- Registry: https://registry.terraform.io/providers/bpg/proxmox/latest
- GitHub: https://github.com/bpg/terraform-provider-proxmox
- LXC Resource Docs: https://registry.terraform.io/providers/bpg/proxmox/latest/docs/resources/virtual_environment_container
- Download File Resource: https://registry.terraform.io/providers/bpg/proxmox/latest/docs/resources/virtual_environment_download_file
B. Useful PVE CLI Commands (for verification)
# List containers on a node
pct list
# List templates
pvesm list local --content vztmpl
# Check datastore usage
pvesm status
# Enter a container
pct enter <vm_id>
C. Terraform Commands Reference
terraform init # Download providers, configure backend
terraform validate # Syntax check
terraform plan # Preview changes
terraform apply # Execute changes
terraform destroy # Tear down everything
terraform state list # Show managed resources
terraform state show <addr> # Show one resource's attributes
terraform output # Display output values
terraform fmt -recursive # Format all .tf files
End of PRD. Ready for Commander Bobby review and approval.