Terraform LXC Deployment — Phase 3: Ansible-Integrated Pipeline

Status: Draft | Author: Artemis | Date: 2026-06-05

Goal: Extend the validated Phase 2 batch pipeline into a complete create-and-provision workflow. Terraform creates the LXCs and generates the Ansible inventory; Ansible then installs git, python3-pip, and ansible on each LXC. N8N orchestration is deferred to Phase 4 (see section 4).


1. Pipeline Overview

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Trigger   │────▶│  Terraform  │────▶│  Inventory  │────▶│   Ansible   │
│  (manual /  │     │  (Docker)   │     │  (YAML)     │     │  (Docker)   │
│   N8N)      │     │  Creates    │     │  Generated  │     │  Provisions │
└─────────────┘     │  LXCs on    │     │  per apply  │     │  LXC group  │
                    │  PVE        │     └─────────────┘     └─────────────┘
                    └─────────────┘

2. Stage 1: Terraform LXC Batch Factory (Complete)

Status: Validated at N=4 and N=7 on MK33

2.1 Dynamic Derivation

| Input | Example | Description |
| --- | --- | --- |
| vmid_base | 5050 | Starting VMID |
| lxc_count | 4 | Number of LXCs |
| subnet_prefix | 192.168 | First two octets |

Auto-derived per LXC (index i):

  • VMID: vmid_base + i
  • Hostname: lxc-${vmid}
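  • IPv4: ${subnet_prefix}.${first2(vmid)}.${last2(vmid)}/18, where first2 and last2 are the first and last two digits of the VMID (VMID 5050 gives 192.168.50.50/18)
  • IPv4 host (Ansible): the same address with the CIDR suffix stripped (192.168.50.50)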
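
The derivation above can be sketched in POSIX shell. This is an illustration only, not the Terraform code itself: the function name derive_lxc is hypothetical, and it assumes a four-digit VMID so that first2 and last2 reduce to integer division and remainder by 100.

```shell
# derive_lxc VMID_BASE LXC_COUNT SUBNET_PREFIX (hypothetical helper)
# Prints one line per LXC: hostname, IPv4 with CIDR, bare IPv4 for Ansible.
# Assumes a four-digit VMID, so first2 = vmid / 100 and last2 = vmid % 100.
derive_lxc() {
  vmid_base=$1
  lxc_count=$2
  subnet_prefix=$3
  i=0
  while [ "$i" -lt "$lxc_count" ]; do
    vmid=$((vmid_base + i))
    cidr="${subnet_prefix}.$((vmid / 100)).$((vmid % 100))/18"
    # ${cidr%/*} strips the /18 suffix to give the Ansible host address.
    printf '%s %s %s\n' "lxc-${vmid}" "$cidr" "${cidr%/*}"
    i=$((i + 1))
  done
}
```

Calling derive_lxc 5050 4 192.168 prints four lines, the first being lxc-5050 with 192.168.50.50/18 and the bare address 192.168.50.50.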

2.2 Inventory Generation (NEW)

Two files are written on every terraform apply:

  • inventory-lxc.yml — latest, overwritten
  • inventory-lxc-<timestamp>.yml — archive

Both are written to /ansible-push/terraform-prefill/ via a Docker Compose mount.
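
As a rough illustration of the latest-plus-archive naming, the sketch below copies an already rendered inventory into both files. In the pipeline Terraform writes them itself; the helper name write_inventory and the UTC timestamp format are assumptions, since this draft does not specify the archive's timestamp layout.

```shell
# write_inventory SRC DEST_DIR (hypothetical helper)
# Copies SRC to a timestamped archive and to the "latest" file in DEST_DIR.
write_inventory() {
  src=$1
  dest_dir=$2
  # Assumed format, e.g. 20260605T120000Z; the real archive name may differ.
  stamp=$(date -u +%Y%m%dT%H%M%SZ)
  cp "$src" "${dest_dir}/inventory-lxc-${stamp}.yml" &&
    cp "$src" "${dest_dir}/inventory-lxc.yml"
}
```

Writing the archive first means a failed copy never leaves a new "latest" file without a matching archive.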

2.3 Generated Inventory Format

all:
  children:
    lxcs:
      hosts:
        lxc-5050:
          ansible_host: 192.168.50.50
          ansible_user: root
          ansible_password: ubuntu
          ansible_port: 22
          ansible_ssh_common_args: '-o StrictHostKeyChecking=no'
          ansible_python_interpreter: auto_silent
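
To show how a derived hostname and bare IP map onto this layout, here is a small shell sketch that renders the same structure from "hostname ip" pairs on standard input. The function name render_inventory is hypothetical, and the fixed connection variables simply mirror the example above; Terraform produces the real file.

```shell
# render_inventory (hypothetical helper)
# Reads "hostname ip" pairs from stdin and prints a YAML inventory
# with every host placed in the lxcs group.
render_inventory() {
  printf 'all:\n  children:\n    lxcs:\n      hosts:\n'
  while read -r host ip; do
    printf '        %s:\n' "$host"
    printf '          ansible_host: %s\n' "$ip"
    printf '          ansible_user: root\n'
    printf '          ansible_password: ubuntu\n'
    printf '          ansible_port: 22\n'
    printf "          ansible_ssh_common_args: '-o StrictHostKeyChecking=no'\n"
    printf '          ansible_python_interpreter: auto_silent\n'
  done
}
```

For example, printf 'lxc-5050 192.168.50.50\n' | render_inventory reproduces the single-host inventory shown above.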

3. Stage 2: Ansible Provisioning (Complete)

Status: Validated against 5 LXCs (vmid_base=338, lxc_count=5)

3.1 Playbook Structure

~/docker/ansible-push/playbooks/
├── main.yml                    # Entry point
├── roles/
│   ├── prepare/                # apt update/upgrade
│   ├── nfs_client/             # NFS mount (fleet nodes)
│   └── lxc_common/             # LXC bootstrap
│       └── tasks/main.yml

3.2 lxc_common Role (Updated 2026-06-05)

Tasks execute in order:

  1. Ensure apt cache updated (no_log: true)
  2. Install git (no_log: true)
  3. Install python3-pip (no_log: true)
  4. Create jarvis user (UID 1000, sudo group)
  5. Ensure jarvis .ssh directory
  6. Copy root authorized_keys to jarvis
  7. Passwordless sudo for jarvis
  8. Install ansible via pip (no_log: true, break_system_packages: true)

3.3 Output Noise Reduction

ansible.cfg at ~/docker/ansible-push/ansible.cfg:

  • stdout_callback = dense — grid layout instead of raw dpkg
  • deprecation_warnings = False — silence ansible_os_family nag

3.4 Execution Pattern

# 1. Terraform creates LXCs + generates inventory
cd ~/docker/terraform-pve
TF_VAR_vmid_base=5050 TF_VAR_lxc_count=4 ./run.sh apply -auto-approve

# 2. Fix inventory ownership (terraform container writes as root)
sudo chown jarvis:jarvis ~/docker/ansible-push/terraform-prefill/inventory-lxc.yml

# 3. Ansible provisions
cd ~/docker/ansible-push
docker compose up -d
docker exec -it ansible ansible-playbook playbooks/main.yml \
  -i terraform-prefill/inventory-lxc.yml \
  --limit lxcs \
  --tags lxc_common,prepare

4. Open Questions / Phase 4

| Item | Status | Notes |
| --- | --- | --- |
| Adjustable CPU/RAM/HDD | Deferred | Currently fixed 1vCPU/2GB/8GB |
| Vaulted secrets | Deferred | ansible_password in plaintext inventory |
| N8N orchestration | Deferred | Webhook trigger from Gitea? |
| User switch post-bootstrap | Blocked | First run must be root; jarvis created during run |

5. Known Issues

5.1 PVE Parallel Start Race Condition

  • Creating multiple LXCs in parallel can fail with HTTP 500 "already running"
  • The error is transient; re-running apply resolves it
  • No Terraform-level workaround is needed
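
Because the failure is transient, the re-run can be automated outside Terraform. The wrapper below is a minimal sketch and not part of the existing run.sh: the retry function, its attempt limit, and the RETRY_DELAY variable are all hypothetical names.

```shell
# retry MAX_ATTEMPTS COMMAND [ARGS...] (hypothetical helper)
# Re-runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  max=$1
  shift
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # Pause between attempts; RETRY_DELAY is in seconds (default 10).
    sleep "${RETRY_DELAY:-10}"
  done
}
```

With TF_VAR_vmid_base and TF_VAR_lxc_count exported, retry 3 ./run.sh apply -auto-approve would run apply up to three times before reporting failure.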

5.2 Root-Only First Run

  • Fresh LXCs have only the root user, with an SSH key
  • ansible_user: root is required for initial provisioning
  • The jarvis user is created during the playbook run, not before

5.3 Inventory Ownership

  • Terraform container runs as root, writes inventory as root
  • jarvis cannot modify without chown
  • Future fix: run terraform container as jarvis UID

5.4 Variable Precedence Trap

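  • terraform.auto.tfvars (like any *.auto.tfvars or terraform.tfvars file) takes precedence over TF_VAR_* environment variables
  • Dynamic vars (lxc_count, vmid_base) must NOT be set in any .tfvars file; otherwise the TF_VAR_* values passed at run time are silently overridden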
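
A pre-flight check can catch this before apply. The sketch below scans a Terraform directory for lxc_count or vmid_base assignments in terraform.tfvars or any *.auto.tfvars file. check_tfvars is a hypothetical helper, and the simple pattern match assumes one assignment per line.

```shell
# check_tfvars TERRAFORM_DIR (hypothetical helper)
# Returns 1 if a dynamic variable is pinned in an automatically loaded
# tfvars file, where it would override the TF_VAR_* environment value.
check_tfvars() {
  dir=$1
  for f in "$dir"/terraform.tfvars "$dir"/*.auto.tfvars; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [ -f "$f" ] || continue
    if grep -Eq '^[[:space:]]*(lxc_count|vmid_base)[[:space:]]*=' "$f"; then
      echo "dynamic variable pinned in $f" >&2
      return 1
    fi
  done
  return 0
}
```

A possible use is check_tfvars ~/docker/terraform-pve/terraform && ./run.sh apply -auto-approve, so apply only runs when neither variable is pinned.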

6. File Locations

| Component | Path |
| --- | --- |
| Terraform code | ~/docker/terraform-pve/terraform/ |
| Ansible code | ~/docker/ansible-push/playbooks/ |
| Generated inventory | ~/docker/ansible-push/terraform-prefill/inventory-lxc.yml |
| PRD canonical | ~/documentation/PRDs/terraform-lxc-deployment-batch.md |
| This draft | ~/documentation/PRD Drafts/terraform-lxc-deployment-phase3.md |

7. Decision Log

| Decision | Chosen | Date |
| --- | --- | --- |
| ansible_user | root for all runs | 2026-06-05 |
| ansible_password | ubuntu (matches fleet) | 2026-06-05 |
| SSH key discovery | Container mount /root/.ssh/ auto-discovers id_ed25519 | 2026-06-05 |
| no_log on apt | Enabled to suppress dpkg noise | 2026-06-05 |
| dense callback | Enabled in ansible.cfg | 2026-06-05 |
| Inventory output | Dual: inventory-lxc.yml + timestamped archive | 2026-06-05 |