Draft: Phase 3 PRD - Terraform LXC to Ansible provisioning pipeline
178
PRD Drafts/terraform-lxc-deployment-phase3.md
Normal file
@@ -0,0 +1,178 @@
# Terraform LXC Deployment - Phase 3: Ansible-Integrated Pipeline

**Status:** Draft | **Author:** Artemis | **Date:** 2026-06-05

> **Goal:** Extend the validated Phase 2 batch pipeline into a complete **create-and-provision** workflow. Terraform creates the LXCs and generates an Ansible inventory; Ansible then installs git, python3-pip, and ansible on each LXC. A future Stage 4 adds N8N orchestration.

---

## 1. Pipeline Overview

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Trigger   │────▶│  Terraform  │────▶│  Inventory  │────▶│   Ansible   │
│  (manual /  │     │  (Docker)   │     │   (YAML)    │     │  (Docker)   │
│    N8N)     │     │   Creates   │     │  Generated  │     │  Provisions │
└─────────────┘     │   LXCs on   │     │  per apply  │     │  LXC group  │
                    │     PVE     │     └─────────────┘     └─────────────┘
                    └─────────────┘
```

---

## 2. Stage 1: Terraform LXC Batch Factory (Complete)

**Status:** ✅ Validated at N=4 and N=7 on MK33

### 2.1 Dynamic Derivation

| Input | Example | Description |
|-------|---------|-------------|
| `vmid_base` | `5050` | Starting VMID |
| `lxc_count` | `4` | Number of LXCs to create |
| `subnet_prefix` | `192.168` | First two octets of the IPv4 address |

**Auto-derived per LXC (index `i`):**
- **VMID:** `vmid_base + i`
- **Hostname:** `lxc-${vmid}`
- **IPv4:** `${subnet_prefix}.${first2(vmid)}.${last2(vmid)}/18`, where `first2` and `last2` are the first two and last two digits of the VMID (VMID `5050` gives `192.168.50.50/18`)
- **IPv4 host (Ansible):** the same address as a bare IP, with the CIDR suffix stripped
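
The derivation rules above can be sketched in a few lines of shell. This is only an illustration, not the Terraform code itself: `derive_lxc` is a hypothetical helper, and it assumes 4-digit VMIDs, for which the first two and last two digits equal `vmid / 100` and `vmid % 100`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): derive the per-LXC values
# for index i. Assumes 4-digit VMIDs, so first2 = vmid / 100, last2 = vmid % 100.
derive_lxc() {
  local subnet_prefix=$1 vmid_base=$2 i=$3
  local vmid=$(( vmid_base + i ))
  local first2=$(( vmid / 100 ))
  local last2=$(( vmid % 100 ))
  local ip="${subnet_prefix}.${first2}.${last2}"
  # Fields: VMID, hostname, IPv4 with CIDR, bare IPv4 for Ansible
  printf '%s lxc-%s %s/18 %s\n' "$vmid" "$vmid" "$ip" "$ip"
}

# Print the derived values for a batch of four LXCs starting at VMID 5050.
for i in 0 1 2 3; do
  derive_lxc 192.168 5050 "$i"
done
```

With the example inputs from the table, the first line printed is `5050 lxc-5050 192.168.50.50/18 192.168.50.50`, which matches the generated inventory entry in section 2.3.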

### 2.2 Inventory Generation (NEW)

Every `terraform apply` writes two files:
- `inventory-lxc.yml`: the latest inventory, overwritten on each apply
- `inventory-lxc-<timestamp>.yml`: an archive copy

Both are written to `/ansible-push/terraform-prefill/` via a Docker Compose mount; on the host they appear under `~/docker/ansible-push/terraform-prefill/`.
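
The latest-plus-archive naming can be illustrated with a small shell sketch. In the pipeline Terraform performs both writes; `write_inventory_pair` is a hypothetical helper, and the timestamp format used here is an assumption, since this draft does not specify one.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write inventory text from stdin to both the "latest"
# file and a timestamped archive copy in the given directory.
write_inventory_pair() {
  local dir=$1 stamp=$2
  mkdir -p "$dir"
  # tee writes stdin to the latest file and passes it on to the archive copy.
  tee "$dir/inventory-lxc.yml" > "$dir/inventory-lxc-${stamp}.yml"
}

# Example run in a scratch directory. The timestamp format is an assumption.
out_dir=$(mktemp -d)
printf 'all:\n  children: {}\n' \
  | write_inventory_pair "$out_dir" "$(date -u +%Y%m%dT%H%M%SZ)"
ls "$out_dir"
```

Keeping a fixed name for the latest file lets the Ansible step always use the same `-i` path, while the archive copies preserve a record of each apply.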

### 2.3 Generated Inventory Format

```yaml
all:
  children:
    lxcs:
      hosts:
        lxc-5050:
          ansible_host: 192.168.50.50
          ansible_user: root
          ansible_password: ubuntu
          ansible_port: 22
          ansible_ssh_common_args: '-o StrictHostKeyChecking=no'
          ansible_python_interpreter: auto_silent
```
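
For more than one host the same structure repeats under `hosts:`. The sketch below renders that layout in shell for a list of `hostname ip` pairs; `render_inventory` is a hypothetical stand-in for the Terraform template, with the per-host values copied from the example above.

```bash
#!/usr/bin/env bash
# Hypothetical renderer: print an inventory in the format shown above for
# "hostname ip" pairs read from stdin, one host per line.
render_inventory() {
  local host ip
  printf 'all:\n  children:\n    lxcs:\n      hosts:\n'
  while read -r host ip; do
    printf '        %s:\n' "$host"
    printf '          ansible_host: %s\n' "$ip"
    printf '          ansible_user: root\n'
    printf '          ansible_password: ubuntu\n'
    printf '          ansible_port: 22\n'
    printf "          ansible_ssh_common_args: '-o StrictHostKeyChecking=no'\n"
    printf '          ansible_python_interpreter: auto_silent\n'
  done
}

# Render a two-host inventory to stdout.
printf 'lxc-5050 192.168.50.50\nlxc-5051 192.168.50.51\n' | render_inventory
```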

---

## 3. Stage 2: Ansible Provisioning (Complete)

**Status:** ✅ Validated against 5 LXCs (vmid_base=338, lxc_count=5)

### 3.1 Playbook Structure

```
~/docker/ansible-push/playbooks/
├── main.yml              # Entry point
├── roles/
│   ├── prepare/          # apt update/upgrade
│   ├── nfs_client/       # NFS mount (fleet nodes)
│   └── lxc_common/       # LXC bootstrap
│       └── tasks/main.yml
```

### 3.2 lxc_common Role (Updated 2026-06-05)

Tasks execute in order:

1. **Ensure apt cache updated** (`no_log: true`)
2. **Install git** (`no_log: true`)
3. **Install python3-pip** (`no_log: true`)
4. **Create jarvis user** (UID 1000, sudo group)
5. **Ensure jarvis .ssh directory**
6. **Copy root authorized_keys to jarvis**
7. **Passwordless sudo for jarvis**
8. **Install ansible via pip** (`no_log: true`, `break_system_packages: true`)

### 3.3 Output Noise Reduction

`ansible.cfg` at `~/docker/ansible-push/ansible.cfg` sets:
- `stdout_callback = dense`: grid layout instead of raw dpkg output
- `deprecation_warnings = False`: silences the `ansible_os_family` deprecation warning

### 3.4 Execution Pattern

```bash
# 1. Terraform creates LXCs + generates inventory
cd ~/docker/terraform-pve
TF_VAR_vmid_base=5050 TF_VAR_lxc_count=4 ./run.sh apply -auto-approve

# 2. Fix inventory ownership (terraform container writes as root)
sudo chown jarvis:jarvis ~/docker/ansible-push/terraform-prefill/inventory-lxc.yml

# 3. Ansible provisions
cd ~/docker/ansible-push
docker compose up -d
docker exec -it ansible ansible-playbook playbooks/main.yml \
  -i terraform-prefill/inventory-lxc.yml \
  --limit lxcs \
  --tags lxc_common,prepare
```

---

## 4. Open Questions / Phase 4

| Item | Status | Notes |
|------|--------|-------|
| Adjustable CPU/RAM/HDD | ❌ Deferred | Currently fixed at 1 vCPU / 2 GB RAM / 8 GB disk |
| Vaulted secrets | ❌ Deferred | `ansible_password` is stored in plaintext in the inventory |
| N8N orchestration | ❌ Deferred | Webhook trigger from Gitea? |
| User switch post-bootstrap | ❌ Blocked | First run must be `root`; `jarvis` is created during the run |

---

## 5. Known Issues

### 5.1 PVE Parallel Start Race Condition
- Creating multiple LXCs in parallel can fail with an HTTP 500 "already running" error
- The error is transient; re-running `apply` resolves it
- No Terraform-level workaround is needed
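
If the manual re-run becomes tedious, it could be wrapped outside Terraform. The following is a minimal sketch of a generic retry helper; `retry`, the attempt count, and the delay are all hypothetical and not part of the current pipeline.

```bash
#!/usr/bin/env bash
# Hypothetical retry helper: run a command up to max_attempts times,
# pausing delay_seconds between failed attempts.
retry() {
  local max_attempts=$1 delay_seconds=$2
  shift 2
  local attempt=1
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "retry: giving up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$(( attempt + 1 ))
    sleep "$delay_seconds"
  done
}

# Possible use in the pipeline (not executed here):
#   export TF_VAR_vmid_base=5050 TF_VAR_lxc_count=4
#   retry 3 10 ./run.sh apply -auto-approve
```

Because the race is transient, a small bounded retry is usually enough; an unbounded loop would hide genuine failures.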

### 5.2 Root-Only First Run
- Fresh LXCs have only the `root` user, with an SSH key installed
- `ansible_user: root` is required for initial provisioning
- The `jarvis` user is created during the playbook, not before

### 5.3 Inventory Ownership
- The Terraform container runs as `root` and writes the inventory as `root`
- `jarvis` cannot modify the files without a `chown` first
- Future fix: run the Terraform container under the `jarvis` UID
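
Until that fix lands, a quick check can show which generated files still need a `chown`, including the timestamped archives that step 2 of the execution pattern does not touch. `foreign_inventories` is a hypothetical helper, sketched under the assumption that all generated files match `inventory-lxc*.yml`.

```bash
#!/usr/bin/env bash
# Hypothetical check: list generated inventory files in a directory that are
# NOT owned by the given user (name or numeric UID), i.e. files that still
# need a chown.
foreign_inventories() {
  local dir=$1 owner=$2
  find "$dir" -maxdepth 1 -type f -name 'inventory-lxc*.yml' ! -user "$owner"
}

# Possible use on the host (not executed here; xargs -r is a GNU extension):
#   foreign_inventories ~/docker/ansible-push/terraform-prefill jarvis \
#     | xargs -r sudo chown jarvis:jarvis
```

An empty result means every generated inventory is already owned by the expected user.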

### 5.4 Variable Precedence Trap
- Values in `terraform.auto.tfvars` take precedence over `TF_VAR_*` environment variables, so a value left in the file silently overrides the one set in the environment
- Dynamic variables (`lxc_count`, `vmid_base`) must therefore NOT be set in any `.tfvars` file
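
A pre-flight check can catch this trap before an apply. The sketch below is a hypothetical helper, `check_dynamic_vars`, that scans every `*.tfvars` file in a directory for assignments to the two dynamic variables; it could be run against `~/docker/terraform-pve/terraform/` before `./run.sh apply`, assuming that is where the tfvars files live.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: fail if a dynamic variable is assigned in any
# *.tfvars file in the given directory, where it could override TF_VAR_* values.
check_dynamic_vars() {
  local dir=$1 status=0 var file
  for var in lxc_count vmid_base; do
    for file in "$dir"/*.tfvars; do
      [ -e "$file" ] || continue   # the glob matched nothing
      if grep -Eq "^[[:space:]]*${var}[[:space:]]*=" "$file"; then
        echo "error: ${var} is set in ${file}" >&2
        status=1
      fi
    done
  done
  return "$status"
}
```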

---

## 6. File Locations

| Component | Path |
|-----------|------|
| Terraform code | `~/docker/terraform-pve/terraform/` |
| Ansible code | `~/docker/ansible-push/playbooks/` |
| Generated inventory | `~/docker/ansible-push/terraform-prefill/inventory-lxc.yml` |
| PRD canonical | `~/documentation/PRDs/terraform-lxc-deployment-batch.md` |
| This draft | `~/documentation/PRD Drafts/terraform-lxc-deployment-phase3.md` |

---

## 7. Decision Log

| Decision | Chosen | Date |
|----------|--------|------|
| `ansible_user` | `root` for all runs | 2026-06-05 |
| `ansible_password` | `ubuntu` (matches fleet) | 2026-06-05 |
| SSH key discovery | Container mount `/root/.ssh/` auto-discovers `id_ed25519` | 2026-06-05 |
| `no_log` on apt | Enabled to suppress dpkg noise | 2026-06-05 |
| `dense` callback | Enabled in `ansible.cfg` | 2026-06-05 |
| Inventory output | Dual: `inventory-lxc.yml` + timestamped archive | 2026-06-05 |