From df965892d5a0935a7d81d3eb4bbbebe7d5fefb13 Mon Sep 17 00:00:00 2001
From: "F.R.I.D.A.Y."
Date: Fri, 5 Jun 2026 21:33:47 -0400
Subject: [PATCH] Draft: N8N webhook orchestrator for terraform LXC + ansible provisioning
---
 .../n8n-terraform-ansible-orchestrator.md | 164 ++++++++++++++++++
 1 file changed, 164 insertions(+)
 create mode 100644 PRD Drafts/n8n-terraform-ansible-orchestrator.md

diff --git a/PRD Drafts/n8n-terraform-ansible-orchestrator.md b/PRD Drafts/n8n-terraform-ansible-orchestrator.md
new file mode 100644
index 0000000..fe9bc22
--- /dev/null
+++ b/PRD Drafts/n8n-terraform-ansible-orchestrator.md
@@ -0,0 +1,164 @@
+# N8N Webhook Orchestrator — Terraform LXC + Ansible Provisioning
+
+**Status:** Draft | **Author:** Artemis | **Date:** 2026-06-05
+
+> **Purpose:** N8N on MK7 receives Telegram-triggered webhooks, SSHs to Artemis, and executes existing terraform/ansible containers. No new infrastructure — orchestrates what already exists.
+
+---
+
+## 1. Architecture
+
+```
+[Telegram: Bobby] → Artemis (parse intent) → POST to N8N (MK7)
+↓ SSH (jarvis@artemis.ai.home)
+Artemis (this machine)
+↓
+[A] ~/docker/terraform-pve/run.sh apply
+↓
+LXC created + inventory generated
+↓
+[B] ~/docker/ansible-push/lxc-common.sh
+↓
+LXC provisioned (jarvis + git + ansible)
+```
+
+**N8N role:** Trigger + SSH executor only. No Docker socket, no state awareness, no config generation.
+
+**Artemis role:** Hosts existing run.sh + lxc-common.sh. Owns terraform state, ansible inventory, SSH keys.
+
+---
+
+## 2. Workflow A: `/build` — Create and Provision LXCs
+
+### 2.1 Telegram Input
+```
+You: "/build 5 lxcs starting at vmid 62128"
+Artemis parses → vmid_base=62128, count=5, specs=default
+```
+
+### 2.2 Webhook Payload (POST to N8N)
+```json
+{
+  "action": "lxc_build",
+  "vmid_base": 62128,
+  "lxc_count": 5,
+  "specs": "default"
+}
+```
+
+### 2.3 N8N Execution Steps
+
+| Step | Node | Command |
+|------|------|---------|
+| 1 | Webhook trigger | Receive JSON payload |
+| 2 | Set SSH env vars | Build `TF_VAR_lxc_count=5 TF_VAR_vmid_base=62128` from the payload (must be set in the remote shell; exports made inside N8N do not cross SSH by default) |
+| 3 | Execute SSH | `ssh jarvis@artemis.ai.home "cd ~/docker/terraform-pve && TF_VAR_lxc_count=5 TF_VAR_vmid_base=62128 ./run.sh apply -auto-approve"` |
+| 4 | Wait | Wait for `run.sh` to exit (the SSH call blocks until completion) |
+| 5 | Verify inventory | Check that `~/docker/ansible-push/terraform-prefill/inventory-lxc.yml` exists |
+| 6 | Execute SSH | `ssh jarvis@artemis.ai.home "cd ~/docker/ansible-push && ./lxc-common.sh"` |
+| 7 | Notify | POST result back to Telegram/Discord |
+
+### 2.4 Constraints
+
+- **Specs locked to "default" for POC** (2 cores, 2GB RAM, 8GB disk)
+- **Custom specs deferred to Phase 4** — requires terraform variable expansion
+- **vmid_base range:** Must not overlap existing PVE VMs/LXCs (check before apply)
+- **lxc_count max:** Phase 2 validated at N=7; the N=4 run hit a transient 500 race condition
+
+---
+
+## 3. Workflow B: `/fleet-update` — Apt Update + Upgrade
+
+### 3.1 Telegram Input
+```
+You: "/fleet-update"
+Artemis parses → action=fleet_update
+```
+
+### 3.2 Webhook Payload (POST to N8N)
+```json
+{
+  "action": "fleet_update"
+}
+```
+
+### 3.3 N8N Execution Steps
+
+| Step | Node | Command |
+|------|------|---------|
+| 1 | Webhook trigger | Receive JSON payload |
+| 2 | Execute SSH | `ssh jarvis@artemis.ai.home "cd ~/docker/ansible-push && docker compose up -d && docker exec ansible ansible-playbook playbooks/main.yml -i inventory.yml --tags fleet_update"` |
+| 3 | Wait | Wait for `ansible-playbook` to exit |
+| 4 | Notify | POST result back to Telegram/Discord |
+
+### 3.4 Target Scope
+
+| Included | Excluded |
+|----------|----------|
+| `managed_nodes` group (from inventory.yml) | `pve_hosts` (MK33/34/39) — PVE self-manages |
+| `physical_agents` | Neo (ZimaOS, not Debian) |
+| `core_services` (MK7) | `igor` (ZimaOS NAS) |
+| | Ephemeral LXCs — rebuilt from scratch |
+
+---
+
+## 4. N8N Requirements (MK7)
+
+### 4.1 Container Mounts
+- **SSH client:** `openssh-client` package installed in N8N image
+- **Private key:** Mount `~/.ssh/artemis_key` → `/root/.ssh/id_ed25519` inside N8N container
+- **Known hosts:** Pre-populated `~/.ssh/known_hosts` for `artemis.ai.home`
+
+### 4.2 N8N Credentials
+- **SSH Private Key:** Store `artemis_key` in N8N "Credentials" → SSH type
+- **SSH Host:** `artemis.ai.home` (or LAN IP `192.168.15.182`)
+- **SSH User:** `jarvis`
+- **SSH Port:** `22`
+
+### 4.3 Security Constraints
+- N8N connects **to Artemis only** — never to PVE nodes, Neo, or LXCs directly
+- N8N never sees PVE API tokens or sudo passwords
+- All terraform/ansible state stays on Artemis filesystem (not in N8N container)
+
+---
+
+## 5. Artemis Prerequisites (Already Exist)
+
+| Component | Path | Status |
+|-----------|------|--------|
+| Terraform container | `~/docker/terraform-pve/` | ✅ Validated Phase 2 |
+| Ansible container | `~/docker/ansible-push/` | ✅ Validated |
+| Run script | `./run.sh` | ✅ Forwards TF_VAR_*, supports init/plan/apply/destroy |
+| LXC provision script | `./lxc-common.sh` | ✅ Runs lxc_common role |
+| Inventory template | `terraform/inventory-lxc.tmpl` | ✅ Auto-generates ansible_host |
+
+---
+
+## 6. Error Handling
+
+| Scenario | N8N Action |
+|----------|------------|
+| Terraform apply fails | Abort, notify with stderr |
+| Inventory not generated after apply | Retry once, then fail |
+| Ansible unreachable | Report per host, continue with the others |
+| SSH connection refused | Retry 3× with backoff, then alert |
+
+---
+
+## 7. Open Questions
+
+1. **Should `/build` auto-increment `vmid_base` from last used, or always require explicit input?**
+2. **Should N8N trigger a Gitea commit of generated inventory after apply?**
+3. **Should `/fleet-update` also cover PVE nodes, handled separately with `apt` (not `dist-upgrade`)?**
+4. **Should the N8N webhook URL be exposed via Tailscale, or on the local LAN only?**
+
+---
+
+## 8. Decision Points
+
+| Decision | Options | Recommended |
+|----------|---------|-------------|
+| N8N SSH key | `artemis_key` vs dedicated `n8n_key` | `artemis_key` for POC; rotate to dedicated key later |
+| Notification target | Telegram vs Discord vs both | Both via existing gateway webhook |
+| vmid_base tracking | Manual vs auto-increment vs check-before-apply | Manual for POC; auto-track in Phase 4 |
+| Fleet-update schedule | On-demand only vs weekly cron | On-demand only via `/fleet-update` |
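
Reviewer sketch (not part of the patch): the `/build` path in sections 2.1 to 2.3 could be prototyped roughly as below. This is a minimal illustration under stated assumptions, not the author's implementation. The function names, the regex grammar for the Telegram message, and the cap of 7 (taken from the "validated at N=7" note in 2.4) are all hypothetical. The sketch places the `TF_VAR_*` values inline in the remote command, on the assumption that variables exported on the N8N side are not forwarded over SSH unless `SendEnv`/`AcceptEnv` are configured.

```python
import re

# Assumption: cap taken from the "validated at N=7" note in section 2.4.
MAX_LXC_COUNT = 7
# Host and user as given in section 4.2.
SSH_TARGET = "jarvis@artemis.ai.home"

# Hypothetical grammar: the draft shows only one example message,
# "/build 5 lxcs starting at vmid 62128".
_BUILD_RE = re.compile(
    r"^/build\s+(\d+)\s+lxcs?\s+starting\s+at\s+vmid\s+(\d+)$",
    re.IGNORECASE,
)


def parse_build_command(text: str) -> dict:
    """Turn the Telegram message into the webhook payload from section 2.2."""
    match = _BUILD_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized /build command: {text!r}")
    count = int(match.group(1))
    vmid_base = int(match.group(2))
    if not 1 <= count <= MAX_LXC_COUNT:
        raise ValueError(f"lxc_count must be between 1 and {MAX_LXC_COUNT}")
    return {
        "action": "lxc_build",
        "vmid_base": vmid_base,
        "lxc_count": count,
        "specs": "default",  # specs are locked to "default" for the POC
    }


def build_terraform_ssh_argv(payload: dict) -> list:
    """Build the argv for step 3, with TF_VAR_* set in the remote shell."""
    if payload.get("action") != "lxc_build":
        raise ValueError("payload is not an lxc_build request")
    # int() rejects non-numeric input, so only digits are interpolated
    # into the remote command string.
    count = int(payload["lxc_count"])
    vmid_base = int(payload["vmid_base"])
    remote = (
        "cd ~/docker/terraform-pve && "
        f"TF_VAR_lxc_count={count} TF_VAR_vmid_base={vmid_base} "
        "./run.sh apply -auto-approve"
    )
    return ["ssh", SSH_TARGET, remote]


if __name__ == "__main__":
    example = parse_build_command("/build 5 lxcs starting at vmid 62128")
    print(example)
    print(build_terraform_ssh_argv(example))
```

The argv list is meant to be handed to something like `subprocess.run(argv, check=True)`, which blocks until `run.sh` exits and so would also cover step 4. Checking that `vmid_base` does not collide with existing PVE guests is left out because the draft does not say how that check is performed.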