# N8N Webhook Orchestrator — Terraform LXC + Ansible Provisioning

**Status:** Draft | **Author:** Artemis | **Date:** 2026-06-05

> **Purpose:** N8N on MK7 receives Telegram-triggered webhooks, connects to Artemis over SSH, and runs the existing terraform/ansible containers. It adds no new infrastructure; it only orchestrates what already exists.

---

## 1. Architecture

```
[Telegram: Bobby] → Artemis (parse intent) → POST to N8N (MK7)
↓ SSH (jarvis@artemis.ai.home)
Artemis (this machine)
↓
[A] ~/docker/terraform-pve/run.sh apply
↓
LXC created + inventory generated
↓
[B] ~/docker/ansible-push/lxc-common.sh
↓
LXC provisioned (jarvis + git + ansible)
```

**N8N role:** Trigger and SSH executor only. No Docker socket, no state awareness, no config generation.

**Artemis role:** Hosts the existing `run.sh` and `lxc-common.sh`. Owns the terraform state, ansible inventory, and SSH keys.

---

## 2. Workflow A: `/build` — Create and Provision LXCs

### 2.1 Telegram Input
```
You: "/build 5 lxcs starting at vmid 62128"
Artemis parses → vmid_base=62128, lxc_count=5, specs=default
```
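
The document does not say how Artemis parses intent. As a minimal sketch, assuming the fixed phrasing shown above, a regular expression is enough to turn the message into the payload fields. The pattern and the `parse_build` helper are hypothetical names introduced for illustration.

```python
import re

# Hypothetical pattern for the "/build <count> lxcs starting at vmid <base>"
# phrasing shown above; other phrasings are rejected.
BUILD_RE = re.compile(
    r"^/build\s+(?P<count>\d+)\s+lxcs?\s+starting\s+at\s+vmid\s+(?P<base>\d+)$",
    re.IGNORECASE,
)


def parse_build(message: str) -> dict:
    """Turn a /build chat message into the webhook payload fields."""
    match = BUILD_RE.match(message.strip())
    if match is None:
        raise ValueError(f"unrecognised /build command: {message!r}")
    return {
        "action": "lxc_build",
        "vmid_base": int(match["base"]),
        "lxc_count": int(match["count"]),
        "specs": "default",  # specs are locked to "default" for the POC
    }


print(parse_build("/build 5 lxcs starting at vmid 62128"))
```

Rejecting anything that does not match keeps malformed requests from ever reaching N8N.
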

### 2.2 Webhook Payload (POST to N8N)
```json
{
  "action": "lxc_build",
  "vmid_base": 62128,
  "lxc_count": 5,
  "specs": "default"
}
```
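
Because the payload ends up driving `terraform apply`, it may be worth checking it before any SSH call is made. The sketch below is one possible check, not an existing component: `validate_build_payload` is a hypothetical helper, and the rules simply mirror the fields shown above and the POC restriction on `specs`.

```python
def validate_build_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload is usable."""
    errors = []
    if payload.get("action") != "lxc_build":
        errors.append("action must be 'lxc_build'")
    for key in ("vmid_base", "lxc_count"):
        value = payload.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool) or value < 1:
            errors.append(f"{key} must be a positive integer")
    if payload.get("specs") != "default":
        errors.append("specs must be 'default' during the POC")
    return errors


payload = {"action": "lxc_build", "vmid_base": 62128, "lxc_count": 5, "specs": "default"}
print(validate_build_payload(payload))  # prints []
```
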

### 2.3 N8N Execution Steps

| Step | Node | Action / Command |
|------|------|------------------|
| 1 | Webhook trigger | Receive JSON payload |
| 2 | Set SSH env vars | Export `TF_VAR_lxc_count` and `TF_VAR_vmid_base` from the payload, e.g. `TF_VAR_lxc_count=5 TF_VAR_vmid_base=62128` |
| 3 | Execute SSH | `ssh jarvis@artemis.ai.home "cd ~/docker/terraform-pve && ./run.sh apply -auto-approve"` |
| 4 | Wait | Wait for `run.sh` to exit (the SSH call blocks until completion) |
| 5 | Verify inventory | Check that `~/docker/ansible-push/terraform-prefill/inventory-lxc.yml` exists |
| 6 | Execute SSH | `ssh jarvis@artemis.ai.home "cd ~/docker/ansible-push && ./lxc-common.sh"` |
| 7 | Notify | POST result back to Telegram/Discord |

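
Steps 2, 3, 5 and 6 can be sketched as a short sequence of SSH invocations. One caveat shapes the sketch: variables exported on the N8N side are not visible to the remote shell unless SSH is configured to forward them (`SendEnv` on the client, `AcceptEnv` on the server), so the example assumes the `TF_VAR_*` assignments are placed inline in the remote command. The helper names (`ssh_argv`, `build_steps`, `run_build`) and the use of `BatchMode=yes` are illustrative assumptions, not part of the existing setup.

```python
import shlex
import subprocess

SSH_TARGET = "jarvis@artemis.ai.home"
INVENTORY = "~/docker/ansible-push/terraform-prefill/inventory-lxc.yml"


def ssh_argv(remote_command: str) -> list[str]:
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", SSH_TARGET, remote_command]


def build_steps(vmid_base: int, lxc_count: int) -> list[list[str]]:
    """Return the SSH invocations for steps 3, 5 and 6, in order."""
    tf_vars = {"TF_VAR_lxc_count": lxc_count, "TF_VAR_vmid_base": vmid_base}
    # Inline assignments so the variables exist in the remote shell (assumption).
    assignments = " ".join(f"{k}={shlex.quote(str(v))}" for k, v in tf_vars.items())
    return [
        ssh_argv(f"cd ~/docker/terraform-pve && {assignments} ./run.sh apply -auto-approve"),
        ssh_argv(f"test -f {INVENTORY}"),
        ssh_argv("cd ~/docker/ansible-push && ./lxc-common.sh"),
    ]


def run_build(vmid_base: int, lxc_count: int, runner=subprocess.run) -> bool:
    """Run the steps in order and stop at the first non-zero exit code."""
    for argv in build_steps(vmid_base, lxc_count):
        if runner(argv).returncode != 0:
            return False
    return True
```

Passing `runner` as a parameter keeps the sequencing testable without a live SSH connection; in N8N the same ordering would be expressed as chained nodes.
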

### 2.4 Constraints

- **Specs locked to "default" for POC** (2 cores, 2 GB RAM, 8 GB disk)
- **Custom specs deferred to Phase 4:** requires terraform variable expansion
- **vmid_base range:** Must not overlap existing PVE VMs/LXCs (check before apply)
- **lxc_count max:** Phase 2 validated at N=7; the N=4 run hit a transient 500 race condition

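
The "check before apply" rule for `vmid_base` can be illustrated with a small overlap test. This is a sketch under two assumptions: the new LXCs take consecutive IDs starting at `vmid_base`, and the set of IDs already in use is available from somewhere on Artemis. `vmid_conflicts` is a hypothetical helper and the sample IDs are made up.

```python
def vmid_conflicts(vmid_base: int, lxc_count: int, existing: set[int]) -> list[int]:
    """Return the requested VMIDs that are already in use, in ascending order."""
    requested = range(vmid_base, vmid_base + lxc_count)
    return sorted(existing.intersection(requested))


in_use = {100, 101, 62130}  # made-up example IDs
print(vmid_conflicts(62128, 5, in_use))  # prints [62130]
```

An empty list means the requested range is free and the apply can proceed.
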

---

## 3. Workflow B: `/fleet-update` — Apt Update + Upgrade

### 3.1 Telegram Input
```
You: "/fleet-update"
Artemis parses → action=fleet_update
```

### 3.2 Webhook Payload (POST to N8N)
```json
{
  "action": "fleet_update"
}
```

### 3.3 N8N Execution Steps

| Step | Node | Action / Command |
|------|------|------------------|
| 1 | Webhook trigger | Receive JSON payload |
| 2 | Execute SSH | `ssh jarvis@artemis.ai.home "cd ~/docker/ansible-push && docker compose up -d && docker exec ansible ansible-playbook playbooks/main.yml -i inventory.yml --tags fleet_update"` |
| 3 | Wait | Wait for `ansible-playbook` to exit |
| 4 | Notify | POST result back to Telegram/Discord |

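
For step 4, the notification needs some summary of what Ansible did. One option, sketched here on the assumption that the playbook runs with Ansible's default stdout callback, is to read the `PLAY RECAP` block from the captured output. `summarize_recap` and the host names in the sample are hypothetical.

```python
import re

# One recap line looks like: "host : ok=5 changed=1 unreachable=0 failed=0 ..."
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s+(?P<stats>(?:\w+=\d+\s*)+)$")


def summarize_recap(output: str) -> dict[str, str]:
    """Map each host in the PLAY RECAP block to 'ok', 'failed' or 'unreachable'."""
    summary = {}
    in_recap = False
    for line in output.splitlines():
        if line.startswith("PLAY RECAP"):
            in_recap = True
            continue
        if not in_recap:
            continue
        match = RECAP_LINE.match(line.strip())
        if match is None:
            continue
        stats = {k: int(v) for k, v in re.findall(r"(\w+)=(\d+)", match["stats"])}
        if stats.get("unreachable", 0):
            summary[match["host"]] = "unreachable"
        elif stats.get("failed", 0):
            summary[match["host"]] = "failed"
        else:
            summary[match["host"]] = "ok"
    return summary


SAMPLE = """\
PLAY RECAP *********************************************************************
mk7      : ok=5 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
agent01  : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
"""
print(summarize_recap(SAMPLE))
```

The resulting per-host dictionary can be posted back as-is or formatted into a short chat message.
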

### 3.4 Target Scope

| Included | Excluded |
|----------|----------|
| `managed_nodes` group (from `inventory.yml`) | `pve_hosts` (MK33/34/39): PVE self-manages |
| `physical_agents` | Neo (ZimaOS, not Debian) |
| `core_services` (MK7) | `igor` (ZimaOS NAS) |
| | Ephemeral LXCs (rebuilt from scratch) |

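
The document does not state how this scope is enforced. One possibility, shown purely as a sketch, is to pass an Ansible host pattern through `ansible-playbook --limit`, where `:` joins groups and `:!` excludes them. The group names come from the table above; `neo` as an inventory host name and the `limit_pattern` helper are assumptions.

```python
INCLUDED_GROUPS = ["managed_nodes", "physical_agents", "core_services"]
# "neo" is an assumed inventory name for the Neo host.
EXCLUDED = ["pve_hosts", "neo", "igor"]


def limit_pattern(included: list[str], excluded: list[str]) -> str:
    """Build an Ansible host pattern: union of included, minus each excluded name."""
    return ":".join(list(included) + [f"!{name}" for name in excluded])


print(limit_pattern(INCLUDED_GROUPS, EXCLUDED))
# prints managed_nodes:physical_agents:core_services:!pve_hosts:!neo:!igor
```

When the pattern is passed on a shell command line it should be quoted, since `!` is special in interactive shells.
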

---

## 4. N8N Requirements (MK7)

### 4.1 Container Mounts
- **SSH client:** `openssh-client` package installed in N8N image
- **Private key:** Mount `~/.ssh/artemis_key` → `/root/.ssh/id_ed25519` inside N8N container
- **Known hosts:** Pre-populated `~/.ssh/known_hosts` for `artemis.ai.home`

### 4.2 N8N Credentials
- **SSH Private Key:** Store `artemis_key` in N8N "Credentials" → SSH type
- **SSH Host:** `artemis.ai.home` (or LAN IP `192.168.15.182`)
- **SSH User:** `jarvis`
- **SSH Port:** `22`

### 4.3 Security Constraints
- N8N connects **to Artemis only** and never directly to PVE nodes, Neo, or LXCs
- N8N never sees PVE API tokens or sudo passwords
- All terraform/ansible state stays on the Artemis filesystem (not in the N8N container)

---

## 5. Artemis Prerequisites (Already in Place)

| Component | Path | Status |
|-----------|------|--------|
| Terraform container | `~/docker/terraform-pve/` | ✅ Validated in Phase 2 |
| Ansible container | `~/docker/ansible-push/` | ✅ Validated |
| Run script | `./run.sh` | ✅ Forwards `TF_VAR_*`, supports init/plan/apply/destroy |
| LXC provision script | `./lxc-common.sh` | ✅ Runs `lxc_common` role |
| Inventory template | `terraform/inventory-lxc.tmpl` | ✅ Auto-generates `ansible_host` |

---

## 6. Error Handling

| Scenario | N8N Action |
|----------|------------|
| Terraform apply fails | Abort, notify with stderr |
| Inventory not generated after apply | Retry once, then fail |
| Ansible host unreachable | Report per host, continue with the others |
| SSH connection refused | Retry 3× with backoff, then alert |

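
The SSH retry rule can be sketched as a small wrapper with exponential backoff. "Retry 3×" is interpreted here as three attempts in total, and the 2-second base delay is an arbitrary illustrative value; neither is specified in the table. `retry_with_backoff` is a hypothetical helper.

```python
import time


def retry_with_backoff(operation, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call operation() until it succeeds or attempts run out.

    Waits base_delay, then 2 * base_delay, ... between attempts and
    re-raises the last ConnectionError so the caller can send an alert.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError as exc:  # includes ConnectionRefusedError
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

The `sleep` parameter exists so the delays can be observed or skipped in tests; in N8N itself the equivalent would be the node's retry settings or a loop with a wait node.
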

---

## 7. Open Questions

1. **Should `/build` auto-increment `vmid_base` from the last used value, or always require explicit input?**
2. **Should N8N trigger a Gitea commit of the generated inventory after apply?**
3. **Should `/fleet-update` include PVE nodes, handled differently (via `apt`, not `dist-upgrade`)?**
4. **Should the N8N webhook URL be exposed via Tailscale, or on the local LAN only?**

---

## 8. Decision Points

| Decision | Options | Recommended |
|----------|---------|-------------|
| N8N SSH key | `artemis_key` vs dedicated `n8n_key` | `artemis_key` for POC; rotate to dedicated key later |
| Notification target | Telegram vs Discord vs both | Both via existing gateway webhook |
| vmid_base tracking | Manual vs auto-increment vs check-before-apply | Manual for POC; auto-track in Phase 4 |
| Fleet-update schedule | On-demand only vs weekly cron | On-demand only via `/fleet-update` |