From fa7a6a2669807b317226e0771cb02db2c7a7d095 Mon Sep 17 00:00:00 2001
From: "F.R.I.D.A.Y."
Date: Tue, 2 Jun 2026 06:31:15 -0400
Subject: [PATCH] PRD Updates: Fix MK7/Neo references; add Atlantis section; new Ansible Web UI comparison PRD

---
 .../ansible-automation-webui-comparison.md | 316 ++++++++++++++++++
 .../terraform-proxmox-lxc-automation.md    |  88 ++++-
 2 files changed, 396 insertions(+), 8 deletions(-)
 create mode 100644 PRD Drafts/ansible-automation-webui-comparison.md

diff --git a/PRD Drafts/ansible-automation-webui-comparison.md b/PRD Drafts/ansible-automation-webui-comparison.md
new file mode 100644
index 0000000..2c85151
--- /dev/null
+++ b/PRD Drafts/ansible-automation-webui-comparison.md
@@ -0,0 +1,316 @@
+# Ansible Automation Web UI Comparison PRD
+
+**Status:** Draft | **Author:** F.R.I.D.A.Y. (Hermes Agent) | **Date:** 2026-06-02
+
+---
+
+## 1. Purpose & Scope
+
+This PRD evaluates web-based UIs for running and managing Ansible playbooks in the Iron Legion fleet. The focus is on self-hosted, Docker-friendly solutions that integrate with our existing Gitea SCM and are deployable on Swarm or standalone nodes.
+
+**Tools Evaluated:**
+1. Semaphore UI (Ansible-native) — RECOMMENDED
+2. Kestra (Generic orchestration, Ansible-compatible)
+3. AWX (Official Red Hat Ansible platform)
+4. Rundeck (Ops automation with Ansible plugin)
+5. Jenkins + Ansible Plugin (CI/CD generalist)
+
+---
+
+## 2. Requirements
+
+**Must-Have:**
+- [x] Docker Compose or Swarm deployable
+- [x] Ansible playbook execution (not just shell scripts calling ansible)
+- [x] Web UI for triggering runs, viewing logs, managing inventories
+- [x] Self-hosted (no cloud dependency)
+- [x] Works on Iron Legion architecture (x86_64, moderate RAM)
+
+**Nice-to-Have:**
+- [ ] Gitea webhook integration (auto-trigger on push)
+- [ ] RBAC / multi-user access
+- [ ] API for automation
+- [ ] Scheduled runs (cron-like)
+- [ ] Low resource footprint (fit on G9 nodes)
+
+---
+
+## 3. Comparison Matrix
+
+| Criterion | Semaphore UI | Kestra | AWX | Rundeck | Jenkins + Ansible |
+|-----------|-------------|--------|-----|---------|-------------------|
+| **Primary Purpose** | Ansible-native runner | Generic workflow engine | Enterprise Ansible platform | Ops automation | CI/CD generalist |
+| **Docker Compose** | ✅ Simple | ✅ Simple | ⚠️ Complex (K8s preferred) | ✅ Simple | ✅ Simple |
+| **RAM Needed** | ~256 MB | ~512 MB | ~4 GB (6+ GB recommended) | ~512 MB | ~1 GB |
+| **Ansible Integration** | Native | Via shell/HTTP tasks | Native | Plugin-based | Plugin-based |
+| **Inventory Management** | Built-in (static + dynamic) | Via external files | Advanced (sources, scripts) | Basic | Via files/plugins |
+| **Gitea Webhooks** | ✅ Supported | ✅ Supported | ⚠️ Requires AWX project sync | ✅ Via plugin | ✅ Via SCM polling |
+| **RBAC / Multi-user** | ✅ | ✅ | ✅ Enterprise-grade | ✅ | ✅ Plugin-based |
+| **Scheduled Runs** | ✅ Cron UI | ✅ Triggers | ✅ Schedules | ✅ Jobs scheduler | ✅ Cron trigger plugin |
+| **Log Viewer** | ✅ Real-time | ✅ Real-time | ✅ Real-time + facts | ✅ | ✅ Plugin-dependent |
+| **Vault Integration** | ✅ Key store built-in | Via secrets | ✅ Native | Via plugins | Via plugins |
+| **Complexity** | Low | Medium | High | Medium | High |
+
+---
+
+## 4. Tool Deep-Dives
+
+### 4.1 Semaphore UI (RECOMMENDED)
+
+**Why it wins:** Purpose-built for Ansible, minimal footprint, fast UI, and fits Iron Legion constraints.
+
+**Docker Compose:**
+
+```yaml
+services:
+  mysql:
+    image: mysql:8.0
+    environment:
+      MYSQL_ROOT_PASSWORD: semaphore-db-password
+      MYSQL_DATABASE: semaphore
+      MYSQL_USER: semaphore
+      MYSQL_PASSWORD: semaphore-db-password
+    volumes:
+      - semaphore-mysql:/var/lib/mysql
+    restart: unless-stopped
+
+  semaphore:
+    image: semaphoreui/semaphore:latest
+    ports:
+      - "3000:3000"
+    environment:
+      SEMAPHORE_DB_DIALECT: mysql
+      SEMAPHORE_DB_HOST: mysql
+      SEMAPHORE_DB_NAME: semaphore
+      SEMAPHORE_DB_USER: semaphore
+      SEMAPHORE_DB_PASS: semaphore-db-password
+      SEMAPHORE_ADMIN_PASSWORD: admin-password
+      SEMAPHORE_ADMIN_NAME: admin
+      SEMAPHORE_ADMIN_EMAIL: admin@localhost
+      SEMAPHORE_ADMIN: admin
+      # Optional: Telegram / Slack / Gitea integration
+      SEMAPHORE_WEBHOOK: "1"
+    volumes:
+      - semaphore-config:/etc/semaphore
+      - /path/to/ansible/playbooks:/playbooks:ro
+      - /path/to/inventories:/inventories:ro
+      - /path/to/ssh/keys:/ssh:ro
+    depends_on:
+      - mysql
+    restart: unless-stopped
+
+volumes:
+  semaphore-mysql:
+    driver: local
+  semaphore-config:
+    driver: local
+```
+
+**Key Features:**
+- **Project-centric:** Organize playbooks into projects with separate inventories, env vars, and access
+- **Task Templates:** Define reusable job definitions with variables and surveys
+- **Key Store:** Built-in encrypted vault for SSH keys, passwords, Ansible vault passwords
+- **Cron Schedules:** UI-driven scheduling without crontab
+- **Real-time Logs:** WebSocket-based live log streaming
+- **Gitea Integration:** Add a Gitea repository as a project, clone on each run, webhooks for auto-trigger
+
+**Resource Footprint:**
+- MySQL: ~200 MB RAM
+- Semaphore: ~50–100 MB RAM
+- Total: **~300 MB** — deployable on any G9 worker node
+
+**Cons:**
+- Smaller community than AWX/Jenkins
+- Less granular RBAC than AWX
+- No built-in credential plugins (e.g., HashiCorp Vault) — must use env vars or files
+
+---
+
+### 4.2 Kestra
+
+**What it is:** Language-agnostic workflow orchestration platform with a visual DAG editor. Not Ansible-specific, but can invoke Ansible via `io.kestra.plugin.scripts.shell.Commands` or `io.kestra.plugin.core.http.Request`.
+
+**Docker Compose:**
+
+```yaml
+volumes:
+  postgres-data:
+    driver: local
+  kestra-data:
+    driver: local
+
+services:
+  postgres:
+    image: postgres:18
+    volumes:
+      - postgres-data:/var/lib/postgresql
+    environment:
+      POSTGRES_DB: kestra
+      POSTGRES_USER: kestra
+      POSTGRES_PASSWORD: k3str4
+
+  kestra:
+    image: kestra/kestra:latest
+    user: "root"
+    command: server standalone
+    volumes:
+      - kestra-data:/app/storage
+      - /var/run/docker.sock:/var/run/docker.sock
+      - /tmp/kestra-wd:/tmp/kestra-wd
+      - /path/to/ansible:/ansible:ro
+    environment:
+      KESTRA_CONFIGURATION: |
+        datasources:
+          postgres:
+            url: jdbc:postgresql://postgres:5432/kestra
+            password: k3str4
+        repository:
+          type: postgres
+        storage:
+          type: local
+          local:
+            base-path: "/app/storage"
+        queue:
+          type: postgres
+        url: http://localhost:8080/
+    ports:
+      - "8080:8080"
+    depends_on:
+      - postgres
+```
+
+**Key Features:**
+- **Visual DAG Editor:** Drag-and-drop workflow construction
+- **Rich Triggers:** Schedule, webhook, event-driven (Kafka, S3, HTTP)
+- **Plugin Ecosystem:** 400+ plugins (not Ansible-native — invoke via shell)
+- **Scalability:** Built for large-scale data pipelines; may be overkill for fleet Ansible
+
+**Resource Footprint:**
+- PostgreSQL: ~300 MB RAM
+- Kestra: ~512 MB–1 GB RAM
+- Total: **~1 GB** — heavier than Semaphore
+
+**Verdict for Iron Legion:** Powerful but misaligned. We need Ansible-native execution, not generic workflow orchestration. Use Kestra for data/ETL pipelines, not playbook management.
+
+---
+
+### 4.3 AWX
+
+**What it is:** The upstream open-source project behind Ansible Automation Platform (formerly Ansible Tower). Full-featured enterprise Ansible management.
+
+**Key Features:**
+- **Projects:** Link to Git repos (Gitea supported), auto-sync on push
+- **Inventories:** Static, dynamic (custom scripts, cloud providers), smart inventories
+- **Job Templates:** Parameterized with surveys, credentials, and RBAC
+- **Workflows:** Chain multiple job templates into visual pipelines
+- **RBAC:** Teams, organizations, user roles — most granular of all options
+- **Notifications:** Email, Slack, webhook on job success/failure
+
+**Deployment:**
+- Docker Compose exists but is officially a **development** target; production requires Kubernetes
+- Requires Redis, PostgreSQL, memcached, and multiple AWX services
+- Total RAM: **4–6 GB minimum**
+
+**Verdict for Iron Legion:** Overkill. Our fleet nodes (G9: ~11 GB RAM) could run AWX, but it would consume half a node's capacity. G9 nodes are better used as PVE workers with LXCs. AWX belongs on a dedicated management VM or MK7 if hardware permits.
+
+---
+
+### 4.4 Rundeck
+
+**What it is:** Open-source operations automation platform with an Ansible plugin.
+
+**Docker Compose:** Simple single-container deployment with external database.
+
+**Key Features:**
+- **Job Definitions:** YAML or XML, supports Ansible ad-hoc and playbook execution
+- **Node Inventory:** Static or dynamic via Ansible inventory scripts
+- **ACL Policies:** File-based RBAC
+- **Scheduled Executions:** Built-in scheduler
+- **Plugin Architecture:** Ansible, Slack, HTTP webhooks
+
+**Resource Footprint:**
+- Rundeck: ~512 MB RAM
+- MySQL/PostgreSQL: ~200–300 MB
+- Total: **~700–800 MB**
+
+**Verdict for Iron Legion:** Viable middle-ground. Better than Jenkins for Ansible, but Semaphore is purpose-built and lighter. Rundeck's strength is multi-tool orchestration (Ansible + scripts + HTTP APIs), which we don't need yet.
+
+---
+
+### 4.5 Jenkins + Ansible Plugin
+
+**What it is:** General-purpose CI/CD platform with Ansible integration via plugins.
+
+**Docker Compose:**
+
+```yaml
+services:
+  jenkins:
+    image: jenkins/jenkins:lts
+    ports:
+      - "8080:8080"
+      - "50000:50000"
+    volumes:
+      - jenkins-data:/var/jenkins_home
+      - /path/to/ansible/playbooks:/playbooks:ro
+      - /path/to/inventories:/inventories:ro
+    restart: unless-stopped
+
+volumes:
+  jenkins-data:
+    driver: local
+```
+
+**Key Features:**
+- **Pipelines:** Groovy-based Jenkinsfile pipelines for Ansible execution
+- **Blue Ocean:** Modern UI for pipeline visualization
+- **Plugin Ecosystem:** Massive library (Ansible, Slack, Git, Gitea)
+- **Distributed Builds:** Agent nodes for parallel playbook runs
+
+**Resource Footprint:**
+- Jenkins: ~1 GB RAM (grows with plugin load)
+- Optional agents: variable
+- Total: **~1–2 GB**
+
+**Verdict for Iron Legion:** Wrong tool for the job. Jenkins excels at CI/CD pipelines (build → test → deploy), not at day-to-day Ansible playbook management. The UI is pipeline-centric, not inventory- or template-centric. Use Jenkins for software CI/CD, not fleet automation.
+
+---
+
+## 5. Recommendation
+
+| Use Case | Recommended Tool |
+|----------|---------------|
+| **Primary Ansible playbook runner** | **Semaphore UI** |
+| Complex enterprise RBAC + workflows | AWX (on dedicated VM) |
+| Generic workflow orchestration (not Ansible-specific) | Kestra |
+| Multi-tool ops automation (Ansible + scripts + APIs) | Rundeck |
+| Software CI/CD pipelines | Jenkins |
+
+**Iron Legion Path Forward:**
+1. **Deploy Semaphore UI** on MK7 Swarm or a lightweight LXC on MK33
+2. Create a Project pointing to `Iron-Legion/ansible-playbooks` on Gitea
+3. Configure inventories, task templates, and schedules
+4. Add Gitea webhook to auto-trigger Semaphore tasks on push to `main`
+5. **Optional:** Evaluate AWX later if RBAC/complexity demands grow — deploy on a dedicated management LXC with 4 GB RAM reservation
+
+---
+
+## 6. Open Questions
+
+1. **Should Semaphore run as a standalone Docker Compose stack or as a Swarm service?**
+   - Standalone: simpler, survives Swarm reconfiguration
+   - Swarm: automatic placement, Traefik ingress, less manual maintenance
+
+2. **Where does the Ansible inventory live?**
+   - Option A: In the Gitea repo alongside playbooks (version-controlled)
+   - Option B: Static files on the Semaphore host (faster Semaphore startup)
+   - Option C: Dynamic inventory script pulling from Technitium DNS/PVE API
+
+3. **Gitea webhook reachability:**
+   - Gitea on Neo (`192.168.192.24`) → Semaphore on MK7 or G9 node
+   - Must ensure Semaphore endpoint is reachable from Neo (LAN routing)
+   - Can use Tailscale as fallback
+
+---
+
+*End of PRD — Iron Legion Labs*
diff --git a/PRD Drafts/terraform-proxmox-lxc-automation.md b/PRD Drafts/terraform-proxmox-lxc-automation.md
index 4f91b5f..5412cae 100644
--- a/PRD Drafts/terraform-proxmox-lxc-automation.md
+++ b/PRD Drafts/terraform-proxmox-lxc-automation.md
@@ -63,7 +63,7 @@ terraform {
 }
 
 provider "proxmox" {
-  endpoint = "https://192.168.7.7:8006/"
+  endpoint = "https://192.168.7.33:8006/"
   username = "root@pam"
   password = var.proxmox_password # or PROXMOX_VE_PASSWORD env var
   insecure = true # self-signed TLS
@@ -156,13 +156,13 @@ terraform {
 variable "proxmox_endpoint" {
   description = "PVE API URL"
   type = string
-  default = "https://192.168.7.7:8006/"
+  default = "https://192.168.7.33:8006/"
 }
 
 variable "proxmox_node" {
   description = "Target PVE node name"
   type = string
-  default = "mk7"
+  default = "mk33"
 }
 
 variable "ssh_public_key" {
@@ -332,7 +332,7 @@ output "lxc_passwords" {
 module "dev_lxcs" {
   source = "../../modules/lxc"
 
-  proxxmox_node = "mk7"
+  proxmox_node = "mk33"
   ssh_public_key = file("~/.ssh/id_ed25519.pub")
 
   lxc_configs = {
@@ -400,13 +400,13 @@ Use data sources to query existing infrastructure without managing it:
 
 ```hcl
 data "proxmox_virtual_environment_datastores" "available" {
-  node_name = "mk7"
+  node_name = var.proxmox_node
 }
 
 data "proxmox_virtual_environment_nodes" "cluster" {}
 
 data "proxmox_virtual_environment_container" "existing" {
-  node_name = "mk7"
+  node_name = var.proxmox_node # or specify target node explicitly
   vm_id = 2001
 }
 ```
@@ -422,7 +422,7 @@ data "proxmox_virtual_environment_container" "existing" {
 
 ### Recommended: S3-Compatible Backend
 
-Iron Legion already runs self-hosted services. A Garage or Minio instance on Neo/MK7 can serve as the Terraform state backend:
+Iron Legion already runs self-hosted services. A Garage or Minio instance on a fleet storage node (e.g., Neo) can serve as the Terraform state backend:
 
 ```hcl
 terraform {
@@ -447,6 +447,78 @@ Add a DynamoDB-compatible table or use a native locking mechanism. If S3 backend
 
 ---
 
+## Optional: Atlantis Web UI for Terraform PR Automation
+
+### What Atlantis Is
+
+Atlantis is a self-hosted web application that listens for webhook events from Git repositories and runs `terraform plan` / `terraform apply` automatically inside PR/MR workflows. It posts plan output back to the PR as comments, enforces approval gates, and locks workspaces to prevent concurrent applies.
+
+### Can Atlantis Manage LXC Resources via `bpg/proxmox`?
+
+**Yes.** Atlantis is a Terraform orchestration layer, not a provider. It supports any Terraform provider including `bpg/proxmox`. The workflow is:
+1. Developer opens a PR adding/modifying `.tf` files defining LXC containers
+2. Atlantis receives the webhook and runs `terraform plan` in an isolated directory
+3. Plan output posted as a PR comment — team reviews before approval
+4. After approval (or `atlantis apply` comment), Atlantis runs `terraform apply`
+
+### Atlantis Docker Compose (Self-Hosted)
+
+```yaml
+services:
+  atlantis:
+    image: ghcr.io/runatlantis/atlantis:latest
+    ports:
+      - "4141:4141"
+    volumes:
+      - ${HOME}/.ssh:/home/atlantis/.ssh:ro # Git SSH key
+      - /var/run/docker.sock:/var/run/docker.sock:ro # if using Docker TF provider
+      - atlantis-data:/home/atlantis/.atlantis
+    environment:
+      ATLANTIS_GH_USER: "iron-legion-bot" # or ATLANTIS_GITLAB_USER / ATLANTIS_GITEA_USER
+      ATLANTIS_GH_TOKEN: "${ATLANTIS_GH_TOKEN}" # personal access token
+      ATLANTIS_REPO_ALLOWLIST: "github.com/Iron-Legion/*"
+      ATLANTIS_GH_WEBHOOK_SECRET: "${WEBHOOK_SECRET}"
+      # For Gitea:
+      # ATLANTIS_GITEA_USER: "iron-legion-bot"
+      # ATLANTIS_GITEA_TOKEN: "${GITEA_TOKEN}"
+      # ATLANTIS_GITEA_WEBHOOK_SECRET: "${WEBHOOK_SECRET}"
+    command: server
+    restart: unless-stopped
+
+  # Optional: Redis for distributed locking in multi-replica setups
+  # redis:
+  #   image: redis:8-alpine
+  #   volumes:
+  #     - redis-data:/data
+  #   restart: always
+
+volumes:
+  atlantis-data:
+    driver: local
+```
+
+### Key Features
+
+- **Plan Comments:** Every PR gets an auto-generated `terraform plan` comment
+- **Apply Locking:** One apply at a time per workspace; concurrent PRs queue
+- **Policy Checks:** Integrate OPA (Open Policy Agent) or custom scripts to block non-compliant changes
+- **Custom Workflows:** Define per-repo or per-directory workflows (e.g., plan-only for dev, auto-apply for staging)
+- **Self-Hosted SCM:** Native webhook support for GitHub, GitLab, Bitbucket, **and Gitea**
+
+### Resource Footprint
+
+- Atlantis container: ~100–200 MB RAM, minimal CPU
+- Optional Redis: ~20 MB RAM
+- Total: fits comfortably on any Iron Legion node (MK7, MK33–42, Neo)
+
+### Gitea Integration Notes
+
+- Atlantis supports Gitea via the `--gitea-user`, `--gitea-token`, `--gitea-webhook-secret` flags
+- Must expose Atlantis endpoint to Gitea (Tailscale funnel, reverse proxy, or LAN if Gitea is in-network)
+- Webhook URL: `http://atlantis-host:4141/events`
+
+---
+
 ## 9. Operational Workflow
 
 ### Day 0 — Bootstrap
@@ -509,7 +581,7 @@ terraform apply tfplan
    - Terraform can call a `dns_a_record` module (if Technitium provider exists)
    - Or: use PVE's built-in DHCP + DNSMASQ if configured
 
-3. **CI/CD pipeline: GitHub Actions runner, or local Gitea Actions on Neo?**
+3. **CI/CD pipeline: GitHub Actions runner, or local Gitea Actions on the fleet SCM host?**
    - Gitea Actions keeps secrets in-network
    - GitHub Actions requires Tailscale funnel or external exposure