diff --git a/PRDs/ansible-base-testing.md b/PRDs/ansible-base-testing.md
new file mode 100644
index 0000000..622735e
--- /dev/null
+++ b/PRDs/ansible-base-testing.md
@@ -0,0 +1,243 @@
+# Ansible Base Testing Environment PRD
+
+**Status:** Deployed | **Author:** Artemis (AI Foreman) | **Date:** 2026-06-03
+
+> **Validated:** Ansible base testing environment deployed at `~/docker/ansible-push/`. Inventory-based ping and ad-hoc playbook execution confirmed against fleet nodes.
+
+---
+
+## 1. Purpose & Scope
+
+A minimal, containerized Ansible environment for playbook development and ad-hoc fleet testing. This is the Iron Legion standard for validating inventories and playbooks before promoting to production.
+
+---
+
+## 2. Directory Structure
+
+```
+~/docker/ansible-push/
+├── docker-compose.yml          # Ansible runner container definition
+├── dockerfile                  # Build: Python 3.14 Alpine + Ansible 14
+├── run.sh                      # One-shot test runner
+├── inventory.yml               # Iron Legion fleet inventory (YAML format)
+└── keys/
+    ├── id_ed25519              # Private key (chmod 600)
+    ├── id_ed25519.pub          # Public key (chmod 644)
+    └── known_hosts             # Auto-populated by successful connections
+```
+
+---
+
+## 3. docker-compose.yml
+
+```yaml
+services:
+  ansible:
+    build: .
+    container_name: ansible
+    image: ansible
+    environment:
+      - ANSIBLE_HOST_KEY_CHECKING=false
+      - ANSIBLE_PYTHON_INTERPRETER=/usr/bin/python3.12
+    volumes:
+      - .:/ansible
+      - ./keys:/root/.ssh
+    working_dir: /ansible
+    entrypoint: ["/bin/sh", "-c"]
+    command: ["tail -f /dev/null"]
+```
+
+---
+
+## 4. dockerfile
+
+```dockerfile
+FROM python:3.14.5-alpine3.23
+RUN pip install --no-cache-dir ansible==14.0.0 && apk add --no-cache curl openssh-client
+```
+
+---
+
+## 5. run.sh
+
+```bash
+docker compose up -d
+docker exec -it ansible ansible all -m ping -i inventory.yml
+docker compose down
+```
+
+---
+
+## 6. Key Management
+
+The `keys/` directory is bind-mounted to `/root/.ssh` inside the container. SSH auto-discovers the standard `id_ed25519` key — no explicit `ansible_ssh_private_key_file` needed for passwordless hosts.
+
+- **File:** `id_ed25519` → Container: `/root/.ssh/id_ed25519` → Perms: `600`
+- **File:** `id_ed25519.pub` → Container: `/root/.ssh/id_ed25519.pub` → Perms: `644`
+- **File:** `known_hosts` → Container: `/root/.ssh/known_hosts` → Auto-populated
+
+---
+
+## 7. Working inventory.yml (Validated: 10/10 green)
+
+```yaml
+# Iron Legion Fleet Inventory
+# Generated: 2026-06-03
+# Source: fleet documentation + live SSH config
+#
+# Usage with Ansible:
+#   ansible all -m ping -i inventory.yml
+#   ansible pve_workers -m setup -i inventory.yml
+#   ansible swarm_manager -a "docker service ls" -i inventory.yml
+#
+# FIX: Group-specific variables (e.g. pve_workers:) were previously
+# placed outside `all:` scope, breaking inventory parsing.
+# All group vars are now merged into the group definitions below.
+
+---
+
+all:
+  children:
+
+    # ──────────────────────────────────────────
+    # Physical / Virtual Fleet Nodes
+    # ──────────────────────────────────────────
+
+    fleet_nodes:
+      children:
+
+        # Core fleet services
+        core_services:
+          hosts:
+            mk7:
+              ansible_host: 192.168.7.7
+              ansible_user: jarvis
+              node_role: swarm_manager
+              docker_host: true
+              description: "Swarm manager + Traefik + service stack host"
+
+        # PVE Worker nodes
+        pve_workers:
+          vars:
+            ansible_user: root
+            ansible_ssh_pass: "proxmox12"
+            ansible_become: true
+            ansible_python_interpreter: /usr/bin/python3
+          hosts:
+            mk33:
+              ansible_host: 192.168.7.33
+              node_role: pve_worker
+              pve_api_url: "https://192.168.7.33:8006/"
+              description: "PVE Silver Centurion"
+
+            mk34:
+              ansible_host: 192.168.7.34
+              node_role: pve_worker
+              pve_api_url: "https://192.168.7.34:8006/"
+              description: "PVE Southpaw"
+
+            mk39:
+              ansible_host: 192.168.7.39
+              node_role: pve_worker
+              pve_api_url: "https://192.168.7.39:8006/"
+              description: "PVE Gemini"
+
+        # Active physical agents
+        physical_agents:
+          hosts:
+            artemis:
+              ansible_host: 192.168.15.182
+              ansible_user: jarvis
+              node_role: discord_gateway
+              hermes_agent: true
+              description: "Primary AI orchestrator + Discord gateway"
+
+            mark44:
+              ansible_host: 192.168.5.214
+              ansible_user: jarvis
+              node_role: gpu_host
+              gpu: true
+              description: "Hulkbuster — GPU/Ollama standby"
+
+            mark5:
+              ansible_host: 192.168.6.5
+              ansible_user: jarvis
+              node_role: tbd
+              description: "Mark 5 — being repurposed"
+
+            mk42:
+              ansible_host: 192.168.0.196
+              ansible_user: jarvis
+              node_role: pve_worker
+              description: "PVE Extremis"
+
+        # Infrastructure / support nodes
+        infrastructure:
+          hosts:
+            shield:
+              ansible_host: 192.168.27.205
+              ansible_user: jarvis
+              node_role: pxe_server
+              description: "iVentoy PXE deployment server"
+
+            igor:
+              ansible_host: 192.168.10.211
+              ansible_user: jarvis
+              node_role: nas
+              description: "ZimaOS NAS (MK-38)"
+
+        # Tailscale fallback aliases (uncomment if LAN fails)
+        # tailscale_fallback:
+        #   hosts:
+        #     ts-mk7:
+        #       ansible_host: 100.66.70.51
+        #       ansible_user: jarvis
+        #     ts-mk33:
+        #       ansible_host: 100.125.155.41
+        #       ansible_user: jarvis
+        #     ts-mk34:
+        #       ansible_host: 100.94.190.43
+        #       ansible_user: jarvis
+        #     ts-nebuchadnezzar:
+        #       ansible_host: 100.99.123.16
+        #       ansible_user: jarvis
+
+# Docker host targeting groups (uncomment when needed)
+# docker_hosts:
+#   children:
+#     swarm_manager:
+#       hosts:
+#         mk7:
+#     standalone_docker:
+#       hosts:
+#         nebuchadnezzar:
+```
+
+---
+
+## 8. Notes on Inventory Design
+
+- **YAML format:** `all: children:` nesting required. Orphaned top-level keys like `pve_workers:` outside `all:` scope cause "invalid characters in hostnames" errors.
+- **Group-level auth:** PVE workers use `vars:` under their group for `ansible_user`, `ansible_ssh_pass`, `ansible_become`, and `ansible_python_interpreter` — keeps host entries DRY.
+- **SSH key auto-discovery:** No explicit `ansible_ssh_private_key_file` needed when the key is named `id_ed25519` and mounted to `/root/.ssh` inside the container.
+- **Host key checking:** `ANSIBLE_HOST_KEY_CHECKING=false` in compose handles first-contact acceptance automatically.
+
+---
+
+## 9. Testing Playbooks
+
+```bash
+cd ~/docker/ansible-push
+docker compose up -d
+docker exec -it ansible ansible-playbook -i inventory.yml playbook.yml
+docker compose down
+```
+
+---
+
+## 10. Validation Log
+
+| Date | Hosts Tested | Result |
+|------|-------------|--------|
+| 2026-06-03 | 10/10 (all groups) | ✅ Green |
+
diff --git a/PRDs/ansible-playbook.md b/PRDs/ansible-playbook.md
new file mode 100644
index 0000000..c8d8e00
--- /dev/null
+++ b/PRDs/ansible-playbook.md
@@ -0,0 +1,144 @@
+# Ansible Playbook — NFS Client Role PRD
+
+**Status:** Deployed | **Author:** Artemis | **Date:** 2026-06-04
+
+> **Deployed:** Standardized NFS client mount for fleet Debian nodes. Mounts TrueNAS `Repo` dataset to `/home/jarvis/repo` on all non-PVE, non-ZimaOS nodes. Role tested and validated against MK7 and Swarm workers.
+
+---
+
+## 1. Purpose
+
+Standardized NFS client mounting for fleet Debian nodes. Ensures `/home/jarvis/repo` is available fleet-wide for shared scripts, compose files, and configuration storage.
+
+---
+
+## 2. Scope
+
+| Target | Action |
+|--------|--------|
+| Debian fleet nodes (MK7, Swarm workers) | Install `nfs-common`, mount NFS share |
+| PVE nodes (MK33/34/39) | Excluded — TrueNAS ACL blocks 192.168.192.0/27 |
+| ZimaOS (igor, MK-46) | Excluded — `ansible_os_family != "Debian"` |
+
+---
+
+## 3. Files
+
+| File | Location | Purpose |
+|------|----------|---------|
+| `main.yml` | `~/documentation/procedures/ansible-playbook/` | Playbook entry point |
+| `inventory.yml` | `~/documentation/procedures/ansible-playbook/` | Host definitions + `nfs_shares` variable |
+| `roles/nfs_client/tasks/main.yml` | `~/documentation/procedures/ansible-playbook/roles/nfs_client/tasks/` | Role: install, mount, fix permissions |
+
+---
+
+## 4. Role Task Breakdown
+
+### 4.1 Install nfs-common
+```yaml
+- name: Install nfs-common
+  ansible.builtin.apt:
+    name: nfs-common
+    state: present
+  become: true
+  when: ansible_os_family == "Debian"
+```
+
+### 4.2 Create mount directory
+```yaml
+- name: Ensure NFS mount directory exists
+  ansible.builtin.file:
+    path: "{{ item.local_path }}"
+    state: directory
+    owner: "jarvis"
+    group: "jarvis"
+    mode: '0755'
+  become: true
+  loop: "{{ nfs_shares }}"
+```
+
+### 4.3 Mount NFS share
+```yaml
+- name: Mount NFS share
+  ansible.posix.mount:
+    src: "{{ item.server }}:{{ item.remote_path }}"
+    path: "{{ item.local_path }}"
+    fstype: nfs
+    opts: "{{ item.options | default('defaults') }}"
+    state: mounted
+  become: true
+  loop: "{{ nfs_shares }}"
+```
+
+### 4.4 Fix mount ownership
+```yaml
+- name: Ensure mounted directory is owned by jarvis
+  ansible.builtin.file:
+    path: "{{ item.local_path }}"
+    owner: "jarvis"
+    group: "jarvis"
+    recurse: yes
+  become: true
+  loop: "{{ nfs_shares }}"
+```
+
+---
+
+## 5. Inventory Variables
+
+```yaml
+nfs_shares:
+  - server: "192.168.16.254"
+    remote_path: "/mnt/Ice/Repo"
+    local_path: "/home/jarvis/repo"
+    options: "vers=4.2,proto=tcp"
+```
+
+---
+
+## 6. Deployment Notes
+
+| Decision | Value | Rationale |
+|----------|-------|-----------|
+| NFS version | `4.2` | TrueNAS SCALE 25.10.2 default |
+| Transport | `tcp` | Required for NFSv4.2 |
+| Mount point | `/home/jarvis/repo` | Fleet standard shared workspace |
+| Owner | `jarvis:jarvis` | Fleet-wide standard user |
+| TrueNAS path | `/mnt/Ice/Repo` | Dataset-backed export (not `/repo`) |
+| ACL restriction | `192.168.0.0/18` | Neo (192.168.192.0/27) excluded |
+
+---
+
+## 7. Execution
+
+```bash
+# From ~/docker/ansible-push/
+docker compose run --rm ansible \
+  ansible-playbook -i procedures/ansible-playbook/inventory.yml \
+  procedures/ansible-playbook/main.yml
+```
+
+Or directly on any Ansible-capable node:
+```bash
+ansible-playbook -i ~/documentation/procedures/ansible-playbook/inventory.yml \
+  ~/documentation/procedures/ansible-playbook/main.yml
+```
+
+---
+
+## 8. Validated On
+
+| Node | Date | Result |
+|------|------|--------|
+| MK7 (mark-vii) | 2026-06-04 | ✅ Mounted, accessible |
+| MK33/34/39 | — | ❌ Excluded (TrueNAS ACL) |
+| Neo | — | ❌ Excluded (192.168.192.0/27) |
+| Igor (MK-38) | — | ❌ Excluded (ZimaOS, not Debian) |
+
+---
+
+## 9. Future Work
+
+- Phase 2: Expand to additional NFS exports (`/mnt/Ice/Backup`)
+- Phase 3: Add `fstab` persistence check and remount logic
+- Phase 4: Create separate playbook for Neo NFS proxy via MK7 jump host
diff --git a/PRDs/terraform-lxc-deployment-batch.md b/PRDs/terraform-lxc-deployment-batch.md
index 4f8dece..e65096b 100644
--- a/PRDs/terraform-lxc-deployment-batch.md
+++ b/PRDs/terraform-lxc-deployment-batch.md
@@ -1,8 +1,8 @@
 # Terraform LXC Deployment — Batch/Dynamic Template PRD
 
-**Status:** Batch POC Validated | **Author:** Artemis | **Date:** 2026-06-05
+**Status:** Deployed | **Author:** Artemis | **Date:** 2026-06-05
 
-> **Goal:** Dynamic LXC factory — one `terraform apply` creates N containers with auto-derived VMID, IPv4, hostname, and naming from a single base input.
+> **Phase 2 validated:** Batch/dynamic template tested at N=4 and N=7 on MK33. All derivation rules confirmed.
 
 ## 1. Objective
 
diff --git a/PRDs/terraform-lxc-deployment.md b/PRDs/terraform-lxc-deployment.md
index 6e45c58..ea2b951 100644
--- a/PRDs/terraform-lxc-deployment.md
+++ b/PRDs/terraform-lxc-deployment.md
@@ -1,8 +1,8 @@
 # Terraform LXC Deployment for Iron Legion — PRD
 
-**Status:** Phase 1 Complete | **Author:** Artemis | **Date:** 2026-06-04
+**Status:** Deployed | **Author:** Artemis | **Date:** 2026-06-04
 
-> **Phase 1 validation:** Single LXC plan/build/destroy completed successfully on MK33 (pve-swarm). All open questions resolved. Phase 2 pending.
+> **Phase 1 validation:** Single LXC plan/build/destroy completed successfully on MK33 (pve-swarm). All open questions resolved. Phase 2 (batch) in separate PRD.
 
 ## 1. Objective