Ansible Base Testing Environment PRD

Status: Draft | Author: Artemis (AI Foreman) | Date: 2026-06-03


1. Purpose & Scope

A minimal, containerized Ansible environment for playbook development and ad-hoc fleet testing. This is the Iron Legion standard for validating inventories and playbooks before promoting to production.


2. Directory Structure

~/docker/ansible-push/
├── docker-compose.yml    # Ansible runner container definition
├── dockerfile            # Build: Python 3.14 Alpine + Ansible 14
├── run.sh                # One-shot test runner
├── inventory.yml         # Iron Legion fleet inventory (YAML format)
└── keys/
    ├── id_ed25519        # Private key (chmod 600)
    ├── id_ed25519.pub    # Public key (chmod 644)
    └── known_hosts       # Auto-populated by successful connections

3. docker-compose.yml

services:
  ansible:
    build: .
    container_name: ansible
    image: ansible
    environment:
      - ANSIBLE_HOST_KEY_CHECKING=false
      - ANSIBLE_PYTHON_INTERPRETER=/usr/bin/python3.12
    volumes:
      - .:/ansible
      - ./keys:/root/.ssh
    working_dir: /ansible
    entrypoint: ["/bin/sh", "-c"]
    command: ["tail -f /dev/null"]

4. dockerfile

FROM python:3.14.5-alpine3.23
RUN pip install --no-cache-dir ansible==14.0.0 && apk add --no-cache curl openssh-client

5. run.sh

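#!/bin/sh
set -eu                            # abort on the first failing command
cd "$(dirname "$0")"               # run from the project directory
trap 'docker compose down' EXIT    # always stop the container on exit
docker compose up -d
docker exec -it ansible ansible all -m ping -i inventory.yml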
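
The -t flag of docker exec needs a terminal, so the ping step typically fails when run.sh is started from cron or a CI job. A minimal sketch of one way around that, assuming the container name ansible from the compose file; the function names detect_exec_flags and fleet_ping are hypothetical and not part of the original script:

```shell
#!/bin/sh
# Sketch only: detect_exec_flags and fleet_ping are illustrative names.

# Set EXEC_FLAGS to "-it" when both stdin and stdout are terminals,
# otherwise leave it empty so docker exec also works without a TTY.
detect_exec_flags() {
    if [ -t 0 ] && [ -t 1 ]; then
        EXEC_FLAGS="-it"
    else
        EXEC_FLAGS=""
    fi
}

# Ping every host in inventory.yml through the running "ansible" container.
fleet_ping() {
    detect_exec_flags
    # EXEC_FLAGS is intentionally unquoted so an empty value expands to nothing.
    docker exec $EXEC_FLAGS ansible ansible all -m ping -i inventory.yml
}
```

Calling fleet_ping between docker compose up -d and docker compose down would then behave the same in an interactive shell and in a non-interactive job.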

6. Key Management

The keys/ directory is bind-mounted to /root/.ssh inside the container. SSH auto-discovers the standard id_ed25519 key there, so hosts that use key-based login need no explicit ansible_ssh_private_key_file.

  • File: id_ed25519 → Container: /root/.ssh/id_ed25519 → Perms: 600
  • File: id_ed25519.pub → Container: /root/.ssh/id_ed25519.pub → Perms: 644
  • File: known_hosts → Container: /root/.ssh/known_hosts → Auto-populated
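
Because the bind mount carries host-side file modes into the container, and OpenSSH can refuse a private key whose mode is too permissive, it may help to apply the modes listed above before starting the container. A sketch, where fix_key_perms is a hypothetical helper name and the default directory keys is taken from the layout in section 2:

```shell
#!/bin/sh
# Sketch only: fix_key_perms is an illustrative helper, not part of the repo.

# Apply the documented modes to the key pair in the given directory
# (defaults to ./keys). Returns non-zero if either file is missing.
fix_key_perms() {
    keys_dir="${1:-keys}"
    chmod 600 "$keys_dir/id_ed25519" || return 1
    chmod 644 "$keys_dir/id_ed25519.pub" || return 1
}
```

Run from ~/docker/ansible-push, a bare fix_key_perms call would act on keys/.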

7. Working inventory.yml (Validated: 10/10 green)

# Iron Legion Fleet Inventory
# Generated: 2026-06-03
# Source: fleet documentation + live SSH config
#
# Usage with Ansible:
#   ansible all -m ping -i inventory.yml
#   ansible pve_workers -m setup -i inventory.yml
#   ansible core_services -a "docker service ls" -i inventory.yml
#
# FIX: Group-specific variables (e.g. pve_workers:) were previously
# placed outside `all:` scope, breaking inventory parsing.
# All group vars are now merged into the group definitions below.

---

all:
  children:

    # ──────────────────────────────────────────
    # Physical / Virtual Fleet Nodes
    # ──────────────────────────────────────────

    fleet_nodes:
      children:

        # Core fleet services
        core_services:
          hosts:
            mk7:
              ansible_host: 192.168.7.7
              ansible_user: jarvis
              node_role: swarm_manager
              docker_host: true
              description: "Swarm manager + Traefik + service stack host"

        # PVE Worker nodes
        pve_workers:
          vars:
            ansible_user: root
            ansible_ssh_pass: "proxmox12"
            ansible_become: true
            ansible_python_interpreter: /usr/bin/python3
          hosts:
            mk33:
              ansible_host: 192.168.7.33
              node_role: pve_worker
              pve_api_url: "https://192.168.7.33:8006/"
              description: "PVE Silver Centurion"

            mk34:
              ansible_host: 192.168.7.34
              node_role: pve_worker
              pve_api_url: "https://192.168.7.34:8006/"
              description: "PVE Southpaw"

            mk39:
              ansible_host: 192.168.7.39
              node_role: pve_worker
              pve_api_url: "https://192.168.7.39:8006/"
              description: "PVE Gemini"

        # Active physical agents
        physical_agents:
          hosts:
            artemis:
              ansible_host: 192.168.15.182
              ansible_user: jarvis
              node_role: discord_gateway
              hermes_agent: true
              description: "Primary AI orchestrator + Discord gateway"

            mark44:
              ansible_host: 192.168.5.214
              ansible_user: jarvis
              node_role: gpu_host
              gpu: true
              description: "Hulkbuster — GPU/Ollama standby"

            mark5:
              ansible_host: 192.168.6.5
              ansible_user: jarvis
              node_role: tbd
              description: "Mark 5 — being repurposed"

            mk42:
              ansible_host: 192.168.0.196
              ansible_user: jarvis
              node_role: pve_worker
              description: "PVE Extremis"

        # Infrastructure / support nodes
        infrastructure:
          hosts:
            shield:
              ansible_host: 192.168.27.205
              ansible_user: jarvis
              node_role: pxe_server
              description: "iVentoy PXE deployment server"

            igor:
              ansible_host: 192.168.10.211
              ansible_user: jarvis
              node_role: nas
              description: "ZimaOS NAS (MK-38)"

    # Tailscale fallback aliases (uncomment if LAN fails)
    # tailscale_fallback:
    #   hosts:
    #     ts-mk7:
    #       ansible_host: 100.66.70.51
    #       ansible_user: jarvis
    #     ts-mk33:
    #       ansible_host: 100.125.155.41
    #       ansible_user: jarvis
    #     ts-mk34:
    #       ansible_host: 100.94.190.43
    #       ansible_user: jarvis
    #     ts-nebuchadnezzar:
    #       ansible_host: 100.99.123.16
    #       ansible_user: jarvis

    # Docker host targeting groups (uncomment when needed)
    # docker_hosts:
    #   children:
    #     swarm_manager:
    #       hosts:
    #         mk7:
    #     standalone_docker:
    #       hosts:
    #         nebuchadnezzar:

8. Notes on Inventory Design

  • YAML format: this inventory nests every group under all: children:. Orphaned keys such as pve_workers: placed outside the all: scope previously caused "invalid characters in hostnames" errors.
  • Group-level auth: PVE workers use vars: under their group for ansible_user, ansible_ssh_pass, ansible_become, and ansible_python_interpreter — keeps host entries DRY.
  • SSH key auto-discovery: No explicit ansible_ssh_private_key_file needed when the key is named id_ed25519 and mounted to /root/.ssh inside the container.
  • Host key checking: ANSIBLE_HOST_KEY_CHECKING=false in the compose file disables SSH host key verification, so first-contact connections succeed without an interactive prompt.
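
The group nesting described above is most reliably confirmed with ansible-inventory -i inventory.yml --graph, which prints the parsed group tree. Where Ansible is not at hand, a rough text-level sketch can at least count the active host entries. The helper name count_inventory_hosts is hypothetical, and it matches lines rather than parsing YAML, so it only suits inventories laid out like the one in section 7 (one ansible_host: line per host):

```shell
#!/bin/sh
# Sketch only: count_inventory_hosts is an illustrative helper.

# Print the number of uncommented "ansible_host:" lines in the given file.
count_inventory_hosts() {
    # The pattern is anchored to leading whitespace, so commented-out
    # entries (which start with "#") are not counted.
    grep -c '^[[:space:]]*ansible_host:' "$1"
}
```

Against the inventory in section 7 this should report 10, matching the host count in the validation log.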

9. Testing Playbooks

cd ~/docker/ansible-push
docker compose up -d
docker exec -it ansible ansible-playbook -i inventory.yml playbook.yml
docker compose down
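
A syntax check before the real run catches YAML and task-structure errors without touching the fleet. A minimal sketch, assuming the container from the compose file is already up; run_playbook is a hypothetical wrapper name, while --syntax-check is a standard ansible-playbook flag:

```shell
#!/bin/sh
# Sketch only: run_playbook is an illustrative wrapper.

# Syntax-check the playbook inside the "ansible" container, then run it.
# Defaults to playbook.yml; returns non-zero if the syntax check fails.
run_playbook() {
    playbook="${1:-playbook.yml}"
    # Stop early if the playbook does not parse.
    docker exec ansible ansible-playbook -i inventory.yml --syntax-check "$playbook" || return 1
    docker exec ansible ansible-playbook -i inventory.yml "$playbook"
}
```

The wrapper omits -it so it also works from non-interactive jobs; add the flags back if coloured terminal output matters.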

10. Validation Log

| Date | Hosts Tested | Result |
|------|--------------|--------|
| 2026-06-03 | 10/10 (all groups) | Green |
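
The Hosts Tested figure can be derived from the ad-hoc ping output rather than counted by hand. A sketch, assuming Ansible's default ad-hoc output, in which each host result starts with a line of the form "<host> | SUCCESS => {" or "<host> | UNREACHABLE! => {"; summarize_ping is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: summarize_ping is an illustrative helper.

# Read ad-hoc ping output on stdin and print "<successful>/<total>" hosts.
summarize_ping() {
    awk -F' [|] ' '
        # A host result line looks like: "name | STATUS => {"
        / [|] (SUCCESS|CHANGED|FAILED!|UNREACHABLE!) / {
            total++
            if ($2 ~ /^SUCCESS/) ok++
        }
        END { printf "%d/%d\n", ok, total }
    '
}
```

For example, piping docker exec ansible ansible all -m ping -i inventory.yml into summarize_ping would print a figure in the same ok/total form used in the table.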