diff --git a/03-constraints.md b/03-constraints.md
index 00870cb..81a3716 100644
--- a/03-constraints.md
+++ b/03-constraints.md
@@ -12,7 +12,7 @@
 
 | Node | Role | Services Assigned |
 |------|------|-------------------|
-| **MK7 (mark-vii.ai.home)** | Swarm Manager | ALL Phase 1 infrastructure: Traefik, Technitium DNS, AdGuard Home, Portainer, Prometheus, Beszel, Dozzle, Authelia, Homepage |
+| **MK7 (mark-vii.ai.home)** | Swarm Manager | ALL Phase 1 infrastructure: Traefik, Technitium DNS, Portainer, Prometheus, Beszel, Dozzle, Authelia, Homepage |
 | **MK33, MK34, MK39, MK42** | Swarm Workers | Phase 2 media stack (Jellyfin, Sonarr, Radarr, Prowlarr), distributed workloads, Vaultwarden, Nextcloud |
 | **Artemis** | AI Foreman / JARVIS | Hermes Agent, Ansible-pull control plane — NOT a service host |
 
diff --git a/04-service-catalog.md b/04-service-catalog.md
index 53e35d0..cc6acb3 100644
--- a/04-service-catalog.md
+++ b/04-service-catalog.md
@@ -21,8 +21,8 @@
 | Service | Image | Pulls | Stars | Updated | Placement | Notes |
 |---------|-------|-------|-------|---------|-----------|-------|
 | **Traefik** | `traefik` | 3.49B | 3,634 | 2026-05-13 | **Global** | Every node receives ingress routing + Docker socket read-only |
-| **Technitium DNS** | `technitium/dns-server` | 8.99M | 156 | 2026-05-09 | **Manager Constraint** | Single authoritative DNS — port 53 on MK7 only |
-| **AdGuard Home** | `adguard/adguardhome` | 170.7M | 1,408 | 2026-05-25 | **Replicated (1)** | Single replica on MK7 — port 3000 |
+| **Technitium DNS** | `technitium/dns-server` | 8.99M | 156 | 2026-05-09 | **Manager Constraint** | Authoritative `.ai.home` + recursive with DoT to Cloudflare, ad blocking — port 53 on MK7 only |
+| **~~AdGuard Home~~** | ~~`adguard/adguardhome`~~ | ~~170.7M~~ | ~~1,408~~ | ~~2026-05-25~~ | ~~**Removed**~~ | ~~Technitium built-in ad blocking replaces AdGuard~~ |
 
 ### Monitoring / Observability
 | Service | Image | Pulls | Stars | Updated | Placement | Notes |
diff --git a/05-network-architecture.md b/05-network-architecture.md
index 2bedcb1..0facb77 100644
--- a/05-network-architecture.md
+++ b/05-network-architecture.md
@@ -27,7 +27,8 @@
 |-----------|--------|--------|
 | **Technitium (MK7)** | ✅ Deployed | Container running, port 53/5380 open |
 | **`*.ai.home` zone** | ⏳ Pending | Not yet configured as authoritative — Tailscale MagicDNS currently handles name resolution |
-| **AdGuard Home (MK7)** | ✅ Active | Recursive resolver + blocklists on port 3000. Replaces Pi-hole. |
+| **Technitium DNS (MK7)** | ✅ Active | Authoritative `.ai.home` + recursive resolver + ad blocking on port 53. |
+| **~~AdGuard Home~~** | ~~Removed~~ | ~~Technitium built-in ad blocking replaces AdGuard~~ |
 
 **Planned Chain (not yet active):**
 ```
diff --git a/06-data-and-persistence.md b/06-data-and-persistence.md
index 8775f01..3339c94 100644
--- a/06-data-and-persistence.md
+++ b/06-data-and-persistence.md
@@ -17,7 +17,7 @@ Every service with persistent state uses **bind mounts to on-node directories**.
 |---------|-----------|---------------|---------------|
 | **Traefik** | `/opt/iron-legion/traefik/config/` `/opt/iron-legion/traefik/certs/` | MK7 (daily rsync) | < 50 MB |
 | **Technitium DNS** | `/opt/iron-legion/technitium/config/` | MK7 | < 10 MB |
-| **AdGuard Home** | `/opt/iron-legion/adguard/work/` `/opt/iron-legion/adguard/conf/` | MK7 | < 500 MB |
+| **~~AdGuard Home~~** | ~~`/opt/iron-legion/adguard/work/`~~ ~~`/opt/iron-legion/adguard/conf/`~~ | ~~Removed~~ | ~~N/A~~ |
 | **Prometheus** | `/opt/iron-legion/prometheus/data/` | MK7 (retention: 15d local, 90d backup) | 5–20 GB |
 | **Grafana** | `/opt/iron-legion/grafana/data/` | MK7 | < 500 MB |
 | **Beszel** | `/opt/iron-legion/beszel/data/` | MK7 | < 1 GB |
diff --git a/08-deployment-phases.md b/08-deployment-phases.md
index 26bbd8c..41efb71 100644
--- a/08-deployment-phases.md
+++ b/08-deployment-phases.md
@@ -6,7 +6,8 @@
 | Order | Service | Target Node | Why First | Dependencies |
 |-------|---------|-------------|-----------|--------------|
 | 1 | **Technitium DNS** | MK7 | Name resolution for internal services | None |
-| 2 | **AdGuard Home** | MK7 | Recursive DNS + ad-block | Technitium (via conditional forwarding) |
+| 2 | **Technitium DNS** | MK7 | Authoritative + recursive + ad-block | N/A — single service |
+| ~~AdGuard Home~~ | ~~Removed~~ | ~~Technitium replaces AdGuard~~ |
 | 3 | **Traefik** | MK7 | Edge router for all HTTP ingress | DNS (needs `*.labs.internal` to resolve) |
 | 4 | **Authelia** | MK7 | Auth layer before exposing any mgmt UI | Traefik (depends on ForwardAuth middleware) |
 | 5 | **Portainer** | MK7 | Container management UI | Traefik + Authelia (for secured access) |
diff --git a/09-open-questions.md b/09-open-questions.md
index 64f7ff1..8cb563e 100644
--- a/09-open-questions.md
+++ b/09-open-questions.md
@@ -4,8 +4,8 @@
 | # | Question | Impact | Default if Unresolved |
 |---|----------|--------|----------------------|
 | 1 | **Domain name** — Does Bobby own a domain (e.g., `bobbysh.me`) or do we use a fake TLD (`labs.internal`)? | **Critical** — TLS certs, Authelia, and DNS all depend on this. | Use `labs.internal` + self-signed CA |
-| 2 | **Technitium upstream** — DoH, DoT, or plain UDP to upstream resolver (e.g., Cloudflare 1.1.1.1)? | Low — can default to DoH | DoH → `https://cloudflare-dns.com/dns-query` |
-| 3 | **AdGuard Home vs Technitium layout** — AdGuard runs on port 3000, Technitium on 53. No collision, but conditional forwarding from Technitium to AdGuard needs config. | Low — both run independently | Technitium uses upstream AdGuard for recursive queries |
+| 2 | **~~Technitium upstream~~** | ~~Low~~ | ~~Resolved. DoT to Cloudflare `tls://1.1.1.1`~~ |
+| 3 | **~~AdGuard Home vs Technitium layout~~** | ~~Low~~ | ~~**Resolved.** AdGuard removed. Technitium handles authoritative + recursive + ad blocking independently~~ |
 | 4 | **Jellyfin media storage** — External USB on MK7? SMB share? NVMe? | Medium | External USB mounted at `/media` on MK7 |
 | 5 | **Backup target on MK7** — Capacity? Dedicated drive? Rsync target path? | Medium | `/backups//` on MK7 secondary storage |
 | 6 | **Nextcloud database** — Use existing PostgreSQL on MK7, or deploy Nextcloud AIO (bundled)? | Medium — affects resource allocation on MK7 | Deploy standalone PostgreSQL container on MK7 for Nextcloud AIO is too heavy |
@@ -15,6 +15,7 @@
 | 10 | **Beszel alert thresholds** — CPU %, memory %, disk % triggers not defined. | Low | Defaults in Beszel container |
 
 ## Outstanding Decisions Required
 1. ~~Pi-hole inclusion~~ — **Resolved.** AdGuard Home replaces Pi-hole in Phase 1.
+   - ~~AdGuard Home~~ — **Resolved.** Removed. Technitium built-in ad blocking replaces it.
 2. **Authelia two-factor method** — TOTP via app (Google Authenticator) vs WebAuthn/FIDO2 keys?
 3. **Home vs remote access** — If Bobby wants to share Jellyfin with friends/family outside Tailscale, public domain + Authelia guard is required.
diff --git a/PRDs/fleet-infrastructure-recovery.md b/PRDs/fleet-infrastructure-recovery.md
index 0b8c8c9..238cf35 100644
--- a/PRDs/fleet-infrastructure-recovery.md
+++ b/PRDs/fleet-infrastructure-recovery.md
@@ -16,7 +16,7 @@ Six infrastructure issues are blocking fleet observability, container management
 |---|-----------|------------|
 | 1 | Portainer | Bobby can log in, see all stacks/containers |
 | 2 | Technitium | API responds on port 5380, DNS records queryable |
-| 3 | AdGuard | Container stopped, Homepage shows no AdGuard tile |
+| 3 | ~~AdGuard~~ | ~~Container stopped, Homepage shows no AdGuard tile~~ Removed; Technitium handles ad blocking |
 | 4 | Traefik TLS | HTTPS works on `*.ai.home` with valid cert |
 | 5 | Beszel | Every node + every container monitored in dashboard |
 | 6 | Prometheus | 0 targets down, alert pipeline active |
diff --git a/dns-topology.md b/dns-topology.md
new file mode 100644
index 0000000..22eb92b
--- /dev/null
+++ b/dns-topology.md
@@ -0,0 +1,152 @@
+# DNS Topology — Iron Legion Homelab
+
+**Last updated:** 2026-05-30
+**Canonical source:** `Iron-Legion/documentation/dns-topology.md`
+
+---
+
+## Overview
+
+All DNS resolution for the fleet is handled by **Technitium DNS Server** on MK7. AdGuard Home has been removed — Technitium's built-in ad blocking (blocklist-based) replaces it entirely.
+
+**Single source of truth:** Technitium is both authoritative for the fleet's private zone and recursive for the public internet.
+
+---
+
+## DNS Architecture
+
+```
+Client Devices ──→ Router (primary, Cloudflare upstream)
+ │
+ └── Windows 11: secondary → MK7:53 (Technitium)
+
+MK7 (Technitium DNS, port 53):
+ ├── Authoritative zone: *.ai.home
+ │ └── artemis.ai.home, mk7.ai.home, mk44.ai.home, mk5.ai.home, mk33.ai.home, ...
+ ├── Recursive resolver (root servers for public domains)
+ │ └── OR Cloudflare DoT forwarder: tls://1.1.1.1 (configurable)
+ └── Ad blocking: blocklist loaded (StevenBlack / OISD / hBlock — user-configured)
+```
+
+---
+
+## Service Details
+
+| Attribute | Value |
+|-----------|-------|
+| **Service** | Technitium DNS Server |
+| **Image** | `technitium/dns-server:latest` |
+| **Host** | MK7 (`192.168.7.7`, `100.66.70.51` Tailscale) |
+| **Published ports** | `53/tcp`, `53/udp` (DNS), `5380/tcp` (Web UI) |
+| **Traefik host** | `dns.ai.home` |
+| **Compose** | `/opt/iron-legion/docker-swarm/technitium/compose.yml` |
+| **Data volume** | `technitium-config` (Docker volume) |
+
+---
+
+## Upstream / Forwarder Config
+
+| Setting | Value | Notes |
+|---------|-------|-------|
+| **Forwarder protocol** | DNS over TLS (DoT) | Encrypted queries to Cloudflare |
+| **Forwarder address** | `tls://1.1.1.1` | Primary |
+| **Fallback** | `tls://1.0.0.1` | Secondary (if configured) |
+| **Root-server fallback** | Implicit | Technitium falls back to recursive resolution if forwarder fails |
+
+**Web UI:** `http://dns.ai.home:5380` or `http://192.168.7.7:5380`
+- Settings → DNS Server → Forwarders → Add `tls://1.1.1.1`
+
+---
+
+## Ad Blocking
+
+Technitium uses a **DNS blocklist** to drop ad/tracker/malware domains at resolution time.
+
+| Setting | Value |
+|---------|-------|
+| **Blocklist source** | User-configured (e.g., StevenBlack, OISD, hBlock) |
+| **Update interval** | User-configured (recommend: daily) |
+| **Whitelist** | `.ai.home` internal zone never blocked |
+| **Previous solution** | ~~AdGuard Home~~ — removed |
+
+**Blocklist config:** Web UI → Settings → Blocking → Blocklists
+
+---
+
+## Zone: `ai.home`
+
+Technitium is **authoritative** for `.ai.home`. Records are maintained via the web UI or API.
+
+| Record Type | Examples |
+|-------------|----------|
+| **A** | `artemis.ai.home → 192.168.15.182` |
+| **A** | `mk7.ai.home → 192.168.7.7` |
+| **A** | `mk44.ai.home → 192.168.x.x` |
+| **CNAME** | `dns.ai.home → mk7.ai.home` |
+
+**Zone file location:** `/etc/dns/config/zones/ai.home` (inside container)
+
+---
+
+## Client DNS Assignment
+
+| Client | Primary DNS | Secondary DNS | Notes |
+|--------|-------------|---------------|-------|
+| **Router** | Cloudflare (1.1.1.1) | — | Default for all LAN devices |
+| **Windows 11** | Router | MK7:53 (Technitium) | Ad blocking only on secondary |
+| **Tailscale devices** | 100.100.100.100 (MagicDNS) | — | Split-brain: `.ai.home` → 192.168.7.7 |
+
+**Fleet nodes** (MK33, MK34, MK39, MK42) resolve `.ai.home` against MK7:53 via their LAN gateway or static DNS assignment.
+
+---
+
+## Tailscale Integration
+
+Tailscale's **MagicDNS** and **split-brain DNS** handle `*.ai.home` for devices connected to the tailnet.
+
+| Setting | Value |
+|---------|-------|
+| **Split DNS domain** | `ai.home` |
+| **Nameserver** | `192.168.7.7` (MK7 LAN IP) |
+| **Override local DNS** | Yes |
+
+This means: a laptop on Tailscale resolving `artemis.ai.home` hits Tailscale's DNS, which forwards `ai.home` queries to `192.168.7.7` (Technitium). The laptop does NOT need to point its system DNS at MK7.
+
+**Off-Tailscale:** Devices must point DNS at MK7:53 directly to resolve `.ai.home`.
+
+---
+
+## Migration History
+
+| Date | Change |
+|------|--------|
+| 2026-05-25 | AdGuard Home deployed on port 3000/5373 |
+| 2026-05-28 | AdGuard paused (port conflict / redundancy concerns) |
+| 2026-05-30 | **AdGuard removed.** Technitium blocklist configured. DoT to Cloudflare enabled. |
+
+---
+
+## Troubleshooting
+
+| Symptom | Cause | Fix |
+|---------|-------|-----|
+| Can't resolve `.ai.home` | Device not using Technitium | Point DNS at MK7:53 or join Tailscale |
+| Ads not blocked | Blocklist not loaded / outdated | Refresh blocklist in Technitium UI |
+| Slow resolution | DoT forwarder failing | Check `tls://1.1.1.1` reachability; fall back to root recursion |
+| Tailscale IPs unreachable | Device not on Tailscale | Connect to tailnet; 100.x IPs are VPN-only |
+
+---
+
+## Operational Commands
+
+```bash
+# Test resolution from any node
+dig @192.168.7.7 artemis.ai.home +short
+dig @192.168.7.7 google.com +short
+
+# Check Technitium container logs (single quotes so $(...) expands on MK7, not locally)
+ssh jarvis@mk7.ai.home 'docker logs "$(docker ps -q -f name=technitium)"'
+
+# Access web UI
+open http://dns.ai.home:5380
+```
diff --git a/homelab-services-stack-prd.md b/homelab-services-stack-prd.md
index 76bac33..5256758 100644
--- a/homelab-services-stack-prd.md
+++ b/homelab-services-stack-prd.md
@@ -76,7 +76,7 @@ This PRD is append-only for new services. Modifications to existing entries requ
 
 | Node | Role | Services Assigned |
 |------|------|-------------------|
-| **MK7 (mark-vii.ai.home)** | Swarm Manager | ALL Phase 1 infrastructure: Traefik, Technitium DNS, AdGuard Home, Portainer, Prometheus, Beszel, Dozzle, Authelia, Homepage |
+| **MK7 (mark-vii.ai.home)** | Swarm Manager | ALL Phase 1 infrastructure: Traefik, Technitium DNS, Portainer, Prometheus, Beszel, Dozzle, Authelia, Homepage |
 | **MK33, MK34, MK39, MK42** | Swarm Workers | Phase 2 media stack (Jellyfin, Sonarr, Radarr, Prowlarr), distributed workloads, Vaultwarden, Nextcloud |
 | **Artemis** | AI Foreman / JARVIS | Hermes Agent, Ansible-pull control plane — NOT a service host |
 
@@ -116,8 +116,8 @@ This PRD is append-only for new services. Modifications to existing entries requ
 | Service | Image | Pulls | Stars | Updated | Placement | Notes |
 |---------|-------|-------|-------|---------|-----------|-------|
 | **Traefik** | `traefik` | 3.49B | 3,634 | 2026-05-13 | **Global** | Every node receives ingress routing + Docker socket read-only |
-| **Technitium DNS** | `technitium/dns-server` | 8.99M | 156 | 2026-05-09 | **Manager Constraint** | Single authoritative DNS — port 53 on MK7 only |
-| **AdGuard Home** | `adguard/adguardhome` | 170.7M | 1,408 | 2026-05-25 | **Replicated (1)** | Single replica on MK7 — port 3000 |
+| **Technitium DNS** | `technitium/dns-server` | 8.99M | 156 | 2026-05-09 | **Manager Constraint** | Authoritative `.ai.home` + recursive DNS with DoT forwarder to Cloudflare, ad blocking enabled — port 53 on MK7 only |
+| **~~AdGuard Home~~** | ~~`adguard/adguardhome`~~ | ~~170.7M~~ | ~~1,408~~ | ~~2026-05-25~~ | ~~**Removed**~~ | ~~Replaced by Technitium built-in ad blocking~~ |
 
 ### Monitoring / Observability
 | Service | Image | Pulls | Stars | Updated | Placement | Notes |
@@ -192,21 +192,22 @@ This PRD is append-only for new services. Modifications to existing entries requ
 |-----------|--------|--------|
 | **Technitium (MK7)** | ✅ Deployed | Container running, port 53/5380 open |
 | **`*.ai.home` zone** | ⏳ Pending | Not yet configured as authoritative — Tailscale MagicDNS currently handles name resolution |
-| **AdGuard Home (MK7)** | ✅ Active | Recursive resolver + blocklists on port 3000. Replaces Pi-hole. |
+| **Technitium DNS (MK7)** | ✅ Active | Authoritative `.ai.home` + recursive resolver + ad blocking on port 53. |
+| **~~AdGuard Home~~** | ~~Removed~~ | ~~Replaced by Technitium built-in ad blocking~~ |
 
 **Planned Chain (not yet active):**
 ```
-Client → Technitium (local record?) → AdGuard Home (recursive + blocklist) → Upstream (Cloudflare/Quad9)
+Client → Technitium (authoritative `.ai.home`? → return local record) → Technitium (recursive resolver + blocklist) → Cloudflare DoT / Root Servers
 ```
 
 **Current Fallback:** Tailscale MagicDNS provides `*.ai.home` resolution via Tailscale IP addresses. Technitium will assume authority once zone records are populated.
 
-- **AdGuard Home admin UI** runs on port 3000.
+- **Technitium DNS admin UI** runs on port 5380.
 
 ## Port Allocation (Reserved)
 | Port | Service |
 |------|---------|
-| 53 | DNS (Technitium / AdGuard) |
+| 53 | DNS (Technitium) |
 | 80/443 | HTTP/S (Traefik) |
 | 3000 | Grafana |
 | 9090 | Prometheus |
@@ -242,7 +243,7 @@ Every service with persistent state uses **bind mounts to on-node directories**.
 |---------|-----------|---------------|---------------|
 | **Traefik** | `/opt/iron-legion/traefik/config/` `/opt/iron-legion/traefik/certs/` | MK7 (daily rsync) | < 50 MB |
 | **Technitium DNS** | `/opt/iron-legion/technitium/config/` | MK7 | < 10 MB |
-| **AdGuard Home** | `/opt/iron-legion/adguard/work/` `/opt/iron-legion/adguard/conf/` | MK7 | \u003c 500 MB |
+| **~~AdGuard Home~~** | ~~`/opt/iron-legion/adguard/work/`~~ ~~`/opt/iron-legion/adguard/conf/`~~ | ~~Removed~~ | ~~N/A~~ |
 | **Prometheus** | `/opt/iron-legion/prometheus/data/` | MK7 (retention: 15d local, 90d backup) | 5–20 GB |
 | **Grafana** | `/opt/iron-legion/grafana/data/` | MK7 | < 500 MB |
 | **Beszel** | `/opt/iron-legion/beszel/data/` | MK7 | < 1 GB |
@@ -331,7 +332,8 @@ traefik.http.middlewares.authelia.forwardauth.address: http://authelia:9091/api/
 | Order | Service | Target Node | Why First | Dependencies |
 |-------|---------|-------------|-----------|--------------|
 | 1 | **Technitium DNS** | MK7 | Name resolution for internal services | None |
-| 2 | **AdGuard Home** | MK7 | Recursive DNS + ad-block | Technitium (via conditional forwarding) |
+| 2 | **Technitium DNS** | MK7 | Authoritative + recursive + ad-block | N/A — single service |
+| ~~AdGuard Home~~ | ~~Removed~~ | ~~—~~ | ~~Technitium replaces AdGuard~~ |
 | 3 | **Traefik** | MK7 | Edge router for all HTTP ingress | DNS (needs `*.labs.internal` to resolve) |
 | 4 | **Authelia** | MK7 | Auth layer before exposing any mgmt UI | Traefik (depends on ForwardAuth middleware) |
 | 5 | **Portainer** | MK7 | Container management UI | Traefik + Authelia (for secured access) |
@@ -395,7 +397,7 @@ traefik.http.middlewares.authelia.forwardauth.address: http://authelia:9091/api/
 | 10 | **Beszel alert thresholds** — CPU %, memory %, disk % triggers not defined. | Low | Defaults in Beszel container |
 
 ## Outstanding Decisions Required
-1. ~~Pi-hole inclusion~~ — **Resolved.** AdGuard Home replaces Pi-hole in Phase 1. Removed from catalog.
+1. ~~Pi-hole inclusion~~ — **Resolved.** Technitium built-in ad blocking replaces Pi-hole.
 2. **Authelia two-factor method** — TOTP via app (Google Authenticator) vs WebAuthn/FIDO2 keys?
 3. **Home vs remote access** — If Bobby wants to share Jellyfin with friends/family outside Tailscale, public domain + Authelia guard is required.
 
diff --git a/prd-status-tracker.md b/prd-status-tracker.md
index 6434338..e499784 100644
--- a/prd-status-tracker.md
+++ b/prd-status-tracker.md
@@ -6,7 +6,7 @@
 |-------|--------|--------|-------|
 | Chunk 1 — Purpose, Scope, Success Criteria | ✅ Complete | `73e42cc` | Merged into `homelab-services-stack-prd.md` |
 | Chunk 2 — Constraints, Service Catalog, Network Architecture | ✅ Complete | `a3fc718` | Reconciled with live fleet |
-| Chunk 3 — Data & Persistence, Security Model | ✅ Complete | `b7cc09c` | Pi-hole fully removed, AdGuard canonical. ACL policy corrected. Split files + master PRD in sync. |
+| Chunk 3 — Data & Persistence, Security Model | ✅ Complete | `b7cc09c` | Pi-hole fully removed, Technitium ad blocking canonical. ACL policy corrected. Split files + master PRD in sync. |
 | Chunk 4 — Deployment Phases, Open Questions, Appendix | ✅ Complete | `f18b978` | All Pi-hole references purged. Split files + master PRD in sync. |
 
 ## Operational Documentation
diff --git a/swarm.md b/swarm.md
index 2966425..0eeb263 100644
--- a/swarm.md
+++ b/swarm.md
@@ -29,7 +29,7 @@ All services deployed on MK7 manager via `docker stack deploy`.
 | `portainer` | Portainer CE | replicated | 1/1 | `9000` | `portainer.ai.home` |
 | `prometheus` | Prometheus | replicated | 1/1 | `9090` | `prom.ai.home` |
 | `technitium` | Technitium DNS | replicated | 1/1 | `53/tcp`, `53/udp`, `5380` | `dns.ai.home` |
-| `adguard` | AdGuard Home | replicated | 1/1 | `3000`, `30053` | `adguard.ai.home` |
+| ~~`adguard`~~ | ~~AdGuard Home~~ | ~~removed~~ | ~~—~~ | ~~—~~ | ~~`adguard.ai.home`~~ |
 | ~~authelia~~ | ~~Authelia~~ | ~~deferred~~ | — | — | ~~`auth.ai.home`~~ |
 
 > **Note:** Authelia deferred until local TLS is available (requires `https://auth.ai.home`).