# Netbird Self-Hosted Control Plane — Evaluation Report

**Author:** F.R.I.D.A.Y. (Hermes Agent)  
**Date:** 2026-05-31  
**Status:** Draft — for Commander review before deployment  
**Scope:** Evaluate the Netbird self-hosted control plane as a potential replacement or complement to Tailscale mesh networking for the Iron Legion fleet.

---

## Executive Summary

Netbird is an open-source, WireGuard-based mesh VPN that provides peer-to-peer connectivity with a centralized management plane. As of v0.71.4 (May 2026), it offers **two deployment models** for self-hosting:

1. **Quickstart (single-container, recommended for new deployments)** — Combined management + signal + relay in one `netbird-server` container with embedded Dex IdP. ~5-minute setup via `getting-started.sh` with built-in Traefik and automatic TLS.
2. **Advanced (multi-container, legacy but supported)** — Separate services (management, signal, coturn, relay, dashboard) configured via `management.json` and `docker-compose.yml`.

**Key finding:** Netbird now supports running **behind an existing reverse proxy** (Traefik, Nginx, Caddy) as a first-class deployment option. This is significant for the Iron Legion because MK7 already runs Traefik for `*.ai.home` services — we can integrate Netbird without adding a new public-facing edge.

---

## What Netbird Offers (vs. Tailscale)

| Feature | Tailscale | Netbird |
|---------|-----------|---------|
| Underlay protocol | WireGuard | WireGuard |
| Control plane | Tailscale Inc. cloud | **Self-hostable** |
| NAT traversal | DERP relays (cloud-hosted) | Self-hosted Coturn + Relay |
| Identity provider | Tailscale accounts / SSO via Auth0, etc. | **Embedded Dex** / Any OIDC IdP |
| Network routes | ✅ | ✅ |
| Split-horizon DNS | MagicDNS | Network-wide DNS |
| Reverse proxy / funnel | Tailscale Funnel (public) | **Built-in reverse proxy via Netbird Proxy** |
| Access controls | ACL policies | **Group + peer policies** |
| Linux clients | ✅ | ✅ |
| Windows | ✅ | ✅ |
| Mobile (iOS/Android) | ✅ | ✅ |
| Browser client | ❌ | ✅ |
| Open-source | Client only | **Fully open-source** |

**For the Iron Legion:** The primary advantage of Netbird is **full ownership of the control plane**. Tailscale depends on Tailscale Inc. infrastructure for coordination and DERP relays; Netbird brings both under our control.

---

## Architecture Overview

### Quickstart (v0.29+, Recommended)

```
[Public Internet]
    |
    +-- TCP 80/443 --> Traefik (built-in or external)
    |                    |
    |                    +-- Dashboard UI (web)
    |                    +-- Management API (gRPC over HTTPS)
    |                    +-- Signal (gRPC over HTTPS, HTTP/2 ALPN)
    |                    +-- Relay (WebSocket over HTTPS)
    |
    +-- UDP 3478 --> Coturn (STUN/TURN)
    |
    +-- UDP 49152-65535 --> TURN relay ports (legacy)
```

**Combined server container** (`netbird-server`) consolidates:

- Management Service — peer orchestration, ACLs, routes, DNS
- Signal Service — WebRTC signaling for direct WireGuard connections
- Relay Service — WebSocket relay for fallback when direct p2p fails
- Embedded Dex — built-in identity provider (local users + external OIDC)
- Dashboard — web management UI

**New in v0.29:** Management and Signal share port 443 via HTTP/2 ALPN. Earlier versions required separate ports (33073 for management gRPC, 10000 for signal gRPC, 33080 for relay).

### Advanced (legacy multi-container)

- `management` — API server
- `signal` — WebRTC signaling
- `relay` — WebSocket fallback relay
- `coturn` — TURN/STUN server
- `dashboard` — React UI
- External IdP required (or Dex deployed separately)

**Iron Legion recommendation:** Use the **Quickstart model** unless there's a hard requirement for a separate IdP (Authelia, Keycloak, etc.)
that cannot run alongside the embedded Dex.

---

## Deployment Options for Iron Legion

### Option A: Docker Swarm on MK7 (Recommended for Low Friction)

Deploy Netbird as a Docker Swarm stack on MK7, using the **existing Traefik** as the reverse proxy.

**Pros:**

- Already running Swarm + Traefik on MK7
- No new VM or LXC to provision
- Can share the `traefik-public` network
- Traefik handles TLS certs via internal CA or Let's Encrypt

**Cons:**

- MK7 is already the Swarm manager + DNS + proxy — adding the mesh control plane puts more load on the same node
- If MK7 goes down, both the mesh control plane *and* the web UI/proxy go down

**Port mapping on MK7:**

| Port | Protocol | Service |
|------|----------|---------|
| 80 | TCP | HTTP (redirect + ACME challenge) |
| 443 | TCP | HTTPS (Dashboard, Management, Signal, Relay) |
| 3478 | UDP | Coturn STUN/TURN |

> Note: v0.29+ consolidated ports reduce firewall complexity. If all clients run v0.29+, only TCP 80/443 and UDP 3478 are needed. Legacy clients also need 33073, 10000, 33080, and UDP 49152-65535.

### Option B: Dedicated LXC on Proxmox (Recommended for Resilience)

Deploy the Netbird control plane as an LXC container on one of the Proxmox nodes (MK33/34/39/42), with port forwards via `iptables` or host networking.

**Pros:**

- Isolated from Docker Swarm failures
- Can be colocated with MK7 for low latency while remaining a separate failure domain
- Easier backups via Proxmox scheduled snapshots

**Cons:**

- Requires provisioning an LXC first
- Need to forward UDP 3478 + TCP 443 from host to container

**Recommended node:** MK39 (Gemini) — currently underutilized, stable node.

### Option C: PVE VM (Heavy, Overkill)

Full VM on Proxmox — unnecessary overhead for a coordination server.

**Verdict:** Option B (LXC on MK39) for resilience, or Option A (Swarm on MK7) if simplicity is preferred.

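
The port notes above can be turned into a small pre-deployment check. The following is a minimal sketch, not part of Netbird: the helper names (`PortRule`, `needs_legacy_ports`, `required_ports`) are hypothetical, and the TCP protocol for the legacy ports 33073, 10000, and 33080 is an assumption based on their gRPC/WebSocket roles, since this report does not state it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PortRule:
    port: str      # single port or "low-high" range
    protocol: str  # "tcp" or "udp"
    purpose: str


# Ports listed in the Option A port-mapping table.
CONSOLIDATED = [
    PortRule("80", "tcp", "HTTP redirect + ACME challenge"),
    PortRule("443", "tcp", "Dashboard, Management, Signal, Relay"),
    PortRule("3478", "udp", "Coturn STUN/TURN"),
]

# Assumption: the three legacy service ports are TCP (not stated in the report).
LEGACY = [
    PortRule("33073", "tcp", "Management gRPC (legacy)"),
    PortRule("10000", "tcp", "Signal gRPC (legacy)"),
    PortRule("33080", "tcp", "Relay (legacy)"),
    PortRule("49152-65535", "udp", "TURN relay range (legacy)"),
]


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn 'v0.28.9' or '0.29.0' into a comparable tuple of integers."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def needs_legacy_ports(client_versions: List[str]) -> bool:
    """True if any client is older than v0.29 (the consolidated-port release)."""
    return any(parse_version(v) < (0, 29) for v in client_versions)


def required_ports(legacy_clients: bool) -> List[PortRule]:
    """Return the firewall rules to open for the given client mix."""
    rules = list(CONSOLIDATED)
    if legacy_clients:
        rules.extend(LEGACY)
    return rules


if __name__ == "__main__":
    fleet = ["v0.71.4", "v0.71.4", "v0.28.9"]  # example versions, not real fleet data
    for rule in required_ports(needs_legacy_ports(fleet)):
        print(f"{rule.port}/{rule.protocol}: {rule.purpose}")
```

Running this against the actual client versions before opening firewall rules would show whether the three consolidated ports are enough or whether the legacy set still has to be forwarded.
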

---

## Reverse Proxy Integration

The `getting-started.sh` script supports **6 reverse proxy modes**:

| Option | Reverse Proxy | Iron Legion Fit |
|--------|---------------|-----------------|
| `[0]` | Built-in Traefik (new container) | Works but redundant — we already have Traefik |
| `[1]` | External Traefik (labels only) | **Best fit for Option A** — generates Docker labels for existing Traefik |
| `[2]` | Nginx (config template) | Not needed — already running Traefik |
| `[3]` | Nginx Proxy Manager | Not needed |
| `[4]` | External Caddy | Not needed |
| `[5]` | Other/Manual | Fallback if Traefik ALPN doesn't work |

**Iron Legion choice:** Option `[1]` — "Existing Traefik" labels. This generates:

- `traefik.enable=true`
- `traefik.http.routers.netbird-.rule=Host(...)`
- `traefik.http.services.netbird-.loadbalancer.server.port=...`
- Labels for each endpoint: Dashboard (443), Management gRPC (443), Signal gRPC (443), Relay WebSocket (443)

### Required Traefik EntryPoints

Already configured on MK7 Traefik:

- `web` (:80) — redirect to HTTPS
- `websecure` (:443) — HTTPS + gRPC via HTTP/2
- `traefik-dashboard` (:8080) — dashboard

**No new entrypoints needed.** All Netbird services multiplex over 443 via HTTP/2 ALPN.

---

## DNS Requirements

Netbird needs two DNS records:

| Type | Record | Points To |
|------|--------|-----------|
| A | `netbird.ai.home` | MK7 (192.168.7.7) or MK39 LXC IP |
| CNAME | `*.netbird.ai.home` | `netbird.ai.home` |

The wildcard is required for Netbird Proxy — each exposed internal service gets a subdomain (e.g., `service.netbird.ai.home`).

**Technitium DNS update:** Add:

- `netbird.ai.home` → A → 192.168.7.7 (or LXC IP if Option B)
- `*.netbird.ai.home` → CNAME → `netbird.ai.home`

> Note: Netbird clients on the mesh resolve `*.netbird.selfhosted` internally. The `ai.home` DNS is only needed for the dashboard web UI and proxy subdomains.

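
To make the role of the two records concrete, here is a minimal sketch that classifies a hostname by which record would answer for it. It is an illustration only: `record_for` is a hypothetical helper, it does not query Technitium or any resolver, and it assumes no other, more specific records exist under `netbird.ai.home`.

```python
from typing import Optional

BASE = "netbird.ai.home"  # apex name from the DNS table above


def record_for(hostname: str, base: str = BASE) -> Optional[str]:
    """Return which planned record covers `hostname`.

    "A"     -> the apex record (dashboard, management, signal, relay)
    "CNAME" -> the wildcard record (Netbird Proxy subdomains)
    None    -> not covered by either record
    """
    name = hostname.rstrip(".").lower()  # DNS names are case-insensitive
    base = base.lower()
    if name == base:
        return "A"
    # Anything with at least one extra label below the apex hits the wildcard.
    if name.endswith("." + base) and len(name) > len(base) + 1:
        return "CNAME"
    return None


if __name__ == "__main__":
    for host in ("netbird.ai.home",
                 "service.netbird.ai.home",
                 "prometheus.internal.ai.home"):
        print(host, "->", record_for(host))
```

The last example prints `None` because `prometheus.internal.ai.home` sits outside the Netbird zone and would continue to be served by the existing `ai.home` records.
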

---

## Authentication Strategy

Netbird Quickstart includes an **embedded Dex** identity provider with local user management. This is sufficient for Iron Legion's current needs.

**Two paths:**

### Path 1: Embedded Dex Only (Recommended for Review)

- Local user accounts created via Netbird Dashboard
- No dependence on external IdP
- Username/password or personal access tokens
- Can migrate to external IdP later without re-enrolling devices

### Path 2: Integrate with Existing Authelia (Future)

- Authelia on MK7 supports OIDC (added in v4.38+)
- Netbird can authenticate against Authelia as the IdP
- Single sign-on across all fleet services
- More complex setup — save for Phase 2

**Recommendation:** Start with Path 1 (embedded Dex). It's fully functional, requires zero extra infrastructure, and can be migrated to Authelia OIDC later.

---

## Tailscale Coexistence

Netbird and Tailscale **can run simultaneously** on the same nodes because they use different WireGuard interfaces and port ranges:

- Tailscale: UDP 41641 (WireGuard), TCP 443 (DERP)
- Netbird: UDP 51820 (WireGuard), UDP 3478 (TURN), TCP 443 (management/signal)

**Potential conflicts:**

- Both want high UDP ports for NAT traversal — the OS assigns ephemeral ports, so this is typically fine
- Both manipulate iptables/routing tables — could interfere with default routes
- DNS resolution: Tailscale MagicDNS vs. Netbird DNS — whichever rewrites `/etc/resolv.conf` last wins

**Recommended coexistence strategy:**

- Primary mesh: Tailscale (currently working, MagicDNS configured for `ai.home`)
- Secondary / evaluation: Netbird on a subset of nodes
- Use Netbird for specific access-control use cases (e.g., expose certain services via Netbird Proxy)
- Do NOT set Netbird as default route unless Tailscale is decommissioned

---

## Netbird Proxy — Replacing Traefik?

**Commander question:** "Run alongside possibly replace Traefik as the reverse proxy"

**Answer:** Netbird Proxy is NOT a reverse proxy replacement for Traefik.
It solves a **different problem**:

- **Traefik** (existing on MK7): Routes `*.ai.home` traffic *within* the LAN/WAN to Docker containers. It handles HTTP/HTTPS ingress for services like Portainer, PegaProx, Technitium, etc.
- **Netbird Proxy**: Exposes internal Netbird mesh services *to the public internet* via subdomain routing, secured by Netbird's access policies. Think of it as a Tailscale Funnel equivalent.

**Example:**

- `prometheus.internal.ai.home` is only reachable inside the LAN → Traefik routes to Prometheus
- `prometheus.netbird.ai.home` could be exposed to a remote user's laptop via Netbird Proxy with per-user ACLs

**Verdict:** Keep Traefik. Netbird Proxy complements it for selective external exposure; it does not replace it.

---

## Resource Requirements

### Quickstart (single container)

| Resource | Min | Recommended |
|----------|-----|-------------|
| CPU | 1 core | 2 cores |
| RAM | 2 GB | 4 GB |
| Disk | 10 GB | 20 GB |
| Network | Public IP + DNS | Same |

### Advanced (multi-container)

| Resource | Min | Recommended |
|----------|-----|-------------|
| CPU | 2 cores | 4 cores |
| RAM | 4 GB | 8 GB |
| Disk | 20 GB | 40 GB |
| Network | Same | Same |

**Iron Legion:** Both MK7 (18 cores, 15 GB RAM) and a Proxmox LXC (easily provisioned with 4 GB RAM, 2 cores) comfortably meet the Quickstart requirements.

---

## Deployment Effort Estimate

| Phase | Task | Time | Notes |
|-------|------|------|-------|
| P0 | Review this report | — | Commander decision point |
| P1 | Add DNS records to Technitium | 15 min | `netbird.ai.home` + wildcard |
| P2 | Deploy Netbird (Quickstart Option A or B) | 30 min | Run `getting-started.sh`, select option [1] or [0] |
| P3 | Create first admin user via `/setup` | 5 min | Web browser |
| P4 | Install Netbird client on test nodes | 20 min | 2-3 nodes for validation |
| P5 | Configure network routes + ACLs | 45 min | Mirror Tailscale access patterns |
| P6 | Evaluate coexistence vs. Tailscale replacement | Ongoing | 1-2 week trial period |

**Total hands-on time (if approved):** ~2 hours (+ evaluation period).

---

## Known Issues / Gotchas

1. **ALPN / HTTP/2 requirement:** Netbird v0.29+ consolidated ports require HTTP/2 + ALPN on the reverse proxy. Traefik supports this natively. Nginx needs HTTP/2 enabled explicitly (the `http2` parameter on `listen`).
2. **Legacy clients:** If any Iron Legion device runs an older Netbird client (< v0.29), you'll need the legacy ports (33073, 10000, 33080, UDP 49152-65535). All fleet devices should use the latest client.
3. **Coturn on cloud VMs:** Oracle Cloud and Hetzner require firewall rules for UDP 3478 at the provider level, not just on the VM. Not applicable for LAN but noted for future cloud expansion.
4. **First user setup:** The `/setup` page is **only accessible when zero users exist**. After first admin creation, it redirects to `/login`. To create additional admins, use Dashboard → Settings → Identity Providers or the API with a PAT.
5. **NTP dependency:** Authelia failed on MK7 due to an unsynchronized clock (see MK7 restoration report). Netbird's management service also checks certificate validity — ensure NTP sync on the host.
6. **Wildcard DNS for Proxy:** If enabling Netbird Proxy, the wildcard CNAME is mandatory. Without it, exposed service subdomains won't resolve.

---

## Recommendations

### Immediate (Pre-Deployment)

1. ✅ Commander reviews this report
2. ✅ Decide Option A (Swarm on MK7) vs. Option B (LXC on MK39)
3. ✅ If Option A: verify Traefik HTTP/2 ALPN is active

### Short-Term (If Approved)

1. Deploy Netbird Quickstart with embedded Dex
2. Add `netbird.ai.home` + wildcard to Technitium DNS
3. Install clients on 2-3 test nodes (Cinnamint, Artemis, MK42)
4. Mirror one Tailscale route in Netbird for comparison

### Long-Term (Evaluation After 2 Weeks)

1. Compare latency/connection reliability vs. Tailscale
2. Evaluate Netbird Proxy for selective external access
3. Decide: coexist, replace Tailscale, or decommission Netbird
4. If replacing: migrate MagicDNS zones to Netbird DNS, update all `.ai.home` client configs

---

## References

- Netbird Docs (Self-Hosted Quickstart): https://docs.netbird.io/selfhosted/selfhosted-quickstart
- Netbird Docs (Advanced Guide): https://docs.netbird.io/selfhosted/selfhosted-guide
- GitHub (infrastructure files): https://github.com/netbirdio/netbird/tree/v0.71.4/infrastructure_files
- Quickstart install script: `curl -fsSL https://github.com/netbirdio/netbird/releases/latest/download/getting-started.sh | bash`
- Reverse Proxy Configuration: https://docs.netbird.io/selfhosted/reverse-proxy
- Upgrade / Migration Guide: https://docs.netbird.io/selfhosted/maintenance

---

## Appendix: Netbird vs Tailscale Detailed Comparison

| Aspect | Tailscale | Netbird Self-Hosted |
|--------|-----------|---------------------|
| Control plane ownership | ❌ Tailscale Inc. | ✅ Fully owned |
| Relay ownership | ❌ Tailscale DERP | ✅ Self-hosted Coturn |
| Cost | Free tier limited; enterprise paid | Free; unlimited |
| Identity | External IdP or Tailscale | Embedded Dex or any OIDC |
| Web dashboard | ✅ | ✅ (self-hosted) |
| API | ✅ | ✅ (REST + gRPC) |
| SCIM provisioning | ❌ (manual) | ✅ (Enterprise) |
| Network segmentation / ACLs | Yes (JSON ACL) | Yes (groups + policies) |
| Exit nodes | ✅ | ✅ |
| Subnet routers | ✅ | ✅ |
| Browser client | ❌ | ✅ (WebRTC-based) |
| Mobile NAT busting | DERP | TURN + direct p2p |

---

*Report generated 2026-05-31 by F.R.I.D.A.Y. — awaiting Commander review.*