PVE cluster formation: MK33/MK34/MK39 as pve-swarm. NFS active. HA groups configured. N150 corrected.
PRD Drafts/pve-three-node-ha-cluster.md
# PVE 3-Node HA Cluster for Iron Legion

**Status:** Draft | **Author:** Artemis | **Date:** 2026-06-04

## 1. Objective

Configure MK33, MK34, and MK39 as a 3-node Proxmox VE cluster with shared NFS storage from TrueNAS. Enable manual migration of VMs and LXCs between nodes (live for VMs; LXCs migrate by restart), and optionally automatic HA failover for critical workloads.


## 2. Current State

| Node | CPU | RAM | Storage | Role |
|------|-----|-----|---------|------|
| MK33 (Silver Centurion) | Intel N150 4c/4t | 16GB | Local SSD | PVE HA |
| MK34 (Southpaw) | Intel N150 4c/4t | 16GB | Local SSD | PVE HA |
| MK39 (Gemini) | Intel N150 4c/4t | 16GB | Local SSD | PVE HA (spare) |
| TrueNAS SCALE | 4c | 11GB | HDD pool | NFS server |

All nodes on `192.168.0.0/18`. TrueNAS at `192.168.16.254`.


## 3. Architecture

### 3.1 Cluster Model: Proxmox 3-Node Cluster (No Ceph)

```
MK33 (192.168.7.33) ──┐
                      ├─ Corosync Ring ── Shared NFS (TrueNAS)
MK34 (192.168.7.34) ──┤
                      │
MK39 (192.168.7.39) ──┘
```

- **Quorum:** A 3-node cluster needs 2 votes for quorum. If one node dies, the remaining 2 still hold quorum.
- **Shared storage:** TrueNAS NFSv4.2 export `/mnt/Ice/Backup`
- **HA manager:** Proxmox HA services (`pve-ha-crm`, `pve-ha-lrm`) for automatic restart

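
The quorum rule above is simple majority arithmetic, sketched here in shell. It assumes one vote per node (the Corosync default); `quorum_votes` and `tolerated_failures` are hypothetical helper names for this sketch, not part of any Proxmox tooling.

```shell
#!/usr/bin/env bash
# Votes needed for quorum with one vote per node: floor(n / 2) + 1.
quorum_votes() {
  local nodes=$1
  echo $(( nodes / 2 + 1 ))
}

# Number of node failures the cluster can absorb while staying quorate.
tolerated_failures() {
  local nodes=$1
  echo $(( nodes - $(quorum_votes "$nodes") ))
}

echo "3 nodes: quorum=$(quorum_votes 3), tolerated failures=$(tolerated_failures 3)"
```

For 3 nodes this gives a quorum of 2 and 1 tolerated failure. The same function returns 2 for a 2-node cluster, meaning it cannot lose either node, which is why that layout would need an external witness.
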

### 3.2 Storage Flow

```
Build on local disk → Test workload → Shutdown → Move disk to NFS → Restart on NFS
↓
If node fails: HA manager detects → Restarts VM/LXC on surviving node (same NFS disk)
```


### 3.3 Workload Planning

| Type | Count per node | Resources each |
|------|---------------|----------------|
| VM (general) | 1 | 4 vCPU, 4096 MB RAM |
| LXC (lightweight) | 5–10 | 1 vCPU, 512–1024 MB RAM |

**Total per node estimated:** 9–14 vCPUs (but the N150 is 4c/4t, so LXCs share cores opportunistically via cgroups)

**Total RAM per node:** VM 4GB + 5×1GB LXCs = ~9GB allocated, 7GB headroom (at the low end of 5 LXCs)

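
The per-node totals can be recomputed for both ends of the LXC range with a short shell sketch. The inputs come from the tables above; `budget` is a hypothetical helper name, and the sketch assumes every LXC gets the 1024 MB maximum and ignores memory used by the PVE host itself.

```shell
#!/usr/bin/env bash
# Per-node figures taken from the planning tables.
node_threads=4        # N150: 4c/4t
node_ram_mb=16384     # 16 GB
vm_vcpu=4
vm_ram_mb=4096
lxc_vcpu=1

# Print vCPU and RAM allocation for a given LXC count and per-LXC RAM in MB.
budget() {
  local lxc_count=$1 lxc_ram_mb=$2
  local vcpus=$(( vm_vcpu + lxc_count * lxc_vcpu ))
  local ram=$(( vm_ram_mb + lxc_count * lxc_ram_mb ))
  echo "lxc=$lxc_count vcpus=$vcpus/$node_threads ram_mb=$ram headroom_mb=$(( node_ram_mb - ram ))"
}

budget 5 1024    # low end of the plan
budget 10 1024   # high end of the plan
```

At 5 LXCs the allocation is 9216 MB with 7168 MB of headroom; at 10 LXCs it rises to 14336 MB with only 2048 MB left, so the upper end of the range leaves little room for the host.
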

## 4. Pros vs Cons

### 4.1 3-Node Cluster (Recommended)

**Pros:**
- Unified web UI for all 3 nodes from any one node
- Live migration of VMs between nodes (near-zero downtime); LXCs migrate by restart
- Automatic HA failover for critical VMs/LXCs
- Quorum maintained with 2 of 3 nodes online
- Shared NFS storage means VMs are portable across nodes

**Cons:**
- Corosync ring traffic adds minor network overhead
- If 2 nodes fail simultaneously, quorum is lost and the cluster stops
- HA failover is a restart (brief downtime), not a live migration
- The N150 CPU is modest: 3 VMs + 15 LXCs across the cluster is tight but workable


### 4.2 Standalone Nodes (Current)

**Pros:**
- Simple, no cluster complexity
- Node failure doesn't affect others
- No Corosync network overhead

**Cons:**
- No live migration: moving a VM requires export/import
- No automatic failover: manual intervention if a node dies
- 3 separate web UIs to manage


## 5. Implementation Plan

### Phase 1: Cluster Formation

1. Add all 3 nodes to `/etc/hosts` on each node (or DNS via Technitium)
2. On MK33: `pvecm create iron-legion`
3. On MK34/MK39: `pvecm add 192.168.7.33`
4. Verify: `pvecm status` lists 3 nodes and reports the cluster as quorate (3 expected votes, quorum 2)

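
The Phase 1 steps can be sketched as a small shell script. The cluster name and address are the ones from the steps above; the `run` helper, the function names, and the `DRY_RUN` switch are assumptions of this sketch. It defaults to printing the commands rather than executing them, since each step belongs on a different node.

```shell
#!/usr/bin/env bash
# Sketch of Phase 1. DRY_RUN=1 (the default here) only prints each command;
# set DRY_RUN=0 on a real node to execute it.
DRY_RUN=${DRY_RUN:-1}

# run is a helper for this sketch, not a Proxmox tool.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Step 2, on MK33 only: create the cluster.
create_cluster() { run pvecm create iron-legion; }

# Step 3, on MK34 and MK39: join through MK33's address.
join_cluster() { run pvecm add 192.168.7.33; }

# Step 4, on any node: membership and quorum state.
verify_cluster() { run pvecm status; run pvecm nodes; }

# Pick the step that matches the node you are on; the default previews all of them.
case "${1:-preview}" in
  create)  create_cluster ;;
  join)    join_cluster ;;
  verify)  verify_cluster ;;
  preview) DRY_RUN=1; create_cluster; join_cluster; verify_cluster ;;
esac
```

Saved under a hypothetical name such as `phase1.sh`, it would be invoked as `DRY_RUN=0 ./phase1.sh create` on MK33, then `DRY_RUN=0 ./phase1.sh join` on MK34 and MK39. The join step asks for MK33's root password interactively.
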

### Phase 2: NFS Storage Setup

1. Ensure TrueNAS exports `/mnt/Ice/Backup` with:
   - NFSv4.2
   - `maproot` or `mapall` to `root` (PVE nodes need root access)
   - ACL allows `192.168.0.0/18`
2. On PVE Datacenter → Storage → Add → NFS:
   - ID: `truenas-backup`
   - Server: `192.168.16.254`
   - Export: `/mnt/Ice/Backup`
   - Content: `images,rootdir`
3. Verify storage shows on all 3 nodes

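
Steps 2 and 3 can also be done from the command line with `pvesm`. The sketch below reuses the storage ID, server, export, and content types listed above; the `vers=4.2` mount option is an assumption that mirrors the NFSv4.2 requirement, and the `run` helper and `DRY_RUN` switch are conventions of this sketch only.

```shell
#!/usr/bin/env bash
# Sketch of Phase 2, steps 2 and 3. DRY_RUN=1 (the default here) only prints commands.
DRY_RUN=${DRY_RUN:-1}

# run is a helper for this sketch, not a Proxmox tool.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Storage definitions are cluster-wide, so this runs once on any one node.
add_nfs_storage() {
  run pvesm add nfs truenas-backup \
    --server 192.168.16.254 \
    --export /mnt/Ice/Backup \
    --content images,rootdir \
    --options vers=4.2
}

# Step 3: repeat on each node; the storage should be listed as active.
verify_storage() { run pvesm status --storage truenas-backup; }

add_nfs_storage
verify_storage
```

Because the storage definition is shared through the cluster filesystem, only the verification needs to be repeated per node.
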

### Phase 3: HA Configuration

1. Proxmox HA → Add groups:
   - `critical`: nodes mk33,mk34,mk39 (any node)
   - `local-only`: single-node constraint for local-disk VMs
2. For each VM/LXC on NFS storage:
   - Datacenter → HA → Add → Select VM → Group `critical` → Start on any
3. Start fencing daemon if IPMI/watchdog available (optional for N150)

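
A command-line equivalent of steps 1 and 2 might look like the sketch below. It assumes a PVE release that still manages HA groups through `ha-manager groupadd`. The VMIDs `100` and `101` and the choice of `mk33` for `local-only` are placeholders, not values from this plan, and the `run` helper and `DRY_RUN` switch are conventions of this sketch only.

```shell
#!/usr/bin/env bash
# Sketch of Phase 3, steps 1 and 2. DRY_RUN=1 (the default here) only prints commands.
DRY_RUN=${DRY_RUN:-1}

# run is a helper for this sketch, not a Proxmox tool.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Step 1: the two HA groups.
create_groups() {
  run ha-manager groupadd critical --nodes mk33,mk34,mk39
  # Pinning local-only to mk33 is an example; --restricted keeps members off other nodes.
  run ha-manager groupadd local-only --nodes mk33 --restricted 1
}

# Step 2: register NFS-backed guests. 100 (a VM) and 101 (an LXC) are placeholder IDs.
add_resources() {
  run ha-manager add vm:100 --group critical --state started
  run ha-manager add ct:101 --group critical --state started
}

# Show what the HA manager currently tracks.
show_status() { run ha-manager status; }

create_groups
add_resources
show_status
```

Only guests whose disks already live on `truenas-backup` belong in `critical`, since a guest on local disk cannot be restarted on another node.
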

### Phase 4: Workload Migration Testing

1. Build a test LXC on local storage
2. Migrate disk to NFS: `Move disk` → target `truenas-backup`
3. Verify LXC starts from NFS
4. Test migration: right-click → Migrate → select target node (LXCs use restart mode; live migration applies to VMs)
5. Test HA failover: power off source node, verify restart on surviving node

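
Steps 2 to 4 can be rehearsed from a shell as sketched below. The container ID `101` and target node `mk34` are placeholders, and the `run` helper and `DRY_RUN` switch are conventions of this sketch only. It assumes a PVE release where the subcommand is spelled `pct move-volume` (older releases used `move_volume`).

```shell
#!/usr/bin/env bash
# Sketch of Phase 4, steps 2 to 4. DRY_RUN=1 (the default here) only prints commands.
DRY_RUN=${DRY_RUN:-1}
CTID=${CTID:-101}       # placeholder container ID
TARGET=${TARGET:-mk34}  # placeholder target node

# run is a helper for this sketch, not a Proxmox tool.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Steps 2 and 3: stop the container, move its root volume to NFS, start it again.
move_to_nfs() {
  run pct shutdown "$CTID"
  run pct move-volume "$CTID" rootfs truenas-backup
  run pct start "$CTID"
}

# Step 4: migrate the running container; --restart stops it and starts it on the target.
migrate_ct() { run pct migrate "$CTID" "$TARGET" --restart; }

# After step 5 (powering off the source node by hand), check where HA restarted it.
check_ha() { run ha-manager status; }

move_to_nfs
migrate_ct
check_ha
```

Step 5 itself is left manual on purpose: the failover test is only meaningful if the source node actually loses power or network.
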

## 6. Open Questions

1. Do we need HA fencing? (IPMI is not available on the N150, so watchdog only)
2. Should we reserve one node as "management" and only run LXCs on two?
3. What's the Tailscale story: do we bind Corosync to LAN only or also Tailscale?


## 7. Decision Points

| Decision | Option A | Option B |
|----------|----------|----------|
| Cluster type | 3-node with quorum (recommended) | 2-node + witness (not recommended) |
| HA level | Manual migration only | Full HA with auto-restart |
| Storage | NFS only (current) | Add local Ceph later |
| Resource reserve | 1 node mostly idle | Distribute evenly |

---

**Awaiting Commander Bobby review and approval.**