draft: Fleet User Standard PRD (v1)
132
PRD Drafts/fleet-user-standard.md
Normal file
@@ -0,0 +1,132 @@
# Fleet User Standard PRD

**Status:** Draft, pending Commander Bobby review
**Author:** Artemis
**Date:** 2026-06-03

---

## 1. Purpose & Scope

This PRD defines the **canonical user account standard** for all Iron Legion fleet nodes. It eliminates the UID/GID mismatches that cause permission failures in bind-mounted containers (VS Code: Server, Paperclip, etc.) and ensures every node behaves identically for automation.

**In scope:**
- Canonical user `jarvis`: UID/GID, groups, home directory
- Container `PUID`/`PGID` mapping rules
- Provisioning enforcement (MAAS autoinstall, Ansible, manual install)
- Migration path for non-compliant nodes (MK7, Nebuchadnezzar)

**Out of scope:**
- Service-specific runtime users inside containers
- TrueNAS / external appliance user models (already documented separately)

---

## 2. Success Criteria

| # | Criterion | How Verified |
|---|-----------|--------------|
| 1 | Every fleet node has `jarvis` at UID 1000 / GID 1000 | `id jarvis` reports `uid=1000` and `gid=1000` |
| 2 | No node has a competing UID 1000 user (e.g. "ubuntu") | `awk -F: '$3==1000 {print $1}' /etc/passwd` returns only "jarvis" |
| 3 | Container compose files use `PUID=1000` / `PGID=1000` without node-specific overrides | `grep -r 'PUID' /opt/iron-legion/docker-swarm/` shows only `PUID=1000` |
| 4 | MAAS/cloud-init autoinstall scripts create jarvis FIRST at UID 1000 | Inspect autoinstall user-data |
| 5 | Nebuchadnezzar and MK7 migrated to compliant state | Re-run audit script |
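
Criteria 1 and 2 lend themselves to automation. The following is a minimal sketch of a read-only Ansible play, assuming an inventory that covers the whole fleet. The play name, registered variable names, and failure messages are illustrative and are not taken from the existing audit script.

```yaml
# Sketch only: read-only checks, nothing on the node is modified.
- name: Audit the fleet user standard
  hosts: all
  gather_facts: false
  tasks:
    - name: List every local account that holds UID 1000
      ansible.builtin.command:
        argv:
          - awk
          - "-F:"
          - "$3 == 1000 {print $1}"
          - /etc/passwd
      register: uid_1000_owners
      changed_when: false

    - name: Fail unless jarvis is the only owner of UID 1000
      ansible.builtin.assert:
        that:
          - uid_1000_owners.stdout_lines == ['jarvis']
        fail_msg: "UID 1000 is held by: {{ uid_1000_owners.stdout_lines }}"

    - name: Read the primary GID of jarvis
      ansible.builtin.command:
        argv: [id, "-g", jarvis]
      register: jarvis_gid
      changed_when: false

    - name: Fail unless the primary GID is 1000
      ansible.builtin.assert:
        that:
          - jarvis_gid.stdout == '1000'
        fail_msg: "jarvis has primary GID {{ jarvis_gid.stdout }}"
```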

---

## 3. The Standard

### 3.1 Canonical User: `jarvis`

```yaml
username: jarvis
uid: 1000
gid: 1000
home: /home/jarvis
shell: /bin/bash
groups: [sudo, docker]                  # node-local groups added post-provision
ssh_key_source: ~/.ssh/artemis_key.pub  # deployed at provision time
```
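
The definition above could be expressed as Ansible tasks along the following lines. This is a sketch under stated assumptions: the tasks run with privilege escalation, no other account already holds UID or GID 1000, the `ansible.posix` collection is installed, and `~/.ssh/artemis_key.pub` lives on the control node. None of these is confirmed by this PRD.

```yaml
# Sketch only: assumes become is enabled and UID/GID 1000 are free on the target.
- name: Ensure the jarvis group exists at GID 1000
  ansible.builtin.group:
    name: jarvis
    gid: 1000

- name: Ensure the jarvis user matches the fleet standard
  ansible.builtin.user:
    name: jarvis
    uid: 1000
    group: jarvis
    groups: [sudo]        # docker is appended later, once Docker is installed
    append: true
    home: /home/jarvis
    shell: /bin/bash
    create_home: true

- name: Deploy the provisioning SSH key
  ansible.posix.authorized_key:
    user: jarvis
    # Assumption: the public key is read from the control node.
    key: "{{ lookup('ansible.builtin.file', '~/.ssh/artemis_key.pub') }}"
```

Only `sudo` is listed because the `user` module fails when a named group does not yet exist, and `docker` typically appears only after Docker is installed.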

### 3.2 Container Mapping Rule

All LinuxServer.io and similar images MUST use:
```yaml
environment:
  - PUID=1000
  - PGID=1000
```

**No exceptions.** If a node cannot satisfy this, the node is non-compliant and must be migrated. The compose file is never changed to fit the node.
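
For context, a fuller service definition might look like the sketch below. The image (`lscr.io/linuxserver/code-server`), the host path, and the published port are illustrative assumptions and are not taken from the fleet's actual compose files.

```yaml
# Sketch only: service name, image, host path, and port are hypothetical.
services:
  code-server:
    image: lscr.io/linuxserver/code-server:latest
    environment:
      - PUID=1000   # the process inside the container runs as host UID 1000 (jarvis)
      - PGID=1000   # and host GID 1000
    volumes:
      # Bind mount: files written here are owned by 1000:1000 on the host.
      - /home/jarvis/code-server/config:/config
    ports:
      - "8443:8443"
```

Because the bind-mounted directory belongs to `jarvis` (1000:1000) on every compliant node, the same file can be deployed unchanged anywhere in the fleet.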

### 3.3 Provisioning Enforcement

| Provision Method | Enforcement |
|------------------|-------------|
| **Manual install** | `useradd -m -u 1000 -s /bin/bash jarvis` before any other human user |
| **MAAS autoinstall** | Subiquity `identity` section MUST target `jarvis:1000` **before** cloud-init creates "ubuntu" |
| **Ansible playbook** | `ansible.builtin.user:` with `uid: 1000`, `name: jarvis` |
| **Docker host (Nebuchadnezzar)** | Base image or `useradd` in Dockerfile prior to app user creation |
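
For the cloud-init path, one way to stop "ubuntu" from claiming UID 1000 is to list users explicitly and leave out the `default` entry. The following is a hedged sketch of a cloud-config fragment. Whether it is supplied as deploy-time user data or under an autoinstall `user-data:` key depends on how the fleet's MAAS is set up, which this PRD does not specify. The key line is a placeholder.

```yaml
#cloud-config
# Sketch only. With an explicit users list and no "default" entry,
# cloud-init does not create the distro default account (ubuntu).
users:
  - name: jarvis
    uid: 1000
    shell: /bin/bash
    groups: [sudo]    # docker is added later, once Docker is installed
    ssh_authorized_keys:
      # Placeholder: paste the contents of artemis_key.pub here.
      - "<contents of artemis_key.pub>"
# The sudo policy (password or NOPASSWD) is left to existing fleet convention.
# Confirm the resulting GID with `id jarvis` after first boot.
```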

---

## 4. Fleet Audit Results (Current State)

| Node | jarvis UID | Competing UID 1000 | Status |
|------|------------|--------------------|--------|
| artemis | 1000 | None | ✅ Compliant |
| mark44 | 1000 | None | ✅ Compliant |
| mark5 | 1000 | None | ✅ Compliant |
| mk42 | 1000 | None | ✅ Compliant |
| shield | 1000 | None | ✅ Compliant |
| igor | 1000 | None | ✅ Compliant |
| truenas | 1000 | None | ✅ Compliant |
| **mk7** | **1001** | **ubuntu 1000** | ⚠️ **Non-compliant** |
| **nebuchadnezzar** | **1002** | **ubuntu 1000, caddy 1001** | ⚠️ **Non-compliant** |

**Root cause:** MK7 and Nebuchadnezzar were provisioned via cloud-init/MAAS, which created "ubuntu" at UID 1000 before jarvis was added. All manually-built nodes are clean.

---

## 5. Remediation Plan

### 5.1 MK7
1. Remove the `ubuntu` user, or reassign it from UID 1000 to an unused UID (avoid 65534, which is normally already taken by `nobody` on Ubuntu)
2. Change `jarvis` UID from 1001 → 1000, and confirm its primary GID is also 1000
3. `chown -R jarvis:jarvis /home/jarvis`
4. Update VS Code: Server container ownership: `chown -R jarvis:jarvis /home/jarvis/.vscode-ssh`
5. Verify compose still works with `PUID=1000`
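
The first three steps above can be sketched as an Ansible play. This is an illustration, not a tested runbook. It assumes the inventory name `mk7`, assumes the `ubuntu` account can be deleted outright, and assumes the connection uses an account other than `jarvis` or `ubuntu`, since `usermod` refuses to change the UID of a user that still has running processes.

```yaml
# Sketch only: connect as root or another admin account, not as jarvis.
- name: Bring mk7 into line with the fleet user standard
  hosts: mk7
  become: true
  tasks:
    - name: Remove the competing ubuntu account (home directory is kept)
      ansible.builtin.user:
        name: ubuntu
        state: absent
        remove: false

    - name: Make sure the ubuntu group no longer holds GID 1000
      ansible.builtin.group:
        name: ubuntu
        state: absent

    - name: Move the jarvis group to GID 1000
      ansible.builtin.group:
        name: jarvis
        gid: 1000

    - name: Move jarvis to UID 1000
      ansible.builtin.user:
        name: jarvis
        uid: 1000
        group: jarvis

    - name: Reset ownership of the home directory
      ansible.builtin.file:
        path: /home/jarvis
        owner: jarvis
        group: jarvis
        recurse: true
```

Running the play with `--check` first gives a preview of the account changes before anything is modified.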

### 5.2 Nebuchadnezzar
1. Remove or reassign `ubuntu` user
2. Remove the `caddy` user, or shift it to a UID above 2000
3. Change `jarvis` UID from 1002 → 1000
4. `chown -R jarvis:jarvis /home/jarvis`
5. Audit any container bind mounts for ownership drift
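
Step 5 matters because `usermod` only re-owns files inside the user's home directory. Anything elsewhere keeps the old numeric UID. A read-only scan along these lines could surface the leftovers. The inventory name and `bind_mount_root` are hypothetical and must be replaced with real values.

```yaml
# Sketch only: read-only scan, changes nothing on the host.
- name: Report files left behind on the old jarvis UID
  hosts: nebuchadnezzar
  become: true
  gather_facts: false
  vars:
    bind_mount_root: /srv/containers   # hypothetical path to the bind-mount root
    old_jarvis_uid: "1002"             # jarvis UID before migration (audit table)
  tasks:
    - name: Find files still owned by the pre-migration UID
      ansible.builtin.command:
        argv:
          - find
          - "{{ bind_mount_root }}"
          - "-xdev"
          - "-uid"
          - "{{ old_jarvis_uid }}"
      register: stale_files
      changed_when: false

    - name: Show anything that still needs a chown
      ansible.builtin.debug:
        var: stale_files.stdout_lines
```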

---

## 6. Open Questions

1. **Should we encode this in the MAAS curtin preseed** so new PXE-built nodes are auto-compliant?
2. **Should we add a fleet-wide Ansible user-enforcement task** that fails the playbook if UID 1000 ≠ jarvis?
3. **Is the TrueNAS user model** (jarvis=1000, jumpbox=3000, bobby=3001) the exception we keep, or do we align TrueNAS too?

---

## 7. Gitea Branch Protection Setup (For Draft → Canon Workflow)

To enforce peer review for PRDs and all documentation:

1. **Gitea UI** → Iron-Legion/documentation → Settings → Branches → `main` → **Add Protection Rule**
2. Enable:
   - ✅ **Enable branch protection**
   - ✅ **Require pull request reviews** → Minimum approvers: **1**
   - ✅ **Dismiss stale approvals when new commits are pushed**
   - ✅ **Block merge if required reviewers not approved**
3. This forces every PR to have at least one human review before merge.

Once enabled:
- Draft PRDs go to `PRD Drafts/` via fork + PR
- Approved PRDs get moved to `PRDs/` (canonical) in the approval commit
- All operational docs follow the same fork → PR → review → merge flow