GPU Cluster & HPC Platform — KAIST AI
Status: Active
Consolidated standalone servers into a 9-node, 68-GPU Slurm-managed cluster with FreeIPA identity, a Netbird WireGuard control plane, and TrueNAS-backed shared home storage. Moved 50+ users to a unified UID namespace and enforced cgroup v2 limits with NUMA-aware GPU bindings for strict per-job CPU/GPU isolation.
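As an illustration of what a NUMA-aware GPU binding can look like, the sketch below renders Slurm `gres.conf` lines that tie each GPU to the CPU cores of its local NUMA node. The topology (4 GPUs, 2 sockets, 16 cores each), the helper name `gres_lines`, and the device paths are assumptions for the example, not the cluster's actual layout or tooling.

```python
def gres_lines(gpu_numa, numa_cores):
    """Render one gres.conf line per GPU, bound to its NUMA node's core range.

    gpu_numa:   {gpu_index: numa_node}          (hypothetical input)
    numa_cores: {numa_node: (first_core, last_core)}  (hypothetical input)
    """
    lines = []
    for gpu in sorted(gpu_numa):
        first, last = numa_cores[gpu_numa[gpu]]
        # Cores= tells Slurm which cores are local to this GPU.
        lines.append(f"Name=gpu File=/dev/nvidia{gpu} Cores={first}-{last}")
    return lines


# Assumed topology: GPUs 0-1 on socket 0, GPUs 2-3 on socket 1.
GPU_NUMA = {0: 0, 1: 0, 2: 1, 3: 1}
NUMA_CORES = {0: (0, 15), 1: (16, 31)}

if __name__ == "__main__":
    print("\n".join(gres_lines(GPU_NUMA, NUMA_CORES)))
```

On a real node the GPU-to-NUMA mapping would come from the hardware (for example `nvidia-smi topo -m`), and the per-job enforcement itself is usually done by Slurm's cgroup plugin with `ConstrainCores=yes` and `ConstrainDevices=yes` in `cgroup.conf`.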