
Cluster design decisions

Get these decisions right on day one. Changing cluster topology later means downtime, migration, and pain.

Single cluster vs multi-cluster

Start with one cluster. Use namespace isolation with network policies to separate teams and environments. Graduate to multi-cluster only when you need blast radius reduction, multi-region failover, or hard compliance boundaries between workloads.

Common Mistake

Teams spin up a cluster per environment (dev, staging, prod) on day one. Multiply that by a few teams or regions and you end up managing 9 clusters before you have a single production workload. Start with one cluster and three namespaces.

| Scenario | Recommendation |
| --- | --- |
| Single team, single region | One cluster, namespace isolation |
| Multiple teams, shared compliance | One cluster, namespace + network policy isolation |
| Multi-region or hard blast radius | Multi-cluster with GitOps |
| Regulated workloads next to non-regulated | Separate clusters, separate subscriptions |
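
As a rough illustration of "one cluster, three namespaces", the sketch below prints a Namespace and a default-deny ingress NetworkPolicy for each environment. The function names and the dev/staging/prod namespace names are assumptions made for this example, not part of any tool. It also assumes the cluster was created with a network policy engine enabled; without one, the policy objects are accepted but not enforced.

```shell
#!/bin/sh
# Sketch: generate namespace isolation manifests for one cluster.
# namespace_manifest and all_namespaces are hypothetical helper names.

# Print a Namespace plus a default-deny ingress NetworkPolicy for one namespace.
namespace_manifest() {
  ns="$1"
  cat <<EOF
---
apiVersion: v1
kind: Namespace
metadata:
  name: ${ns}
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: ${ns}
spec:
  podSelector: {}
  policyTypes:
    - Ingress
EOF
}

# Emit manifests for all three environments as one multi-document YAML stream.
all_namespaces() {
  for env in dev staging prod; do
    namespace_manifest "$env"
  done
}
```

To apply the result, pipe it to kubectl: `all_namespaces | kubectl apply -f -`. With default-deny in place, each team then adds explicit allow policies for the traffic it actually needs.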

Node pool strategy

Separate system pools from user pools. Never mix your workloads into the system pool.

# System pool: dedicated to kube-system components
az aks nodepool add \
  --cluster-name myaks \
  --resource-group myrg \
  --name system \
  --mode System \
  --node-vm-size Standard_D4s_v5 \
  --node-count 3 \
  --zones 1 2 3 \
  --node-taints CriticalAddonsOnly=true:NoSchedule

# User pool: your actual workloads
az aks nodepool add \
  --cluster-name myaks \
  --resource-group myrg \
  --name apps \
  --mode User \
  --node-vm-size Standard_D8s_v5 \
  --min-count 3 \
  --max-count 20 \
  --enable-cluster-autoscaler \
  --zones 1 2 3

Opinion

System node pool: Standard_D4s_v5, 3 nodes, tainted with CriticalAddonsOnly. User pools: size them for the workload they run. Never mix workloads in the system pool. A misbehaving app pod should never starve CoreDNS.
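
One way to confirm the separation is to inspect the system nodes after the pools are created. The sketch below wraps two kubectl queries in helper functions; the function names are made up for this example, and it assumes AKS labels system-mode nodes with `kubernetes.azure.com/mode=system`.

```shell
#!/bin/sh
# Sketch: verify that the system pool is tainted and free of app pods.
# system_pool_taints and pods_on_node are hypothetical helper names.

# List system-mode nodes together with the keys of any taints on them.
system_pool_taints() {
  kubectl get nodes -l kubernetes.azure.com/mode=system \
    -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# List every pod scheduled on one node, across all namespaces.
pods_on_node() {
  kubectl get pods --all-namespaces -o wide \
    --field-selector "spec.nodeName=$1"
}
```

Run `system_pool_taints` and expect CriticalAddonsOnly in the TAINTS column for every node, then pass one of the listed node names to `pods_on_node` to check that only system components landed there.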

VM SKU selection

| Series | Use Case | Opinion |
| --- | --- | --- |
| D-series v5 | General compute, web apps, APIs | Default choice for most workloads |
| E-series v5 | Memory-intensive (caches, in-memory DBs) | When your app needs more memory per vCPU than D-series offers (about 8 GiB per vCPU vs 4) |
| N-series | GPU, ML/AI inference and training | See GPU Node Pools |
| B-series | Burstable, dev/test only | Never for production. Unpredictable performance. |
| F-series v2 | Compute-optimized batch processing | High CPU-to-memory ratio workloads |

Warning

B-series VMs throttle CPU after consuming burst credits. Your production workload will randomly slow down under sustained load. Use D-series instead.
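
The series choice mostly comes down to how much memory a workload needs per vCPU. The sketch below turns that into a rule of thumb. The function name is made up for this example, and the thresholds are assumptions based on the approximate ratios of each family (about 2 GiB per vCPU for F-series v2, 4 for D-series v5, 8 for E-series v5), so treat the result as a starting point rather than a sizing tool.

```shell
#!/bin/sh
# Sketch: suggest a VM series from a workload's memory-to-vCPU ratio.
# suggest_series is a hypothetical helper; thresholds are approximate.

suggest_series() {
  vcpus="$1"    # vCPUs the workload needs
  mem_gib="$2"  # memory the workload needs, in GiB

  # Compare mem_gib / vcpus against each family's ratio using integer math.
  if [ "$mem_gib" -le $((vcpus * 2)) ]; then
    echo "F-series v2"
  elif [ "$mem_gib" -le $((vcpus * 4)) ]; then
    echo "D-series v5"
  else
    # Anything above 4 GiB per vCPU falls through to the memory-optimized family.
    echo "E-series v5"
  fi
}
```

For example, `suggest_series 4 16` prints D-series v5 and `suggest_series 4 32` prints E-series v5.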

Region selection

Pick a region that supports Availability Zones and is close to your users. Check GPU SKU availability before committing if you plan AI workloads.

# Check if your desired VM SKU is available in the region
az vm list-skus --location eastus2 --size Standard_D8s_v5 --output table

Preferred regions for new deployments: East US 2, West US 3, North Europe, West Europe. All have full AZ support and broad SKU availability.
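
To check zone support and subscription restrictions at the same time, the SKU listing can be narrowed with a JMESPath query. The sketch below is one way to do that; the function name is made up for this example, and the query assumes the `locationInfo` and `restrictions` fields that `az vm list-skus` returns in its JSON output.

```shell
#!/bin/sh
# Sketch: show which zones a VM SKU supports in a region, plus any
# restriction reason codes for the current subscription.
# sku_zones is a hypothetical helper name.

sku_zones() {
  region="$1"
  sku="$2"
  az vm list-skus \
    --location "$region" \
    --size "$sku" \
    --resource-type virtualMachines \
    --query '[].{name:name, zones:locationInfo[0].zones, restrictions:restrictions[].reasonCode}' \
    --output json
}
```

Call it as `sku_zones eastus2 Standard_D8s_v5`. An empty zones list means the SKU is not zonal in that region, and a non-empty restrictions list means the subscription cannot deploy it there as-is.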

Naming conventions

Consistency prevents confusion at scale:

| Resource | Pattern | Example |
| --- | --- | --- |
| Cluster | aks-{app}-{env}-{region} | aks-platform-prod-eus2 |
| Node pool | {workload}{size} | apps, gpua100, system |
| Namespace | {team}-{service} | payments-api, data-pipeline |
| Resource group | rg-{app}-{env}-{region} | rg-platform-prod-eus2 |

Info

Node pool names are limited to 12 characters (Linux) or 6 characters (Windows), and may contain only lowercase letters and digits, starting with a letter. Keep them short and meaningful.
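
These patterns are easy to enforce in a provisioning script. The sketch below builds cluster and resource group names from the patterns in the table and checks a node pool name against the length limits. The function names are made up for this example, and the validator also assumes names must be lowercase alphanumeric and start with a letter.

```shell
#!/bin/sh
# Sketch: build names from the conventions above and validate pool names.
# aks_name, rg_name and valid_pool_name are hypothetical helper names.

# aks-{app}-{env}-{region}
aks_name() { printf 'aks-%s-%s-%s\n' "$1" "$2" "$3"; }

# rg-{app}-{env}-{region}
rg_name() { printf 'rg-%s-%s-%s\n' "$1" "$2" "$3"; }

# Succeed (exit 0) if the node pool name is valid for the given OS.
# Usage: valid_pool_name NAME [linux|windows]   (defaults to linux)
valid_pool_name() {
  name="$1"
  os="${2:-linux}"
  case "$os" in
    windows) max_tail=5 ;;   # 6 characters total
    *)       max_tail=11 ;;  # 12 characters total
  esac
  # One leading lowercase letter, then up to max_tail lowercase letters or digits.
  printf '%s' "$name" | grep -Eq "^[a-z][a-z0-9]{0,${max_tail}}\$"
}
```

For example, `aks_name platform prod eus2` prints aks-platform-prod-eus2, and `valid_pool_name gpua100 windows` fails because the name is seven characters long.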

Resources