Azure Kubernetes Fleet Manager

Fleet Manager lets you manage multiple AKS clusters as a single entity. Coordinated upgrades, workload placement, and multi-cluster networking from one control plane.

You need Fleet Manager when you have 3+ clusters

Below that threshold, manage clusters individually. The overhead of Fleet Manager is not justified for 1-2 clusters. At 3+, manual coordination of upgrades and deployments becomes error-prone and time-consuming.

When to use Fleet Manager

| Scenario | Fleet Manager? | Why |
|---|---|---|
| 1-2 clusters | No | Manual management is fine |
| 3-5 clusters, same app | Yes | Coordinated upgrades save hours |
| 5+ clusters, multi-region | Yes | Essential for sanity |
| Multi-tenant platform | Yes | Consistent policy enforcement |
| Single cluster, multiple node pools | No | Just use AKS directly |

Core concepts

Fleet Hub: A lightweight control plane that coordinates member clusters. It does not run your workloads.

Member Clusters: Your existing AKS clusters joined to the fleet. They retain full independence -- Fleet Manager orchestrates them; it does not own them.

Update Runs: Staged upgrade rollouts across clusters in configurable waves.

Update Stages: Groups of clusters upgraded together within an update run.

Creating a fleet

# Create the fleet hub
az fleet create \
  --resource-group myRG \
  --name myFleet \
  --location eastus2

# Join an existing AKS cluster as a member
az fleet member create \
  --resource-group myRG \
  --fleet-name myFleet \
  --name staging-cluster \
  --member-cluster-id /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.ContainerService/managedClusters/staging-aks

# Join production cluster
az fleet member create \
  --resource-group myRG \
  --fleet-name myFleet \
  --name prod-eastus \
  --member-cluster-id /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.ContainerService/managedClusters/prod-eastus-aks
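After joining, it is worth confirming that both members registered before you build update runs on top of them:

```shell
# List fleet members; each should show a Succeeded provisioning state
az fleet member list \
  --resource-group myRG \
  --fleet-name myFleet \
  --output table
```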

Update runs: staged upgrades

This is the killer feature. Instead of upgrading all clusters at once and hoping for the best, you define stages that roll out sequentially.

Example strategy: staging -> prod-region1 -> prod-region2

# Create update run with stages
az fleet updaterun create \
  --resource-group myRG \
  --fleet-name myFleet \
  --name upgrade-to-128 \
  --upgrade-type Full \
  --kubernetes-version 1.28.5 \
  --stages @stages.json

The stages.json defines the rollout order:

{
  "stages": [
    {
      "name": "staging",
      "groups": [
        { "name": "staging-group" }
      ],
      "afterStageWaitInSeconds": 3600
    },
    {
      "name": "prod-wave1",
      "groups": [
        { "name": "prod-eastus-group" }
      ],
      "afterStageWaitInSeconds": 3600
    },
    {
      "name": "prod-wave2",
      "groups": [
        { "name": "prod-westus-group" }
      ]
    }
  ]
}

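Note that creating an update run does not start it. Starting the rollout and checking per-stage progress are separate steps:

```shell
# Start the staged rollout
az fleet updaterun start \
  --resource-group myRG \
  --fleet-name myFleet \
  --name upgrade-to-128

# Inspect progress; a failed stage pauses the run before the next wave
az fleet updaterun show \
  --resource-group myRG \
  --fleet-name myFleet \
  --name upgrade-to-128
```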
Start with fleet-level upgrade orchestration

That alone justifies Fleet Manager. The ability to stage upgrades across clusters with automatic wait periods between stages eliminates the most dangerous operational task in multi-cluster environments. Multi-cluster networking is a bonus feature on top of that.

Update strategies

Define reusable upgrade strategies instead of recreating stages for every update run:

az fleet updatestrategy create \
  --resource-group myRG \
  --fleet-name myFleet \
  --name standard-rollout \
  --stages @stages.json

Then reference the strategy in update runs. This gives you consistent, repeatable upgrade patterns.
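For example, a later update run can reference the named strategy instead of passing stages inline (the `--update-strategy-name` flag is how recent versions of the fleet CLI extension expose this; check `az fleet updaterun create --help` on your version):

```shell
# Reuse the saved strategy instead of re-supplying @stages.json
az fleet updaterun create \
  --resource-group myRG \
  --fleet-name myFleet \
  --name upgrade-to-129 \
  --upgrade-type Full \
  --kubernetes-version 1.29.2 \
  --update-strategy-name standard-rollout
```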

Auto-upgrade profiles

Instead of triggering update runs manually, Fleet Manager can automatically keep member clusters upgraded using auto-upgrade profiles.

| Channel | Behavior |
|---|---|
| Stable | Upgrades to N-1 minor version after GA+30 days |
| Rapid | Upgrades to latest GA minor version immediately |
| NodeImage | Upgrades node OS images only |
| TargetKubernetesVersion (preview) | Upgrades to a specific K8s version you define |

Auto-upgrade profiles use the same UpdateStrategy staging logic, so clusters upgrade in the order you defined (staging -> prod-wave1 -> prod-wave2).

Pair auto-upgrade profiles with update strategies

Set the fleet auto-upgrade profile to Stable and reference your standard-rollout update strategy. This gives you hands-off staged upgrades across your entire fleet -- staging upgrades first, then prod regions in sequence.
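A sketch of wiring the two together (the `az fleet autoupgradeprofile create` command and its flags come from the fleet CLI extension; exact flag names may vary by version, and the strategy resource ID shown is illustrative):

```shell
# Hands-off staged upgrades: Stable channel + the existing rollout strategy
az fleet autoupgradeprofile create \
  --resource-group myRG \
  --fleet-name myFleet \
  --name stable-staged \
  --channel Stable \
  --update-strategy-id /subscriptions/<sub>/resourceGroups/myRG/providers/Microsoft.ContainerService/fleets/myFleet/updateStrategies/standard-rollout
```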

Multi-cluster services (preview)

Fleet Manager can expose Kubernetes Services across member clusters using L4 multi-cluster load balancing. Traffic from one cluster can reach pods in another cluster.

Use cases:

  • Active-active deployments where any cluster can serve any request
  • Gradual traffic shifting during migrations
  • Cross-cluster service discovery
Multi-cluster networking adds complexity

Do not enable multi-cluster services unless you have a clear need. It introduces cross-cluster network dependencies that complicate debugging. Most teams only need coordinated upgrades.
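If you do have a clear need, exporting a Service to the fleet is done declaratively. A minimal sketch, assuming the preview ServiceExport API in the `networking.fleet.azure.com` group (the apiVersion may change across preview releases):

```yaml
# Export an existing Service from a member cluster to the rest of the fleet
apiVersion: networking.fleet.azure.com/v1alpha1
kind: ServiceExport
metadata:
  name: my-app        # must match the name of the Service being exported
  namespace: my-app-ns
```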

Fleet Manager vs manual management

| Operation | Manual (3 clusters) | Fleet Manager |
|---|---|---|
| K8s upgrade | 3 separate az aks upgrade commands, manual ordering | 1 update run, automatic staging |
| Rollback on failure | SSH into each cluster, diagnose | Fleet pauses automatically |
| Audit trail | Check each cluster's activity log | Centralized update run history |
| Policy enforcement | Apply to each cluster individually | Fleet-level ClusterResourcePlacement |
| Time to upgrade 5 clusters | Hours (sequential, manual validation) | Minutes (automated, staged) |

Common mistakes

  1. Adding Fleet Manager for 1-2 clusters -- Overhead exceeds benefit. Wait until you have 3+.
  2. No wait time between stages -- If staging breaks, you want time to catch it before prod rolls.
  3. All clusters in one stage -- Defeats the purpose. Create meaningful waves (staging, prod-region1, prod-region2).
  4. Ignoring member cluster health -- Fleet Manager will upgrade an unhealthy cluster. Check health before triggering update runs.
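For mistake 4, a pre-flight check is cheap. One way to eyeball cluster health before triggering a run (the JMESPath query shown is illustrative):

```shell
# Pre-flight: each cluster's provisioning state and version before upgrading
az aks list \
  --query "[].{name:name, state:provisioningState, version:kubernetesVersion}" \
  --output table
```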

Decision: do you need Fleet Manager?

[Figure: Fleet Manager decision tree]

Resources