TL;DR: Key Takeaways
- In VCF 9, enabling vSphere Supervisor (Workload Management) is a precondition for VKS: if the Supervisor never goes green, no Kubernetes clusters can be provisioned.
- The most common blocker is a cluster that shows Incompatible on the Workload Management page: usually a missing compatible NSX VDS, DRS/HA not fully enabled, or a licensing gap.
- Silent failures during activation almost always trace back to DNS, NTP, or NSX Manager connectivity: the control plane VMs deploy but never finish configuring.
- Missing or unassigned storage policies stop the control plane VMs and image cache before they start.
- When all else fails, the answer is in wcpsvc.log: trace the opID and you will see exactly what is stuck.
You click Enable on Workload Management, wait twenty minutes, and… nothing. The cluster is flagged incompatible, or three control plane VMs appear and then sit in a Configuring state that never resolves. Supervisor enablement is the gate everything else in the modern-apps stack passes through: until it is green, there is no VKS, no namespaces, and no Tanzu Kubernetes clusters. The good news is that enablement failures are remarkably consistent. After the first dozen, you stop guessing and start checking the same five things. This post walks through each one, with the exact command or setting to verify, so you can go from red to green without opening a support case.
If you are still getting your bearings on the platform, start with VMware Cloud Foundation 9 Explained and the plain-English breakdown in VMware Tanzu in vSphere terms. This article assumes you have those concepts and are now stuck on activation specifically, not on a cluster that already runs (for that, see Why Your VKS Cluster Upgrade Is Stuck).
1. The cluster shows “Incompatible” before you even start
If the Workload Management page greys out your cluster and labels it incompatible, the Supervisor never gets a chance to deploy. Work the prerequisites in order: a valid VCF/vSphere licence with the Kubernetes entitlement, at least two ESXi hosts, fully automated DRS, vSphere HA enabled, a vSphere Distributed Switch (7.0 or later), and sufficient storage capacity. DRS left in manual or partially automated mode is the single most common miss; it must be fully automated.
When the prerequisites look right but the cluster is still incompatible, stop guessing in the UI and ask the API directly. SSH to vCenter as root and use DCLI to list the precise incompatibility reason:
# List clusters and confirm DRS/HA state
dcli com vmware vcenter cluster list
# Ask vCenter exactly why the cluster is incompatible
dcli com vmware vcenter namespacemanagement clustercompatibility list
The incompatibility_reasons column is explicit. A frequent result is “Cluster is missing compatible NSX VDS”, which means your distributed switch is not prepared for NSX or is the wrong version. Fix the transport node / VDS configuration and re-check before retrying.
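To compare the result before and after a fix, it can help to save the DCLI output to a file and search it for known blockers. The following is a minimal sketch, not a supported tool: the function name `compat_blockers` is hypothetical, and every pattern except the quoted NSX VDS message is a guess at wording, so adjust the patterns to whatever your own output shows.

```shell
# compat_blockers FILE: print the lines of saved DCLI output that mention a
# suspected blocker. Only "missing compatible NSX VDS" is wording taken from
# this article; the other patterns are assumptions.
compat_blockers() {
  grep -iE 'missing compatible NSX VDS|DRS|vSphere HA|licen[sc]e' "$1"
}

# Usage on vCenter (commented out so the snippet has no side effects):
#   dcli com vmware vcenter namespacemanagement clustercompatibility list > /tmp/compat.txt
#   compat_blockers /tmp/compat.txt || echo "no known blocker text found"
```

Because grep exits non-zero when nothing matches, the function doubles as a quick yes/no check in a retry script.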
2. DNS and NTP quietly sabotage the control plane
This is the failure that wastes the most time, because the cluster passes the compatibility check and the control plane VMs actually deploy; they just never finish. Beyond NSX, the two most common reasons for a stalled activation are DNS and NTP connectivity problems. The Supervisor control plane VMs need forward and reverse name resolution and tightly synchronised time across vCenter, the ESXi hosts, and NSX Manager. Skew of more than a few seconds breaks certificate validation between components.
Verify both before you retry:
# Forward and reverse DNS for vCenter, hosts, NSX Manager
nslookup vcenter.lab.local
nslookup 10.0.0.10
# Confirm NTP is reachable and in sync on each ESXi host
esxcli system ntp test
esxcli system ntp stats get
If reverse lookups fail or NTP drifts, fix the infrastructure first. No amount of re-running enablement will succeed while the control plane cannot trust its own certificates.
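Checking forward and reverse records by hand for every component gets tedious, so the lookup can be wrapped in a small loop. This is a sketch under assumptions: it uses `getent hosts` rather than nslookup (so entries in /etc/hosts also count), the function names `lookup` and `check_fwd_rev` are hypothetical, and the hostnames in the usage comment are placeholders.

```shell
# lookup NAME_OR_IP: thin wrapper so the resolver can be swapped out.
# Assumption: getent is available on the machine running the check.
lookup() { getent hosts "$1"; }

# check_fwd_rev NAME: resolve NAME to an IP, resolve that IP back,
# and report whether the two agree.
check_fwd_rev() {
  name=$1
  ip=$(lookup "$name" | awk 'NR==1 {print $1}')
  if [ -z "$ip" ]; then
    echo "FAIL $name: no forward record"
    return 1
  fi
  rname=$(lookup "$ip" | awk 'NR==1 {print $2}')
  if [ -z "$rname" ]; then
    echo "FAIL $name: no reverse record for $ip"
    return 1
  fi
  # Compare case-insensitively and ignore a trailing dot.
  a=$(printf '%s' "$name" | tr 'A-Z' 'a-z' | sed 's/\.$//')
  b=$(printf '%s' "$rname" | tr 'A-Z' 'a-z' | sed 's/\.$//')
  if [ "$a" = "$b" ]; then
    echo "OK $name <-> $ip"
  else
    echo "FAIL $name: $ip reverses to $rname"
    return 1
  fi
}

# Usage (placeholder names):
#   for h in vcenter.lab.local nsx.lab.local esx01.lab.local; do check_fwd_rev "$h"; done
```

Run it from the same network segment the control plane VMs will use, since a record that resolves from your workstation may not resolve from there.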
3. No compatible storage policy is assigned
Storage policies govern where the Supervisor places its control plane VMs, the container image cache, ephemeral disks, and persistent volumes. If the policy you select during enablement matches no datastore visible to every host in the cluster, or if you forgot to create one for NSX-backed Supervisors, activation cannot place the control plane and fails early.
Create a VM storage policy (a vSAN storage policy is typical, or a tag-based policy for non-vSAN datastores) before you start, and confirm it resolves to compatible storage on the target cluster. During the Supervisor wizard, assign that policy to the control plane, ephemeral, and image cache slots. A policy that points at a datastore not mounted on all hosts is functionally the same as having no policy at all.
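One quick way to catch the "not mounted on all hosts" case is to ask each host which volumes it sees. The sketch below assumes root SSH is enabled on the ESXi hosts and relies on `esxcli storage filesystem list`; the function names and host names are hypothetical. It only checks that the datastore name appears in the listing, so a volume that is listed but unmounted would still pass and the Mounted column is worth a glance.

```shell
# list_volumes HOST: print the filesystems a host knows about.
# Assumption: root SSH to each ESXi host is enabled.
list_volumes() { ssh "root@$1" esxcli storage filesystem list; }

# datastore_on_all_hosts DATASTORE HOST...: report which hosts list DATASTORE.
# Returns non-zero if any host is missing it.
datastore_on_all_hosts() {
  ds=$1
  shift
  missing=0
  for host in "$@"; do
    # -F treats the name as a literal string, -w requires a whole-word match.
    if list_volumes "$host" | grep -qwF -- "$ds"; then
      echo "OK $host"
    else
      echo "MISSING $host"
      missing=1
    fi
  done
  return "$missing"
}

# Usage (placeholder names):
#   datastore_on_all_hosts vsanDatastore esx01.lab.local esx02.lab.local
```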
4. Control plane VMs hang at “Configuring”: NSX Manager connectivity
When the three control plane VMs deploy but stay amber in Configuring, the usual culprit is the NSX Container Plugin (NCP) failing to reach NSX Manager. A known pattern: a Supervisor that was healthy goes back to a configuring/error state after an SSO domain repoint, because NCP still points at the old domain. More generally, validate that networking between the control plane VMs and NSX Manager is functional and that certificates and credentials are valid.
The Supervisor config status in the vSphere Client tells you which condition is unmet (all sixteen must be green), but the real detail lives in the Workload Control Plane service log on vCenter:
# Tail the Workload Management service log during activation
tail -f /var/log/vmware/wcp/wcpsvc.log
# Look for NSX / connection errors specifically
grep -iE "nsx|connect|error" /var/log/vmware/wcp/wcpsvc.log | tail -40
An error such as “error occurred when attempting to connect to NSX Manager” points straight at networking, DNS, or certificate trust between the control plane and NSX, not at the Supervisor itself.
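When the log is noisy, a rough count per failure area can show where to look first. This is only a sketch: the function name `wcp_error_summary` and the keyword groups are assumptions rather than documented log formats, so treat the numbers as a hint and read the matching lines before drawing conclusions.

```shell
# wcp_error_summary LOGFILE: count log lines per suspected failure area.
# Each entry is label:pattern; the patterns are illustrative guesses.
wcp_error_summary() {
  log=$1
  for entry in 'nsx:nsx' 'dns:dns|resolv' 'cert:certificate|x509|tls' 'timeout:timeout|timed out'; do
    label=${entry%%:*}
    pattern=${entry#*:}
    # grep -c prints the number of matching lines; it exits non-zero on
    # zero matches, hence the "|| true".
    count=$(grep -ciE "$pattern" "$log" || true)
    printf '%-8s %s\n' "$label" "$count"
  done
}

# Usage on vCenter:
#   wcp_error_summary /var/log/vmware/wcp/wcpsvc.log
```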
5. A failed attempt won’t clean up so you can retry
Sometimes the fix for a botched enablement is to deactivate and start fresh, but the Supervisor refuses to disable, and the UI shows almost nothing. Deactivation runs a cleanup that has to tear down NSX objects, and if that cleanup fails or times out, the operation sits there indefinitely. Trace it from the same log: find the most recent sync operation, grab its opID, then follow that ID to the real error.
# 1) Find the latest sync attempt and note its opID
grep "Attempting to sync supervisor" /var/log/vmware/wcp/wcpsvc.log
# 2) Follow that opID to see exactly what is stuck
grep "<opID-from-above>" /var/log/vmware/wcp/wcpsvc.log
If you see lines like “NSX cleanup failed: failed to clean up”, the teardown is blocked on orphaned NSX resources. Capture a vCenter or Supervisor support bundle and engage Broadcom Support for the cleanup scripts; do not hand-delete NSX objects on a hunch. One community-reported nuance: one of the three control plane VMs may stay powered off waiting for its template to be removed first, so the deactivation can hang on a single leftover template in the datastore.
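The two grep steps above can be chained so you do not have to copy the opID by hand. This sketch assumes each log line carries a field of the form `opID=<token>`, which may not match your build; the function names `latest_opid` and `trace_opid` are hypothetical.

```shell
# latest_opid LOGFILE: print the opID of the most recent sync attempt.
# Assumption: log lines contain a field shaped like opID=<token>.
latest_opid() {
  grep "Attempting to sync supervisor" "$1" \
    | tail -n 1 \
    | grep -oE 'opID=[^] ,]+' \
    | head -n 1 \
    | cut -d= -f2
}

# trace_opid LOGFILE: print every line that carries that opID.
trace_opid() {
  opid=$(latest_opid "$1")
  if [ -z "$opid" ]; then
    echo "no sync attempt found" >&2
    return 1
  fi
  grep -F -- "$opid" "$1"
}

# Usage on vCenter:
#   trace_opid /var/log/vmware/wcp/wcpsvc.log | tail -40
```

If the function prints nothing, fall back to the manual two-step grep and check how the opID actually appears in your log.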
Final Thoughts
vSphere Supervisor enablement feels opaque, but it fails in predictable ways. Work the order: confirm compatibility with DCLI, rule out DNS and NTP, assign a storage policy that actually resolves, verify NSX Manager connectivity when VMs hang at Configuring, and trace the opID in wcpsvc.log when a teardown stalls. Four of the five live in the infrastructure around the Supervisor (networking, time, storage, NSX), not in the Supervisor itself. Get those green and Workload Management almost always follows.
References
- Broadcom TechDocs: Troubleshoot Workload Management Enablement Cluster Compatibility Errors
- Broadcom TechDocs: Create Storage Policies for Supervisor with NSX (VCF 9.0)
- William Lam: Debugging a “stuck” vSphere Supervisor being removed



