TL;DR · Key Takeaways
- Three layers, kept distinct: the Supervisor is the platform control plane, vSphere Namespaces are the tenancy boundary, and workload clusters are the conformant Kubernetes your developers use.
- The Supervisor runs as control plane VMs plus a spherelet on every ESX host. It is management plane, not a place to run your apps.
- A vSphere Namespace carries the quota, storage policies, permissions and VM classes a team is allowed to consume. It is the unit of multi-tenancy.
- Workload clusters are defined with ClusterClass and the Cluster API. Their control plane and workers are ordinary VMs, placed and protected by DRS and vSphere HA.
- Size the management subnet for five control-plane addresses, not three. The floating IP and the patch IP are the two people forget.
Most architecture confusion with VKS comes from blurring three things that look similar and behave nothing alike: the Supervisor, the vSphere Namespace, and the workload cluster. Mix them up and you will put a quota in the wrong place, grant a developer rights they should not have, or, the classic, try to deploy a production app straight onto the management plane and wonder why it feels fragile. This part draws the three layers cleanly and names the components that show up in your logs and support cases.
The Supervisor: a Kubernetes control plane inside vSphere
When you enable the Supervisor on a vSphere cluster, vSphere embeds a Kubernetes control plane into the cluster itself rather than bolting an external one on the side. Three Supervisor control plane VMs run the Kubernetes API server and controllers, and a lightweight agent called the spherelet, a kubelet ported natively to ESX, runs on each host so the hosts themselves act as worker nodes for Supervisor-level workloads. A second component, CRX, is a paravirtualized Linux kernel that lets vSphere Pods boot almost as fast as plain containers while keeping a VM-grade isolation boundary. You do not have to operate these by hand, but you will see their names when something breaks.
The Supervisor is a management plane. Its job is to host vSphere Namespaces and to run VKS, which provisions the workload clusters that run your applications. Treating it as a place to deploy production workloads is the most common early mistake: it is infrastructure, not your app platform. One sizing detail that bites people on day one: a three-VM control plane consumes five management addresses. Each of the three VMs takes an IP, one of them also holds a floating IP, and a fifth address is reserved for patching. Size the management subnet for five, not three, or you run out on the first upgrade.
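The five-address arithmetic is easy to check before you enable anything. The sketch below is illustrative only: it assumes IPv4 and assumes the five addresses are handed out as one consecutive block from a starting IP, and the function name and sample addresses are hypothetical rather than anything VKS provides.

```python
import ipaddress

# Address count from the breakdown above: one per control plane VM,
# one floating IP, one reserved for patching.
CONTROL_PLANE_VMS = 3
FLOATING_IPS = 1
PATCH_IPS = 1
REQUIRED = CONTROL_PLANE_VMS + FLOATING_IPS + PATCH_IPS  # 5


def consecutive_block(start_ip, cidr, count=REQUIRED):
    """Return `count` consecutive addresses beginning at start_ip.

    Raises ValueError if any of them falls outside the usable host range
    of `cidr` (network and broadcast addresses are excluded).
    Assumption: the block is consecutive; verify this for your release.
    """
    network = ipaddress.ip_network(cidr, strict=True)
    start = ipaddress.ip_address(start_ip)
    first_usable = network.network_address + 1
    last_usable = network.broadcast_address - 1
    block = [start + offset for offset in range(count)]
    for ip in block:
        if not (first_usable <= ip <= last_usable):
            raise ValueError(f"{ip} is outside the usable range of {cidr}")
    return block


if __name__ == "__main__":
    # Hypothetical management subnet and starting address.
    for ip in consecutive_block("10.0.0.10", "10.0.0.0/24"):
        print(ip)
```

Running the check against the planned starting address catches the "sized for three" mistake on paper instead of on the first upgrade.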
vSphere Namespaces: where tenancy and the responsibility line live
A vSphere Namespace is the boundary the vSphere administrator hands to a team. It carries the resource quotas, the storage policies, the permissions and the VM classes the team is allowed to consume. When someone asks for a VKS cluster, that cluster is created inside a namespace and inherits its limits. So the namespace is the natural unit of multi-tenancy: one team, one namespace, with its own ceiling on CPU, memory and storage, and its own RBAC.
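As a mental model only, the namespace can be pictured as a small record of limits that every cluster request is measured against. The sketch below is a deliberate simplification: the class names, field names and units (whole CPUs, GiB) are hypothetical, it ignores resources already consumed in the namespace, and it is not how the platform implements quota.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VMClass:
    name: str
    cpus: int
    memory_gib: int


@dataclass
class VSphereNamespace:
    """Toy model of what a namespace carries: quota, VM classes, storage policies."""
    name: str
    cpu_quota: int
    memory_quota_gib: int
    vm_classes: dict = field(default_factory=dict)    # class name -> VMClass
    storage_policies: set = field(default_factory=set)


def rejection_reasons(ns, vm_class_name, storage_policy, node_count):
    """Return the reasons a request would not fit; an empty list means it fits."""
    reasons = []
    vm_class = ns.vm_classes.get(vm_class_name)
    if vm_class is None:
        reasons.append(f"VM class '{vm_class_name}' is not assigned to '{ns.name}'")
    if storage_policy not in ns.storage_policies:
        reasons.append(f"storage policy '{storage_policy}' is not assigned to '{ns.name}'")
    if vm_class is not None:
        if vm_class.cpus * node_count > ns.cpu_quota:
            reasons.append("CPU quota exceeded")
        if vm_class.memory_gib * node_count > ns.memory_quota_gib:
            reasons.append("memory quota exceeded")
    return reasons


if __name__ == "__main__":
    # Hypothetical tenant: 16 CPUs, 64 GiB, one VM class, one storage policy.
    team_a = VSphereNamespace(
        name="team-a",
        cpu_quota=16,
        memory_quota_gib=64,
        vm_classes={"medium": VMClass("medium", cpus=2, memory_gib=8)},
        storage_policies={"gold"},
    )
    print(rejection_reasons(team_a, "medium", "gold", node_count=4))
    print(rejection_reasons(team_a, "large", "gold", node_count=4))
```

The point of the model is the direction of the dependency: the cluster request never defines its own limits, it only asks for something the namespace already permits.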
It is also where the division of labour is drawn, and this two-tier model recurs through the entire series. The vSphere administrator sets up the namespace and decides what is permitted inside it: which VM classes, which storage policies, how much quota. The DevOps user with access to that namespace then provisions and manages clusters within those guardrails, and grants their own developers access to the resulting clusters via Kubernetes RBAC. If you remember one thing: namespace permissions decide who can run a cluster; cluster RBAC decides who can use it. We come back to exactly this in Part 10.
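The two tiers can be written down as two separate lookups that never consult each other. This is a toy illustration: the role names, user names and data shapes are hypothetical and do not mirror the vSphere or Kubernetes APIs.

```python
# Hypothetical namespace-level roles that allow provisioning clusters.
ROLES_THAT_CAN_PROVISION = {"edit", "owner"}


def can_run_cluster(namespace_roles, user):
    """Tier 1: namespace permissions decide who may create and manage clusters."""
    return namespace_roles.get(user) in ROLES_THAT_CAN_PROVISION


def can_use_cluster(cluster_rbac, user, verb):
    """Tier 2: RBAC inside the workload cluster decides who may act in it."""
    return verb in cluster_rbac.get(user, set())


if __name__ == "__main__":
    # The vSphere administrator grants Dana edit rights on the namespace.
    namespace_roles = {"dana": "edit"}
    # Dana, as the DevOps user, grants Sam rights inside the resulting cluster.
    cluster_rbac = {"dana": {"get", "create", "delete"}, "sam": {"get", "create"}}

    print(can_run_cluster(namespace_roles, "sam"))          # False
    print(can_use_cluster(cluster_rbac, "sam", "create"))   # True
```

Sam can deploy into the cluster without being able to create or delete clusters, which is exactly the separation the namespace is meant to enforce.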
Workload clusters: ClusterClass, and nodes that are just VMs
A workload cluster is the conformant Kubernetes cluster your developers actually deploy to. In current VKS you define it with the standard Cluster API: a Cluster object references a ClusterClass, and VKS ships built-in, versioned classes (the builtin-generic line) that encode a tested topology. You declare what you want (the Kubernetes version, node counts, VM classes) and the service reconciles it into a running cluster. This is a real shift from the deprecated TanzuKubernetesCluster API: ClusterClass makes clusters templated and consistent, and upgrades flow through the class rather than through hand-edited manifests. Part 4 walks the provisioning workflow end to end.
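To make the shape of that declaration concrete, here is a rough sketch that assembles a Cluster object as a plain dictionary and prints it as JSON. The field layout follows the upstream Cluster API v1beta1 managed-topology schema as I understand it; the API version your VKS release expects, the worker class name, the variable names, the ClusterClass name and the Kubernetes version string are all placeholders or assumptions to verify against the Broadcom documentation, not values taken from this article.

```python
import json


def cluster_manifest(name, namespace, cluster_class, k8s_version,
                     control_plane_replicas, node_pools,
                     worker_class="node-pool", variables=None):
    """Build a Cluster API `Cluster` object that references a ClusterClass.

    node_pools is a list of (pool_name, replicas) tuples.
    worker_class and variables are assumptions; check your ClusterClass.
    """
    if control_plane_replicas not in (1, 3):
        raise ValueError("control plane is a single node or three nodes for HA")
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",  # assumption: verify per release
        "kind": "Cluster",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "topology": {
                "class": cluster_class,
                "version": k8s_version,
                "controlPlane": {"replicas": control_plane_replicas},
                "workers": {
                    "machineDeployments": [
                        {"class": worker_class, "name": pool, "replicas": count}
                        for pool, count in node_pools
                    ]
                },
                "variables": variables or [],
            }
        },
    }


if __name__ == "__main__":
    manifest = cluster_manifest(
        name="dev-cluster",                       # hypothetical
        namespace="team-a",                       # the vSphere Namespace
        cluster_class="<builtin-generic-class>",  # placeholder, not a real name
        k8s_version="<kubernetes-version>",       # placeholder
        control_plane_replicas=3,
        node_pools=[("pool-a", 2)],
        # Hypothetical variable names; the real ones come from the ClusterClass.
        variables=[{"name": "vmClass", "value": "<vm-class>"},
                   {"name": "storageClass", "value": "<storage-class>"}],
    )
    print(json.dumps(manifest, indent=2))
```

Everything a developer would otherwise hand-edit sits in a handful of fields under `spec.topology`; the rest of the topology is supplied by the class.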
Inside the cluster, the control plane and workers are themselves VMs. The control plane runs as a single node for throwaway clusters or three nodes for HA, and workers are grouped into node pools (machine deployments), each pool drawing its size from a VM class. Scale a pool and VKS creates or removes worker VMs; upgrade and it replaces them in a rolling fashion. The quiet strength here is that a VKS worker node is not a black box: it is a VM your existing tooling sees, DRS places, and vSphere HA protects. The trade-off is that cluster shape is bounded by what your VM classes and namespace quota allow, which is exactly why Part 5 treats sizing as a real decision.
| Component | What it is | Layer |
|---|---|---|
| Supervisor control plane VMs | Run the K8s API server and controllers (1 or 3) | Management plane |
| Spherelet | kubelet ported to each ESX host | Management plane |
| CRX | Paravirtual kernel for fast, isolated vSphere Pods | Management plane |
| vSphere Namespace | Quota, policy, RBAC and VM-class boundary | Tenancy |
| Cluster + ClusterClass | Declarative definition of a workload cluster | Workload cluster |
| Node pool (machine deployment) | A group of worker VMs sized from one VM class | Workload cluster |
How a cluster request flows through the three layers
The layers click into place once you trace a single request through them. A platform engineer asks for a cluster by applying a Cluster object into a vSphere Namespace. The Supervisor, acting as the management plane, accepts it, and VKS with Cluster API reconciles that intent: it checks the request against the namespace’s quota and permitted VM classes, then provisions control plane and worker VMs sized from those classes, placed by vSphere across the hosts. Storage comes from the namespace’s storage policies, networking from NSX or VDS. When the VMs are up and the cluster is healthy, the engineer pulls a kubeconfig and hands scoped access to developers. Every step touched a different layer: the namespace enforced the limits, the Supervisor and VKS did the reconciling, and the workload cluster is what the developers actually use.
Seeing it as a flow rather than a static diagram explains where things go wrong. A request that fails on quota is a namespace problem. A request that provisions but never goes ready is a Supervisor, storage or network problem. A cluster that is up but that a developer cannot deploy to is an RBAC problem in the workload cluster. The layer that owns the symptom is the layer to debug, and conflating them is what sends people hunting in the wrong place.
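The same triage can be expressed as a short decision function that walks the request in the order it flows. This is only a restatement of the rule of thumb above, with hypothetical names; the three boolean inputs stand in for whatever evidence you gather from events and logs.

```python
from enum import Enum
from typing import Optional


class Layer(Enum):
    NAMESPACE = "vSphere Namespace (quota, permitted VM classes)"
    SUPERVISOR = "Supervisor, storage or network"
    WORKLOAD_CLUSTER = "RBAC in the workload cluster"


def layer_to_debug(passed_quota: bool, cluster_ready: bool,
                   developer_can_deploy: bool) -> Optional[Layer]:
    """Return the layer that owns the first failing step, or None if all pass."""
    if not passed_quota:
        return Layer.NAMESPACE
    if not cluster_ready:
        return Layer.SUPERVISOR
    if not developer_can_deploy:
        return Layer.WORKLOAD_CLUSTER
    return None


if __name__ == "__main__":
    # Cluster provisioned but never went ready: look below the namespace.
    owner = layer_to_debug(passed_quota=True, cluster_ready=False,
                           developer_can_deploy=False)
    print(owner.value)
```

The ordering matters: a later symptom is only meaningful once every earlier step is known to have succeeded.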
Where vSphere Pods fit, and why you rarely use them
One source of confusion deserves clearing up. The Supervisor can also run vSphere Pods directly: containers that run as lightweight VMs on the hosts via the CRX runtime, without a workload cluster at all. They exist, and they are clever, offering near-container start times with VM-grade isolation, but for most teams they are not the path you want for applications. Running on the Supervisor means running on the management plane, and it ties your workloads to the platform’s lifecycle rather than to a cluster you can scale, upgrade and delete independently. The mainstream pattern, and the one this series follows, is to run applications in VKS workload clusters and treat the Supervisor as infrastructure.
Knowing vSphere Pods exist matters mostly so you recognise them in the documentation and do not confuse them with VKS clusters. They are a niche tool for specific platform-level cases, not the general answer for shipping workloads. If you find yourself reaching for them for ordinary applications, step back: that is almost always a sign the work belongs in a workload cluster, where it gets its own governed, disposable lifecycle.
What I’d Do
Internalise the three layers before you touch anything else, because almost every later decision sits on one of them. Treat the Supervisor as sacred infrastructure and never schedule business workloads on it. Model one namespace per tenant or per team, and decide its quota, storage policies and VM classes deliberately rather than accepting defaults, because that namespace is your governance surface. Then keep workload clusters cheap and disposable: small for dev, three control plane nodes for anything that matters, and always remember the nodes are just VMs your existing operations already know how to handle. Get the layering right and the rest of this series is detail. So, looking at your own environment: is your tenancy actually modelled at the namespace level, or is everything quietly sharing one big namespace and one big cluster?
References
- Broadcom TechDocs: VKS Architecture (VCF 9.0)
- Broadcom TechDocs: Using the builtin-generic ClusterClass
- vSphere Supervisor and VKS Architecture in VCF 9 (VCF 9 Series, Part 24)









