VCF 9 Reference Architecture: Sizing, Topology and Design Trade-offs (VCF 9 Series, Part 7)

The VCF 9 management domain topology, appliance sizing and the standard versus consolidated decision, plus the one appliance (VCF Automation) that quietly drives your whole BOM.

by Dr. Pranay Jha · June 13, 2026 · 10-minute read

VCF 9 Series · Part 7 of 37

TL;DR · Key Takeaways

A simple deployment model is a minimum of 7 appliances. The HA model is a minimum of 13. HA is the production answer.
VCF Automation is a fixed 24 vCPU and 96 GB RAM appliance, times 3 in HA. It is the single biggest driver of management-cluster sizing.
Run NSX Manager as a 3-node cluster in production. Single-node is lab and PoC only.
Avi controllers default to a 3-node cluster. VCF 9.1 adds a supported single-node simple deployment.
Standard architecture (separate management and workload domains) is the validated model. Consolidated is for small or constrained environments.

Who this is for: Architects sizing a VCF 9 management domain and laying out the appliance topology.

Prerequisites: Familiarity with the fleet, instance and domain model and a target host BOM.

Reference architectures fail in one of two ways: too thin to survive a host failure, or so padded that the management overhead eats the budget. VCF 9 has one appliance in particular that decides which way your design tips, and most BOMs underweight it. Here is the topology, the sizing, and the trade-offs that actually move the number.

VCF 9 single-instance topology: a dedicated management domain plus independently sized workload domains.

Management domain topology

Management domain appliance inventory. Cluster: minimum 4 hosts on vSAN, NFS, or VMFS on FC.

Component	Simple	HA	Notes
vCenter	1	1
SDDC Manager	1	1
Fleet Manager	1	1	Fleet services serve every instance
Operations Collector	1	1
NSX Manager	1	3	clustered for HA
VCF Operations	1	3	clustered for HA
VCF Automation	1	3	clustered for HA
Base total	7	13	minimum appliance count
VCF Operations for Logs	+3	+3	optional, its own 3-node cluster, on top of base
NSX Edge cluster	day-N	day-N	add when north-south / services are needed
Avi Controller	optional	3	optional; 3-node cluster by default, single-node simple deployment supported in 9.1

The simple deployment model is a minimum of 7 appliances: a single vCenter, SDDC Manager, a single NSX Manager, a single VCF Operations with Fleet Manager and Collector, and a single VCF Automation. The HA model is a minimum of 13: three NSX Managers, three VCF Operations, three VCF Automation, plus the single vCenter, SDDC Manager, Fleet Manager, and Collector. VCF Operations for Logs is an additional component, typically its own three-node cluster, so budget for it on top of the base 13. For production you run HA. The management cluster is a minimum of 4 hosts on vSAN, NFS, or VMFS on FC, sized as covered in the planning checklist.
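The two minimums are easy to check by hand, but a short sketch makes the arithmetic explicit. The counts below are copied from the inventory table; the dictionary layout and the function name are illustrative choices of mine, not anything VCF ships.

```python
# Appliance counts per deployment model, copied from the inventory table.
# The structure and names here are illustrative, not a VCF API.
BASE_INVENTORY = {
    "vCenter": {"simple": 1, "ha": 1},
    "SDDC Manager": {"simple": 1, "ha": 1},
    "Fleet Manager": {"simple": 1, "ha": 1},
    "Operations Collector": {"simple": 1, "ha": 1},
    "NSX Manager": {"simple": 1, "ha": 3},
    "VCF Operations": {"simple": 1, "ha": 3},
    "VCF Automation": {"simple": 1, "ha": 3},
}

# VCF Operations for Logs is optional and runs as its own 3-node cluster.
LOGS_NODES = 3


def appliance_count(model: str, include_logs: bool = False) -> int:
    """Return the minimum appliance count for 'simple' or 'ha'."""
    if model not in ("simple", "ha"):
        raise ValueError(f"unknown deployment model: {model!r}")
    base = sum(counts[model] for counts in BASE_INVENTORY.values())
    return base + (LOGS_NODES if include_logs else 0)


if __name__ == "__main__":
    print("simple:", appliance_count("simple"))
    print("ha:", appliance_count("ha"))
    print("ha + logs:", appliance_count("ha", include_logs=True))
```

NSX Edge and Avi are left out on purpose, since both sit outside the base count.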

Appliance sizing that matters

VCF Automation is the heavyweight. It is a fixed one-size appliance at 24 vCPU and 96 GB RAM, with no t-shirt sizing, and it should not be shrunk. In HA that is three of them. NSX Manager is selectable as Medium, Large, or Extra Large, deployed as a 3-node cluster for production or a single node for lab. VCF Operations sizes per node from Extra Small (2 vCPU, 8 GB, around 700 objects) up to Extra Large (24 vCPU, 128 GB, around 100,000 objects), and HA halves effective object capacity because every object is replicated. vCenter follows the familiar Tiny through Extra Large range and the Installer defaults it to a Large-class deployment. The vSAN cluster that backs all of this should be ESA, as argued in Part 6.

HA appliance footprint, totalled

Totalled with production-realistic sizes (vCenter Large, NSX Manager and VCF Operations at Medium, Operations for Logs at Small), an HA management domain runs about 158 vCPU and 570 GB of appliance RAM, and VCF Automation alone is roughly half of that RAM. Per-appliance figures come from the VCF 9 sizing guidance; confirm them against the current docs for your release.

Appliance	HA count	vCPU each	vCPU subtotal	RAM each	RAM subtotal
vCenter (Large)	1	16	16	39 GB	39 GB
SDDC Manager	1	4	4	16 GB	16 GB
Fleet Manager	1	4	4	12 GB	12 GB
Operations Collector	1	8	8	23 GB	23 GB
NSX Manager (Medium)	3	6	18	24 GB	72 GB
VCF Operations (Medium)	3	8	24	32 GB	96 GB
VCF Automation	3	24	72	96 GB	288 GB
Operations for Logs (Small, additional)	3	4	12	8 GB	24 GB
Total (incl. Logs)	16	n/a	158	n/a	~570 GB

Appliance RAM by component in an HA management domain; VCF Automation is the BOM-buster.
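If you want to rerun the totals with your own sizes, the table reduces to a few lines of Python. This is a minimal sketch: the per-appliance figures are the ones quoted in the table, while the `Appliance` class and the helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appliance:
    name: str
    count: int
    vcpu_each: int
    ram_gb_each: int


# Figures copied from the HA footprint table; swap in your own sizes.
HA_FOOTPRINT = [
    Appliance("vCenter (Large)", 1, 16, 39),
    Appliance("SDDC Manager", 1, 4, 16),
    Appliance("Fleet Manager", 1, 4, 12),
    Appliance("Operations Collector", 1, 8, 23),
    Appliance("NSX Manager (Medium)", 3, 6, 24),
    Appliance("VCF Operations (Medium)", 3, 8, 32),
    Appliance("VCF Automation", 3, 24, 96),
    Appliance("Operations for Logs (Small)", 3, 4, 8),
]


def totals(appliances):
    """Return (node count, total vCPU, total RAM in GB)."""
    nodes = sum(a.count for a in appliances)
    vcpu = sum(a.count * a.vcpu_each for a in appliances)
    ram_gb = sum(a.count * a.ram_gb_each for a in appliances)
    return nodes, vcpu, ram_gb


def ram_share(appliances, name):
    """Fraction of total appliance RAM consumed by one named appliance."""
    _, _, total_ram = totals(appliances)
    own = sum(a.count * a.ram_gb_each for a in appliances if a.name == name)
    return own / total_ram


if __name__ == "__main__":
    nodes, vcpu, ram_gb = totals(HA_FOOTPRINT)
    print(f"{nodes} appliances, {vcpu} vCPU, {ram_gb} GB RAM")
    print(f"VCF Automation share of RAM: {ram_share(HA_FOOTPRINT, 'VCF Automation'):.1%}")
```

Changing a single row, for example moving NSX Manager to Large, shows how little the total moves compared with the three Automation nodes.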

Standard vs consolidated

Standard architecture puts management and workload domains on separate clusters and hosts, and gives each workload domain its own vCenter and an independent lifecycle. That is the validated, recommended model and the one to default to. Consolidated collapses management and workload into one cluster, with resource pools for isolation, and is appropriate only for small, PoC, or resource-constrained environments. A note of honesty: the consolidated and standard labels carry over partly from VCF 5.x community framing, and the VCF 9 docs lean on the domain model itself, so confirm the exact terminology against the current design guide before you put it in a customer deliverable. Scale maximums inherit from vSphere 9 (96 hosts per cluster, up to 2,500 hosts per vCenter); always confirm them at configmax.broadcom.com for your release.

VCF Operations node sizes

VCF Operations is the one component where you genuinely choose a size against a scale target, so it is worth having the numbers in front of you. Sizes are per node, and an HA cluster halves the effective object capacity because every object is replicated. A cluster scales to 16 Large nodes or 12 Extra Large nodes.

Size	vCPU	RAM	Objects (per node)
Extra Small	2	8 GB	~700
Small	4	16 GB	~10,000
Medium	8	32 GB	~30,000
Large	16	48 GB	~44,000
Extra Large	24	128 GB	~100,000

Pick the Operations node size against your scale target, then halve it for HA.
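The "pick a size, then halve it for HA" rule can be sketched as a lookup. The per-node numbers come from the table above. One loud assumption: I treat cluster capacity as per-node objects multiplied by node count, which is a simplification of mine and not a statement from the sizing guidance, so check the real cluster maximums before relying on it. The function names are hypothetical.

```python
# (size name, vCPU, RAM GB, approx. objects per node), from the table above.
OPS_NODE_SIZES = [
    ("Extra Small", 2, 8, 700),
    ("Small", 4, 16, 10_000),
    ("Medium", 8, 32, 30_000),
    ("Large", 16, 48, 44_000),
    ("Extra Large", 24, 128, 100_000),
]


def effective_objects(per_node: int, nodes: int, ha: bool) -> int:
    """Effective object capacity of a cluster.

    ASSUMPTION: capacity scales linearly with node count. HA halves it
    because every object is replicated.
    """
    raw = per_node * nodes
    return raw // 2 if ha else raw


def smallest_size_for(target_objects: int, nodes: int = 3, ha: bool = True):
    """Return the smallest size name that covers the target, or None."""
    for name, _vcpu, _ram_gb, per_node in OPS_NODE_SIZES:
        if effective_objects(per_node, nodes, ha) >= target_objects:
            return name
    return None  # target exceeds what this node count can carry


if __name__ == "__main__":
    for target in (5_000, 25_000, 60_000):
        print(target, "objects ->", smallest_size_for(target))
```

Under these assumptions a 25,000-object target fits three Small nodes without HA but needs Medium once HA is on, which is exactly the halving effect described above.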

Edge and Avi: the day-N footprint

Two components sit outside the base appliance count but belong in your capacity plan from the start. The NSX Edge cluster is optional and deployed as a day-N operation, but you need it the moment you want centralized north-south services or the vSphere Supervisor, so reserve hosts and uplink VLANs for it rather than discovering the requirement later. Avi controllers default to a three-node cluster for HA, with VCF 9.1 adding a supported single-node simple deployment in the VCF Operations UI. VCF places the controllers on the correct management or edge segment automatically, but the three controllers are still real appliances competing for management-domain RAM, so fold them into the same sizing exercise as the rest of the stack. Whether you even need Avi depends on the load-balancing decision in Part 11.

Scale maximums and where to confirm them

VCF 9 inherits its scale ceilings from vSphere 9: up to 96 hosts per cluster and up to 2,500 hosts per vCenter, with no VCF-specific tightening reported. Those are ceilings, not targets, and the practical limit on a management domain is usually RAM for the appliances long before you reach a host-count maximum. Always confirm the exact numbers for your release at configmax.broadcom.com, because maximums shift between point releases and a design that quotes last year’s figure is a design that ages badly. If you are planning stretched or multi-availability-zone clusters, the host-count and witness requirements change again, and that belongs in a dedicated design pass rather than the standard single-site topology here.

My take

The real day-one sizing trap is VCF Automation. Three HA nodes at 24 vCPU and 96 GB each is 72 vCPU and 288 GB of management overhead before a single tenant workload runs, and stacked with three NSX, three Operations, and the Logs appliances, an HA management domain consumes the better part of four hosts' worth of RAM in appliances alone. My recommendation: if you are not actually consuming self-service automation in year one, deploy the management domain HA without VCF Automation (the Installer lets you defer it or point at an existing instance) and add it later. But still spec the host RAM as if it were there, because retrofitting 288 GB of headroom into an already-built vSAN cluster is far more painful than buying it up front. The docs tell you to size for it from day one. They do not tell you it is the line item most likely to blow your management-cluster BOM.
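To see why reserving the RAM matters, here is a rough per-host calculation. It is a sketch under my own assumptions, not vendor guidance: the appliances must still fit after one host fails, and a hypothetical 25% overhead allowance covers the hypervisor, vSAN, and growth. The 570 GB and 288 GB figures are the ones from this article.

```python
import math


def ram_per_host_gb(appliance_ram_gb: float, hosts: int,
                    host_failures: int = 1,
                    overhead_fraction: float = 0.25) -> int:
    """RAM each host needs so appliances fit on the surviving hosts.

    ASSUMPTIONS: overhead_fraction is a placeholder allowance, and
    appliances spread evenly across the hosts that remain.
    """
    surviving = hosts - host_failures
    if surviving < 1:
        raise ValueError("no hosts left after the assumed failures")
    needed = appliance_ram_gb * (1 + overhead_fraction)
    return math.ceil(needed / surviving)


FULL_HA_RAM_GB = 570        # HA footprint including Logs, from the table
AUTOMATION_RAM_GB = 3 * 96  # three VCF Automation nodes

if __name__ == "__main__":
    with_automation = ram_per_host_gb(FULL_HA_RAM_GB, hosts=4)
    without = ram_per_host_gb(FULL_HA_RAM_GB - AUTOMATION_RAM_GB, hosts=4)
    print(f"Spec with Automation reserved: {with_automation} GB per host")
    print(f"Spec if Automation is ignored: {without} GB per host")
```

The gap between the two results is the headroom you would otherwise have to retrofit. Adjust the overhead fraction and host count to match your own design rules.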

Signals you have outgrown a single management domain

A reference architecture is a snapshot, and the useful question is when to leave it. These are the thresholds where a single management domain or a single fleet stops being the right shape. None of them is a hard limit you hit as an error. They are the points where the design starts costing you more than a second instance would.

Signal	What it looks like	Action
Management host pressure	fleet appliances plus growth crowding the four-to-six-host cluster	add hosts or move workload domains out
Blast radius	one fleet event stalls provisioning across unrelated business units	split into a second fleet
Regulatory separation	a tenant needs hard identity and data separation	second fleet, separate SSO
Geographic latency	operations traffic crossing a high-latency link to a remote site	local instance, federated view
Object and metric scale	Operations nodes under sustained pressure from object count	scale up nodes or add an instance

Watch these as trends, not as single readings. One busy afternoon is not a signal. The same reading every week for a month is the platform telling you the design has aged out, and the cheapest time to act on it is before an outage makes the decision for you.
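The "trend, not a single reading" rule is simple enough to encode. A minimal sketch, assuming you already collect one reading per week (for example, management-cluster RAM utilisation as a percentage). The threshold and the four-week window are placeholders to tune, and the function name is hypothetical.

```python
def is_sustained(weekly_readings, threshold, weeks=4):
    """True only if the last `weeks` readings all meet the threshold.

    A single spike, or too little history, never counts as a signal.
    """
    if weeks < 1:
        raise ValueError("weeks must be at least 1")
    if len(weekly_readings) < weeks:
        return False
    return all(r >= threshold for r in weekly_readings[-weeks:])


if __name__ == "__main__":
    one_busy_week = [60, 92, 65, 70]
    steady_pressure = [70, 88, 90, 91, 93]
    print(is_sustained(one_busy_week, threshold=85))
    print(is_sustained(steady_pressure, threshold=85))
```

The first series contains one busy week and is ignored. The second stays above the threshold for four consecutive weeks and is the kind of reading worth acting on.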

The most expensive design change I have had to unwind was a single fleet built to rule everything, chosen because shared Operations and Automation looked convenient. Eighteen months in, a regulated business unit needed hard identity separation, and retrofitting a second fleet meant untangling SSO, RBAC and provisioning that had all been wired through one broker. It took the better part of a quarter. If there is any chance of a regulated tenant later, pay the second-fleet tax at design time, when it costs a diagram change instead of a migration.

What’s Next

Build your management-cluster BOM from the appliance footprint up, not from a host count down, and decide deliberately whether VCF Automation lands in year one or year two. With the topology set, the next step is the management domain bring-up. Are you sizing for VCF Automation now, or deferring it and reserving the RAM?


About The Author

Dr. Pranay Jha

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.


Tags: Management Domain, Reference Architecture, VCF 9 Series, VCF Operations, VCF9, Workload Domain

July 25, 2026
