Dr. Pranay Jha

VMware • Cloud • AI • Enterprise Architecture


VCF 9 Planning and Prerequisites: Sizing, Networking and the Readiness Checklist (VCF 9 Series, Part 4)

Sizing, VLANs, MTU, DNS and the readiness checklist for a VCF 9 deployment. The prerequisites that decide whether your bring-up sails through validation or stalls hours in.

VCF 9 Series · Part 4 of 36

TL;DR · Key Takeaways

  • The management domain needs 4 hosts for production. The Installer may technically accept 3 with vSAN, but 4 is the design requirement so a host can enter maintenance without breaking quorum.
  • You need separate VLANs for ESX management, VM management, vMotion, vSAN, host TEP, edge TEP, and uplinks.
  • The hard MTU floor on the TEP path is 1600 bytes. 1700 is recommended, 9000 is optimal. The Installer validates this and will block on failure.
  • Forward and reverse DNS plus NTP must resolve from the Installer appliance before you start.
  • Size the host TEP IP pool for roughly 2 IPs per host, because each active uplink gets a TEP.
Who this is for: Architects and admins scoping a greenfield VCF 9 deployment.
Prerequisites: Access to your network team, DNS and NTP, and the Broadcom Compatibility Guide for your hardware.

Most failed VCF bring-ups are not failures of the platform. They are failures of preparation that the Installer catches at validation, hours into the job, when fixing them is most disruptive. Get the readiness checklist right and the deployment itself is almost dull. Here is what to nail down before you ever deploy the Installer appliance.

Hardware and host count

Every host has to be on the Broadcom Compatibility Guide. For vSAN ESA specifically, the hosts must match a vSAN ESA ReadyNode profile, not merely contain HCL-listed parts. The management domain is a 4-host cluster for production. The VCF Installer will technically accept a 3-host vSAN minimum (and 2 hosts with external FC, VMFS, or NFS), and lab tricks go lower, but 4 is the design requirement and the reason is operational: with 4 hosts a single host can enter maintenance mode without breaking vSAN quorum. Treat 3-host management as a lab-only shortcut. The full topology and appliance sizing sit in the reference architecture deep-dive.

Network: the VLANs you actually need

VCF 9 wants distinct VLANs, trunked and tagged to every host uplink, for ESX management, VM management, vMotion, vSAN, NSX host overlay (host TEP), NSX edge overlay (edge TEP), and NSX uplinks. Note that management is two VLANs, not one: a VLAN for ESX host management and a separate VLAN for VM management, where the VCF appliances (Operations, Automation, and the rest) run. The VDS must be version 8.0 or later, and NSX is prepared directly on the VDS with no legacy N-VDS.

Size the host TEP IP pool with the per-uplink behaviour in mind. A TEP is assigned per active uplink, so a 2-NIC VDS means 2 TEPs per host. Size that subnet for roughly double the host count, not 1:1, or you will run hosts out of TEP addresses mid-preparation.
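
The sizing rule above is plain arithmetic, so it is easy to check before you ask the network team for a subnet. The following is a minimal sketch in POSIX shell, assuming one TEP per active uplink as described. The helper names `tep_ips_needed` and `smallest_prefix_for` are hypothetical, and the usable-address count only subtracts the network and broadcast addresses, so add your own allowance for a gateway and growth.

```shell
#!/bin/sh
# Hypothetical helper: TEP addresses needed = hosts x active uplinks per host.
tep_ips_needed() {
    hosts=$1
    uplinks=$2
    echo $(( hosts * uplinks ))
}

# Hypothetical helper: longest IPv4 prefix (smallest subnet) whose usable
# address count, 2^(32 - prefix) - 2, still covers the requested number.
smallest_prefix_for() {
    needed=$1
    prefix=30
    while [ "$prefix" -ge 8 ]; do
        usable=$(( (1 << (32 - prefix)) - 2 ))
        if [ "$usable" -ge "$needed" ]; then
            echo "$prefix"
            return 0
        fi
        prefix=$(( prefix - 1 ))
    done
    return 1
}
```

With 16 hosts and two active uplinks, `tep_ips_needed 16 2` prints 32 and `smallest_prefix_for 32` prints 26, because a /27 offers only 30 usable addresses. Sizing 1:1 would have suggested a /27 and left the last hosts without TEP addresses.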

Figure: VCF 9 host networking at a glance. Seven traffic types over one VDS (8.0+, NSX-prepared, no N-VDS), two uplinks, MTU 1600+ end to end.
One VDS carries every traffic type; each active uplink gets its own host TEP.
| Network (VLAN) | Purpose | Planning note |
| --- | --- | --- |
| ESX management | ESX host management | Static VMkernel IPs; forward and reverse DNS |
| VM management | VCF appliances (Operations, Automation, vCenter, NSX) | Separate from ESX mgmt; reserve VIPs |
| vMotion | Live migration traffic | Dedicated VMkernel; jumbo helps |
| vSAN | vSAN data traffic | Dedicated VMkernel; jumbo recommended |
| Host TEP (overlay) | NSX host overlay (Geneve) | ~2 IPs per host (per active uplink); MTU 1600+ |
| Edge TEP (overlay) | NSX edge overlay | Routable to host TEP; MTU 1600+ |
| NSX uplinks | North-south to the physical fabric | BGP or static peering to the ToR |

MTU: the number that blocks the most deployments

The overlay path has a hard MTU floor of 1600 bytes end to end. Broadcom recommends 1700 to absorb Geneve header expansion and future-proof, and 9000 (jumbo) for optimal throughput where the underlay supports it. The catch is that the VCF Installer enforces a 1600 MTU validation on the TEP path and blocks deployment if it fails. It pings TEP to TEP, and any single router or switch in the path that clamps below 1600 fails the check.

| Target | MTU (bytes) | Notes |
| --- | --- | --- |
| Hard floor (TEP path) | 1600 | Installer validates TEP to TEP and blocks below this |
| Recommended | 1700 | Absorbs Geneve header growth, future-proofs |
| Optimal (jumbo) | 9000 | Best throughput where the underlay supports it end to end |

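# Validate the 1600 floor between TEP VMkernels
# (-d sets don't-fragment; 1572 payload + 8 ICMP + 20 IP header bytes = 1600)
# If vmk10 is reported as an unknown interface, add ++netstack=vxlan,
# because NSX normally places TEP VMkernels in the vxlan netstack.
vmkping -I vmk10 -d -s 1572 <remote-host-tep-ip>

# Validate jumbo (9000) end to end if the underlay supports it (8972 + 28 = 9000)
vmkping -I vmk10 -d -s 8972 <remote-host-tep-ip>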

Figure: MTU, the hop that blocks deployments. The Installer validates 1600 bytes TEP to TEP across the whole path (Host A TEP, Leaf A, Core / L3, Leaf B, Host B TEP). In the example the Core / L3 hop clamps to 1500, so the TEP-to-TEP check fails and the deployment is blocked.
A single L3 hop clamping below 1600 fails the whole deployment, so test across racks, not just neighbours.
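
The -s values in the vmkping commands above come from subtracting the 20-byte IPv4 header and the 8-byte ICMP header from the target MTU. As a small sketch, the same subtraction can generate the test command for each target in the MTU table. The function name `vmkping_payload` and the `REMOTE_TEP` placeholder are illustrative only, and the loop prints the commands rather than running them.

```shell
#!/bin/sh
# ICMP payload size that yields an IPv4 packet of exactly the given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
vmkping_payload() {
    echo $(( $1 - 28 ))
}

# Print (do not run) the don't-fragment test for each MTU target.
# vmk10 is the TEP VMkernel used in the article; REMOTE_TEP is a placeholder.
REMOTE_TEP="<remote-host-tep-ip>"
for mtu in 1600 1700 9000; do
    echo "vmkping -I vmk10 -d -s $(vmkping_payload "$mtu") $REMOTE_TEP"
done
```

Run the printed command for the MTU you are actually designing to, from at least one host per rack, so that every leaf pair and L3 boundary is exercised.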

DNS, NTP, and credentials

Every component, including the VCF Installer appliance itself, needs both forward (A) and reverse (PTR) DNS records resolvable before deployment. NTP time sync is a hard prerequisite, because certificate and vSAN operations depend on it. For the UI-driven wizard, the ESX hosts need a common root password. If your hosts have different passwords, you must use a JSON specification file instead of the wizard. Plan your IP and subnet allocations per VLAN up front so you are not inventing addresses during input.
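
A dry run of the forward and reverse records can be scripted from the address plan. The sketch below is one way to do it in POSIX shell, not a Broadcom tool. It assumes `dig` is available on the machine you run it from, and the function names and the example hostname are hypothetical. The comparison logic is kept separate from the lookup so it can be checked without a DNS server.

```shell
#!/bin/sh
# Lowercase a name and drop the trailing dot that PTR answers carry.
normalize_fqdn() {
    printf '%s' "$1" | tr 'A-Z' 'a-z' | sed 's/\.$//'
}

# Compare what DNS returned with what the plan expects.
# Args: FQDN PLANNED_IP A_ANSWER PTR_ANSWER. Prints one verdict line.
check_dns_pair() {
    fqdn=$(normalize_fqdn "$1")
    planned_ip=$2
    a_answer=$3
    ptr_answer=$(normalize_fqdn "$4")
    if [ "$a_answer" != "$planned_ip" ]; then
        echo "FAIL forward: $fqdn -> '$a_answer' (expected $planned_ip)"
        return 1
    fi
    if [ "$ptr_answer" != "$fqdn" ]; then
        echo "FAIL reverse: $planned_ip -> '$ptr_answer' (expected $fqdn)"
        return 1
    fi
    echo "OK: $fqdn <-> $planned_ip"
}

# Look up one record pair with dig (assumed to be installed) and feed the
# first answer of each query to check_dns_pair.
verify_dns() {
    a=$(dig +short A "$1" | head -n 1)
    ptr=$(dig +short -x "$2" | head -n 1)
    check_dns_pair "$1" "$2" "$a" "$ptr"
}
```

Calling `verify_dns esx01.example.com 10.0.10.11` for every row of the plan, ideally from the same network the Installer appliance will sit on, surfaces a missing PTR record before the wizard does.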

Disclaimer: Validate hardware against the Broadcom Compatibility Guide, confirm interoperability of the target BOM, verify MTU end to end across racks, and confirm DNS forward and reverse plus NTP from the Installer appliance before you begin. Treat any documented validation bypass as lab-only.

Storage prerequisites, not just host count

Beyond the four-host floor, the datastore has to be genuinely shared: accessible from and writable by every host in the cluster, with enough free space for the full deployment per the planning workbook. If multiple datastores are present, VCF 9.0.0 picks the principal by a fixed priority: vSAN first, then NFS v3, VMFS, NFS 4.1, iSCSI, and vVols last. From 9.0.1, the datastore with the most free space wins instead. vVols is deprecated in version 9, so do not design a new domain around it. For vSAN ESA the hosts must match an ESA ReadyNode profile, and an OSA cluster expects deduplication and compression either both on or both off. Decide the storage type and datastore layout up front, because the storage choice is welded on at host commission time, as covered in the vSAN ESA versus OSA storage design breakdown.

Figure: Datastore principal selection (9.0.0). If several datastores exist, VCF picks the principal by a fixed priority, highest to lowest: vSAN, NFS v3, VMFS, NFS 4.1, iSCSI, vVols (deprecated). From 9.0.1, the datastore with the most free space wins instead.
Decide storage type up front; it is welded on at host commission time.
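
The two selection rules described in this section can be modelled in a few lines, which helps when predicting which datastore a host with several mounts will end up with. This is only an illustration of the rules as stated in the article, not VCF code. The short type labels, the function names, and the tie-breaking behaviour (first entry wins) are assumptions made for the sketch.

```shell
#!/bin/sh
# Fixed 9.0.0 priority from the article, highest first. The short labels
# are hypothetical and used only in this sketch.
PRIORITY="vsan nfs3 vmfs nfs41 iscsi vvols"

# pick_principal_900 TYPE...  -> prints the highest-priority type present.
pick_principal_900() {
    for wanted in $PRIORITY; do
        for present in "$@"; do
            if [ "$present" = "$wanted" ]; then
                echo "$wanted"
                return 0
            fi
        done
    done
    return 1
}

# pick_principal_901 NAME:FREE_GB...  -> prints the name with the most
# free space. On a tie the first entry wins (an assumption of this sketch).
pick_principal_901() {
    best_name=""
    best_free=-1
    for entry in "$@"; do
        name=${entry%%:*}
        free=${entry##*:}
        if [ "$free" -gt "$best_free" ]; then
            best_name=$name
            best_free=$free
        fi
    done
    [ -n "$best_name" ] && echo "$best_name"
}
```

For a host that sees VMFS, NFS 4.1 and NFS v3 datastores, `pick_principal_900 vmfs nfs41 nfs3` prints `nfs3`, which may not be the datastore you intended. Under the 9.0.1 rule the answer depends on free space alone, so `pick_principal_901 ds-a:500 ds-b:1200 ds-c:800` prints `ds-b`.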

The depot, certificates, and passwords

Three readiness items get forgotten because they are not network or hardware. First, the Installer ships without binaries, so decide in advance whether you are pulling from the online Broadcom depot or building a private offline depot, and start that download early because it is the longest single wait. Second, regenerate the ESX self-signed certificates against each host FQDN before you begin, and delete stale disk partitions so vSAN can claim the devices cleanly. Third, the UI wizard needs a common ESX root password across the hosts. If they differ, you are on the JSON specification path, which is also the better route for repeatable multi-site builds. None of these block you technically, but each one stalls a bring-up at an annoying moment if you discover it live.

Plan the IP space once

Build the IP and subnet plan as a single table before you open the wizard, because you will enter values for every VLAN and every appliance, and inventing addresses mid-input is how typos and overlaps creep in. Allocate static ranges for ESX management, VM management, vMotion, and vSAN, size the host TEP subnet for roughly two addresses per host, and reserve VIPs for the NSX Manager cluster and the fleet appliances. Statically assigned VMkernel IPs are a hard requirement, not a preference, so a plan that assumes DHCP anywhere on the VMkernel will fail validation. Spend an hour on the address plan now and you save the back-and-forth that otherwise stretches a deployment across two evenings.
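
Overlapping subnets are the kind of mistake a quick script catches faster than a review meeting. The sketch below checks every pair of IPv4 CIDR blocks in a plan for overlap using only POSIX shell arithmetic. The function names and the example subnets are hypothetical, and the script assumes well-formed dotted-quad input without leading zeros.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print "first last" integer bounds of a CIDR block such as 10.0.10.0/24.
cidr_range() {
    base=$(ip_to_int "${1%/*}")
    bits=$(( 32 - ${1#*/} ))
    size=$(( 1 << bits ))
    first=$(( base / size * size ))
    echo "$first $(( first + size - 1 ))"
}

# Succeeds (exit 0) when the two CIDR blocks share any address.
cidrs_overlap() {
    set -- $(cidr_range "$1") $(cidr_range "$2")
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# check_plan NAME=CIDR...  -> prints every overlapping pair, fails if any.
check_plan() {
    status=0
    while [ $# -gt 1 ]; do
        current=$1
        shift
        for other in "$@"; do
            if cidrs_overlap "${current#*=}" "${other#*=}"; then
                echo "OVERLAP: ${current%%=*} and ${other%%=*}"
                status=1
            fi
        done
    done
    return $status
}
```

A call such as `check_plan esx-mgmt=10.0.10.0/24 vm-mgmt=10.0.11.0/24 vsan=10.0.13.0/24 host-tep=10.0.13.128/25` reports that the vSAN and host TEP ranges collide, which is exactly the sort of slip that otherwise shows up mid-input.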

My take

The single most common day-zero blocker is the 1600 MTU TEP validation. Teams set jumbo frames on the VDS and host uplinks but forget that the physical switch fabric and every intermediate L3 hop must also carry at least 1600 bytes end to end. The Installer pings across the path and hard-fails if a single hop cannot pass a 1600-byte packet. Validate underlay MTU with vmkping between hosts on different racks and leaf pairs, not just same-rack neighbours, because the failure is almost always at the L3 boundary. And do not reach for the documented MTU validation skip flag to push past it. You are not fixing the path, you are deferring the failure into production NSX as intermittent packet loss, which is far harder to diagnose than a red check in the wizard. The network mistakes that follow from skipping this are catalogued in Part 5.

VCF 9 pre-flight readiness checklist: green every line before the Installer boots

  • Hardware on Broadcom Compatibility Guide
  • 4 hosts for the management domain (3 = lab only)
  • Seven VLANs trunked + tagged (mgmt = 2 VLANs)
  • MTU 1600+ on the TEP path (1700 recommended)
  • DNS forward (A) + reverse (PTR) for every component
  • NTP reachable and in sync
  • Shared datastore writable by all hosts
  • Depot chosen (online/offline), download started
  • Common ESX root password (else JSON spec)
  • Static VMkernel IPs planned per VLAN

Make it a real document with an owner per line; dry-run DNS and MTU first.

What’s Next

Build the readiness checklist as a real document with an owner per line, then dry-run DNS and MTU before the Installer ever boots. With prerequisites green, you are ready for the management domain bring-up. Which prerequisite does your environment most often get wrong, DNS reverse records or end-to-end MTU?


About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
