Dr. Pranay Jha


VCF 9 Fleet Lifecycle Management: The Reference Architecture (VCF 9 Series, Part 21)

SDDC Manager is gone. Here is how VCF 9 fleet lifecycle management actually works: the VCF Operations control plane, the software depot, the upgrade sequence, and the design gotchas that catch teams on their first cycle.

VCF 9 Series · Part 21 of 36

TL;DR · Key Takeaways

  • SDDC Manager is gone. In VCF 9, lifecycle ownership lives in VCF Operations Fleet Management, which orchestrates patching and upgrades for the whole fleet from one console.
  • The architecture is a lifecycle control plane (Fleet Management appliance plus a software depot) sitting above the managed components: vCenter, ESX clusters, NSX Managers, and Edge nodes across every VCF instance.
  • One hard sequencing rule governs everything: patch the VCF Operations Fleet Management appliance to the target version first, before any managed component can move to that target.
  • VCF 9 lets you upgrade components independently. That flexibility is real, but it shifts the burden of interoperability validation onto you.
  • VCF 9.1 consolidates lifecycle, depot, log, and identity onto a shared services runtime, raises the ceiling to 5,000 hosts per instance and 256 parallel cluster upgrades, and adds a unified OAuth-based depot.

Who this is for: VCF architects and platform operators planning the lifecycle design for one or more VCF 9 instances.

Prerequisites: a working understanding of the VCF 9 fleet and domain model, plus access to the Broadcom depot or an offline mirror.

If your mental model of VCF lifecycle still has SDDC Manager in the middle of it, you are going to get the sequencing wrong on your first VCF 9 upgrade. SDDC Manager does not exist in VCF 9. The job it used to do, holding the bundle, running prechecks, driving the rolling upgrade, now belongs to VCF Operations and its Fleet Management service. That is not a cosmetic rename. The depot model, the order of operations, and the unit of upgrade all changed with it. This post lays out how the VCF 9 lifecycle architecture actually fits together, where the design decisions are, and the parts that bite teams who treat it like SDDC Manager with a new logo.

Where lifecycle ownership moved

In VCF 9, Fleet Management is a service inside VCF Operations, and it owns lifecycle for the entire platform. One console plans and executes updates across both the global management components and every workload domain, instead of you logging into each vCenter to patch ESX. The practical effect is that lifecycle is now a fleet-level concern by default, not a per-instance chore. If you came from VCF 5.x, the closest mental anchor is “SDDC Manager LCM, but pulled up to the fleet and merged into Operations.” For the fleet and domain model that sits underneath this, see the VCF 9 architecture breakdown earlier in this series.

The reason this matters for design, and not just for trivia, is that the lifecycle plane is now something you size, protect, and back up as a first-class component. It is no longer a single appliance you can shrug at. Get its placement and capacity wrong and you have throttled upgrades for every domain it manages.

The reference architecture

The lifecycle architecture has three layers. At the top is the software depot, online to the Broadcom servers or mirrored offline for disconnected sites. In the middle is the lifecycle control plane: the VCF Operations Fleet Management appliance, binary management, and the upgrade planner. At the bottom are the managed components the plane drives: vCenter instances, ESX clusters, NSX Managers, and Edge nodes, fanned out across each VCF instance in the fleet. Binaries flow down from the depot through the plane to the components. Health, drift, and version state flow back up.

[Figure: VCF 9 Fleet Lifecycle Architecture. Top: Software Depot (online Broadcom or offline mirror, OAuth token), which supplies binaries. Middle: VCF Operations lifecycle control plane (Fleet Management appliance / runtime; Binary Management for download and stage; Plan and Precheck with drift and health gates), which runs the orchestrated upgrade or patch. Bottom: managed components across VCF Instances A, B, and C, each with vCenter, ESX clusters, NSX Manager, and Edge nodes.]
The depot feeds the lifecycle control plane in VCF Operations, which drives every managed component across each VCF instance in the fleet.

Two design points are worth fixing in your head before you draw a single rack diagram. First, the depot is a dependency, not a convenience: if it cannot reach binaries, lifecycle stops. Plan the offline mirror in disconnected and sovereign sites as carefully as you plan the management domain itself. Second, the control plane is a blast-radius concentration. In VCF 9.1 the lifecycle, depot, log, and identity services move onto a shared services runtime, which is the right call architecturally, but it also means one runtime now carries more of the fleet. Size and protect it accordingly, and do not under-provision the management domain to save a couple of hosts.

The depot and binary management

Before anything moves, you connect a depot. The flow in VCF 9.0 is to generate a download token, configure either an online or an offline depot, then use binary management to stage the component bits. Binary management is the centralized place where you download, in advance, every binary an upgrade or patch will need. Staging early is not optional discipline; it is how you keep a maintenance window from stalling halfway through because a binary was not present.
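One way to make "stage early" checkable is a simple set difference between what the upgrade needs and what is already staged. The sketch below is illustrative only: the `Binary` record, the component names, and the version strings are hypothetical placeholders, and it does not call any VCF Operations API. In practice the two sets would come from your own inventory of the plan and the staged bits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    """One component binary an upgrade needs (hypothetical record)."""
    component: str  # e.g. "vcenter", "esx", "nsx"
    version: str    # placeholder target version string


def missing_binaries(required: set[Binary], staged: set[Binary]) -> list[Binary]:
    """Return required binaries that are not staged, sorted for stable output."""
    return sorted(required - staged, key=lambda b: (b.component, b.version))


# Placeholder data: what the planned upgrade needs versus what is staged.
required = {
    Binary("vcenter", "9.0.1"),
    Binary("esx", "9.0.1"),
    Binary("nsx", "9.0.1"),
}
staged = {
    Binary("vcenter", "9.0.1"),
    Binary("esx", "9.0.1"),
}

gaps = missing_binaries(required, staged)
if gaps:
    for b in gaps:
        print(f"NOT STAGED: {b.component} {b.version}")
else:
    print("All required binaries are staged.")
```

Running a check like this a few days before the window, rather than at the start of it, is the point: a missing binary is cheap to fix on Tuesday and expensive to fix mid-upgrade.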

VCF 9.1 cleans this up with a unified software depot service that uses OAuth tokens to manage updates for both connected and disconnected environments through one mechanism. The same release also makes licensing quieter: in connected mode, license files now refresh automatically every 24 hours instead of forcing an administrator to acknowledge a refreshed file every 180 days. Small change, real reduction in the kind of forgotten manual step that expires a license at the worst possible time.

The one sequencing rule you cannot break

Here is the rule that catches almost everyone on their first VCF 9 cycle: you patch the VCF Operations Fleet Management appliance to your target version before you can move any other management component to that target. The lifecycle plane has to understand the version it is about to deliver. Try to jump a vCenter or an ESX cluster ahead of the plane and the upgrade will simply not be offered to you. The order is always the same.

1. Connect depot + generate download token
2. Patch VCF Operations Fleet Management appliance  -> target version
3. Download / stage component binaries (binary management)
4. Plan upgrade  -> run prechecks (health, capacity, interop)
5. Upgrade / patch each component  -> vCenter, ESX, NSX, Edge
6. Re-run drift + compliance check across the fleet
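
If you want the runbook itself to refuse an out-of-order step, the six steps above can be sketched as an ordered enum with a small guard. This is a minimal illustration of the ordering rule, not how Fleet Management enforces it; the `Step` and `Runbook` names are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class Step(IntEnum):
    """The six lifecycle steps, numbered in the order they must run."""
    CONNECT_DEPOT = 1
    PATCH_FLEET_MGMT = 2
    STAGE_BINARIES = 3
    PLAN_AND_PRECHECK = 4
    UPGRADE_COMPONENTS = 5
    DRIFT_CHECK = 6


class Runbook:
    """Tracks completed steps and rejects any step attempted out of order."""

    def __init__(self) -> None:
        self.completed: list[Step] = []

    def next_step(self) -> Optional[Step]:
        """Return the next step due, or None when all six are done."""
        done = len(self.completed)
        return Step(done + 1) if done < len(Step) else None

    def complete(self, step: Step) -> None:
        expected = self.next_step()
        if expected is None:
            raise RuntimeError("runbook already finished")
        if step != expected:
            raise RuntimeError(
                f"cannot run {step.name}: {expected.name} must complete first"
            )
        self.completed.append(step)


runbook = Runbook()
runbook.complete(Step.CONNECT_DEPOT)
try:
    # Jumping straight to component upgrades skips the Fleet Management patch.
    runbook.complete(Step.UPGRADE_COMPONENTS)
except RuntimeError as err:
    print(err)
```

The guard is deliberately strict: it has no override flag, which mirrors the advice to treat the sequence as fixed.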

Step 4 is where VCF 9.1 earns its keep. The planner runs prechecks that evaluate overall system health and verify adequate resource capacity before an upgrade starts, so the failures that used to surface mid-window now surface before you commit. Run them, read them, and do not override a failed precheck because a window is closing. That is how you turn a two-hour upgrade into a two-day recovery.
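
The "do not override a failed precheck" rule can be expressed as a go/no-go gate in your own change tooling. The sketch below assumes you have already collected precheck outcomes into simple records; the `PrecheckResult` shape, the check names, and the example detail string are hypothetical, not the planner's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrecheckResult:
    """Outcome of one precheck (hypothetical record)."""
    name: str          # e.g. "health", "capacity", "interop"
    passed: bool
    detail: str = ""   # free-text reason when the check fails


def go_no_go(results: list[PrecheckResult]) -> tuple[bool, list[str]]:
    """Return (proceed, reasons). Any failed or missing check blocks the upgrade."""
    if not results:
        return False, ["no precheck results: run the planner prechecks first"]
    reasons = [f"{r.name}: {r.detail or 'failed'}" for r in results if not r.passed]
    return (not reasons), reasons


# Placeholder results for illustration.
results = [
    PrecheckResult("health", True),
    PrecheckResult("capacity", False, "insufficient spare capacity in cluster-02"),
    PrecheckResult("interop", True),
]

proceed, reasons = go_no_go(results)
print("GO" if proceed else "NO-GO")
for reason in reasons:
    print(f"  blocked by {reason}")
```

Treating an empty result list as a no-go is a design choice: a window that opens without precheck evidence is handled the same way as one with a failed check.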

[Figure: The upgrade sequence you cannot reorder. The Fleet Management appliance must reach the target before anything else moves. The diagram shows the same six steps as the list above, with step 2 (patch the Fleet Management appliance to the target) marked "must be first".]
The lifecycle plane has to understand a version before it can deliver it; patch it first.

Flexible upgrades and the interoperability trap

The biggest shift from the old VCF model is that you no longer move the entire stack as one monolithic bundle. VCF 9 lets you upgrade and patch components more independently, which is genuinely better: you can take an urgent vCenter or NSX fix without dragging every other component along, and you can stage a fleet over several windows instead of one heroic weekend. My take, though, is that this is the feature most likely to quietly hurt teams who love it too much.

Independent upgrade does not mean any combination is supported. Just because Fleet Management will let you advance vCenter without touching NSX does not mean the resulting pair is on the interoperability matrix. The discipline that the old bundle enforced for you, keeping components in a validated set, is now your responsibility to enforce by hand. Before any flexible upgrade, validate the target combination against the BOM and the interop matrix for that release. The freedom is worth it. The version drift it enables, if you stop checking, is the new way to build an unsupported fleet without noticing. Configuration drift visibility in VCF 9.1 helps here, surfacing changes across instances, vCenters, and clusters, but a dashboard that shows drift is not the same as a policy that prevents it.
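
A "BOM and interop check in front of every move" can be as small as a lookup against the validated sets you have transcribed for the release. The sketch below is an assumption-heavy illustration: `VALIDATED_SETS` and every version string in it are invented placeholders, not real interoperability data. Populate the list from the actual BOM and interop matrix for your release.

```python
# Placeholder validated sets. These versions are invented for illustration;
# real entries must come from the release BOM and the interop matrix.
VALIDATED_SETS: list[dict[str, str]] = [
    {"vcenter": "9.0.0", "esx": "9.0.0", "nsx": "9.0.0"},
    {"vcenter": "9.0.1", "esx": "9.0.0", "nsx": "9.0.0"},
    {"vcenter": "9.0.1", "esx": "9.0.1", "nsx": "9.0.1"},
]


def is_validated(target: dict[str, str]) -> bool:
    """True only if the full component combination matches a validated set."""
    return any(target == validated for validated in VALIDATED_SETS)


def plan_move(
    current: dict[str, str], component: str, new_version: str
) -> tuple[dict[str, str], bool]:
    """Return the combination that would result from one independent move,
    plus whether that combination is in the validated list."""
    target = {**current, component: new_version}
    return target, is_validated(target)


current = {"vcenter": "9.0.0", "esx": "9.0.0", "nsx": "9.0.0"}

for component in ("vcenter", "nsx"):
    target, supported = plan_move(current, component, "9.0.1")
    verdict = "validated" if supported else "NOT on the validated list"
    print(f"advance {component} -> 9.0.1: {verdict}")
```

With the placeholder data, advancing vCenter alone lands on a validated set while advancing NSX alone does not, which is exactly the "allowed is not supported" gap: both moves may be offered, but only one result is a combination you have confirmed.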

[Figure: Flexible upgrade: allowed is not supported. Independent component upgrades are powerful, and easy to drift into an unsupported set. What VCF lets you do: advance vCenter or NSX independently; patch one component and skip the others. What is actually supported: the target combination is on the interop matrix; a validated set per release BOM. The old bundle enforced this for you; now check the BOM and interop matrix before every independent move.]
Use the freedom deliberately; validate every target combination against the interop matrix.

Scale numbers, and what actually gates them

VCF 9.1 raises the ceilings hard. A single instance now supports up to 5,000 ESX hosts, double the prior release, and parallel upgrade capacity grows four times to as many as 256 clusters upgrading simultaneously. Those numbers are real, and they matter for very large estates. They are also a trap if you plan to them directly.

The orchestrator advertising 256 parallel clusters does not mean your environment can sustain 256 parallel clusters. Real parallelism is gated by the headroom you have for maintenance mode and vMotion, by the bandwidth between your depot and your hosts, and by how much workload you can shuffle without breaching admission control. The stated maximum is a ceiling, not a plan. Size the actual concurrency to your evacuation capacity, then validate it on a non-production domain. The same realism applies to ESX Live Patch, which can apply certain ESX patches without evacuating or rebooting the host. It is a real reduction in disruption when it applies, but it covers specific patch classes only. Do not design your maintenance windows on the assumption that rolling reboots are gone, because for most upgrades they are not.
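
To make "ceiling, not a plan" concrete, here is a deliberately simplified sizing sketch. It takes the minimum of three limits: the orchestrator ceiling, the number of clusters that can spare a host, and a bandwidth-derived limit. The headroom test, the one-host failover reserve, and all the numbers are assumptions for illustration; real admission control policies and transfer patterns are more nuanced, so treat the result as a starting point to validate on a non-production domain.

```python
import math


def can_evacuate_one_host(
    hosts: int, used_fraction: float, reserved_failover_hosts: int = 1
) -> bool:
    """Rough headroom test: with one host in maintenance mode and the failover
    reserve set aside, can the remaining hosts carry the current load?"""
    remaining = hosts - 1 - reserved_failover_hosts
    if remaining <= 0:
        return False
    demand_in_hosts = used_fraction * hosts  # load expressed in host-equivalents
    return demand_in_hosts <= remaining


def sustainable_parallel_clusters(
    orchestrator_ceiling: int,
    clusters_with_headroom: int,
    depot_bandwidth_gbps: float,
    per_cluster_gbps: float,
) -> int:
    """Real concurrency is the smallest of the three limits."""
    if per_cluster_gbps <= 0:
        raise ValueError("per_cluster_gbps must be positive")
    bandwidth_limit = math.floor(depot_bandwidth_gbps / per_cluster_gbps)
    return max(0, min(orchestrator_ceiling, clusters_with_headroom, bandwidth_limit))


# Placeholder estate: (host count, fraction of cluster capacity in use).
clusters = [(8, 0.70), (4, 0.70), (16, 0.60), (6, 0.50)]
with_headroom = sum(can_evacuate_one_host(h, u) for h, u in clusters)

parallel = sustainable_parallel_clusters(
    orchestrator_ceiling=256,      # stated VCF 9.1 maximum
    clusters_with_headroom=with_headroom,
    depot_bandwidth_gbps=10.0,     # assumed depot-to-host bandwidth
    per_cluster_gbps=2.5,          # assumed transfer rate per upgrading cluster
)
print(f"clusters with headroom: {with_headroom}, sustainable parallelism: {parallel}")
```

In this example the four-host cluster at 70% utilization cannot spare a host, so three clusters qualify and the plan is three in parallel, nowhere near 256. The binding limit is evacuation capacity, not the orchestrator.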

One more note on the marketing. Broadcom’s own March 2026 survey of VCF 9 customers reports a 51% reduction in infrastructure management time and a 39% improvement in mean time to repair. Take the exact percentages with the usual pinch of salt for a vendor survey of 44 respondents. The structural reason those numbers point in the right direction is sound, though: one depot, one planner, and one console beat logging into every vCenter to patch ESX by hand. The win is the consolidation, not the headline figure.


Operating the fleet day to day

Beyond the upgrade itself, VCF 9.1 folds the rest of fleet lifecycle into the same plane. Bulk operations let you run certificate imports and renewals across all components at once instead of one node at a time. Fleet-level identity brings VCF-wide role assignments with integrated SSO and identity provider configuration, so you stop juggling separate authentication systems per instance. Password vault integration, including a CyberArk option, keeps credential rotation consistent with the rest of your estate. For VxRail environments, Day 0, 1, and 2 tasks that Dell VxRail Manager used to own can now be driven through VCF Operations on vSAN ReadyNodes.

None of this replaces the monitoring and capacity disciplines that sit next to lifecycle. Drift and health gates are only as good as the observability feeding them, so pair this design with the VCF Operations monitoring and observability setup and the capacity and cost management runbook from earlier in the series. Lifecycle, monitoring, and capacity are three views of the same plane, and they fail together when any one of them is neglected.

Disclaimer: Lifecycle operations change production. Validate the target BOM and interoperability matrix for every component you advance, confirm depot binaries are staged, back up the management components, run the planner prechecks, and test the sequence on a non-production domain before you touch the fleet.

What I’d Do

Treat the lifecycle control plane as a tier-one component: size the services runtime for the whole fleet, back it up, and protect the depot path the way you protect the management domain. Standardize the six-step sequence above as a runbook and refuse to deviate from it, especially the rule that the Fleet Management appliance moves first. Use the flexible upgrade model deliberately, with a BOM and interop check in front of every independent component move, not as an excuse to let versions drift. And plan concurrency to your real evacuation headroom, not to the 256-cluster ceiling on the slide. Do those four things and VCF 9 lifecycle becomes the calmest part of your operations, which is exactly what it was redesigned to be.

How are you handling interoperability validation now that the bundle no longer enforces it for you: by hand, by policy, or not yet at all?


About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
