Dr. Pranay Jha

Deploying and Enabling VCF Automation via Fleet Management (VCF Automation Series, Part 4)

VCF Automation is not installed; it is enabled as a fleet component. This part covers the prerequisites, the appliance inputs that trip people up, and what to do the moment it comes up.

VCF Automation Series · Part 4 of 30

TL;DR · Key Takeaways

  • VCF Automation is a fleet component, not a standalone install. It is deployed and lifecycle-managed through Fleet Management alongside VCF Operations.
  • A fleet runs a single VCF Operations instance and a single VCF Automation instance, serving every domain and every VCF instance in the fleet.
  • The appliance asks for an admin password, a primary management IP, a second IP used only during upgrades, and a node name prefix. The second IP surprises people.
  • Reserve both IPs in DNS with forward and reverse records before you start. Name resolution problems are the most common deployment stall.
  • Deployment is the easy part. The real work starts at first login, when you set up the provider organization and its infrastructure.
Who this is for: cloud and platform admins standing up VCF Automation on a VCF 9 fleet.
Prerequisites: a healthy VCF 9 management domain with VCF Operations and Fleet Management in place, plus the org-type decision from Part 3.

You do not install VCF Automation. You enable it. That one-word change from the vRA days is the whole mental shift for this part. There is no separate installer to babysit and no vRealize Suite Lifecycle Manager to drive. VCF Automation comes up as a component of the fleet, deployed and patched on the same lifecycle as the rest of VCF. Get that framing right and the steps below are short. Miss it and you go looking for an install wizard that does not exist.

Where VCF Automation lives in the fleet

A VCF fleet is the management envelope around one or more VCF instances. It runs a single set of fleet-level management components, and two of them matter here: one VCF Operations instance and one VCF Automation instance. That single VCF Automation instance serves every workload domain and every VCF instance in the fleet. You do not deploy one per instance. Assuming you must is a common early misconception, and acting on it wastes capacity.

Fleet Management is the appliance that owns the lifecycle of these components. It is what deploys the VCF Automation appliance, and later what patches and upgrades it. So the order is fixed: a healthy management domain, then VCF Operations and the Fleet Management appliance, then VCF Automation as a fleet component on top. If the layers beneath are not solid, do not start the layer above.

[Figure: VCF Automation in the fleet. One instance, serving the whole fleet. The management domain holds the fleet-level components: VCF Operations (fleet operations), Fleet Management (owns lifecycle), and VCF Automation (single instance). Beneath it sit VCF instance 1 and VCF instance 2, each with its domains.]
One VCF Automation instance at the fleet level serves every VCF instance beneath it.

Prerequisites before you enable

Most failed VCF Automation deployments are not deployment failures. They are prerequisite failures that surface during deployment, which is the worst time to discover them. Walk this list before you open the wizard.

| Prerequisite | Why it matters | Check |
|---|---|---|
| Healthy management domain | VCF Automation runs as a fleet component on it | No active alarms, capacity headroom |
| VCF Operations + Fleet Management | Fleet Management deploys and lifecycles VCF Automation | Both deployed and reachable |
| Two static IPs | Primary plus a temporary upgrade IP | Reserved and free in your IPAM |
| Forward and reverse DNS | Appliance services expect to resolve their own names | A and PTR records exist for both IPs |
| Time and NTP | Token and certificate logic is time-sensitive | NTP consistent across the fleet |

The two-IP requirement is the one that catches people. VCF Automation wants a primary management IP and a second IP that is used temporarily during upgrades. Both need to exist in DNS before you deploy, with forward and reverse records, or the appliance comes up unhappy and you spend an evening on what should have been a checkbox.

; DNS records to create BEFORE deployment (example, zone-file syntax)
; forward records
vcfa-prod-01.lab.local.        A     10.20.30.40    ; primary management IP
vcfa-prod-upg.lab.local.       A     10.20.30.41    ; temporary upgrade IP
; reverse records
40.30.20.10.in-addr.arpa.      PTR   vcfa-prod-01.lab.local.
41.30.20.10.in-addr.arpa.      PTR   vcfa-prod-upg.lab.local.
; Both forward and reverse must resolve. Test with nslookup both ways before you start.
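The both-ways test can be scripted instead of run by hand. What follows is a minimal sketch in Python, assuming the example hostnames and addresses used in this article. `check_records` and its `forward` and `reverse` parameters are illustrative names, not part of any VCF tooling. By default it resolves through the standard library's `socket.gethostbyname` and `socket.gethostbyaddr`, so it sees whatever resolver the machine running it is configured to use.

```python
import socket


def check_records(expected, forward=socket.gethostbyname,
                  reverse=lambda ip: socket.gethostbyaddr(ip)[0]):
    """Return a list of problems; an empty list means every record matched.

    expected maps FQDN -> IP. forward and reverse are injectable so the
    check can be exercised without touching real DNS.
    """
    problems = []
    for fqdn, ip in expected.items():
        try:
            got_ip = forward(fqdn)
            if got_ip != ip:
                problems.append(f"{fqdn}: A record gives {got_ip}, expected {ip}")
        except OSError as exc:  # gaierror and herror are OSError subclasses
            problems.append(f"{fqdn}: forward lookup failed ({exc})")
        try:
            got_name = reverse(ip).rstrip(".").lower()
            if got_name != fqdn.rstrip(".").lower():
                problems.append(f"{ip}: PTR gives {got_name}, expected {fqdn}")
        except OSError as exc:
            problems.append(f"{ip}: reverse lookup failed ({exc})")
    return problems


# Example values from this article; replace with your own records.
EXPECTED = {
    "vcfa-prod-01.lab.local": "10.20.30.40",
    "vcfa-prod-upg.lab.local": "10.20.30.41",
}
```

Calling `check_records(EXPECTED)` from a host that uses the same DNS servers as the management domain returns an empty list when all four records line up. Anything else in the list is worth fixing before you open the wizard.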

Deploying the appliance, step by step

With prerequisites green, the deployment itself is a guided flow driven from Fleet Management. The inputs are few, but each one matters and two of them are permanent enough that you want them right the first time.

[Figure: Deployment sequence. 1. Prereqs green. 2. Supply inputs. 3. Deploy + start. 4. Validate health in VCF Operations before handing it to anyone.]
Four moves: confirm prereqs, supply inputs, deploy, validate. The validate step is the one people skip.

The inputs you supply

You provide an administrator password, the primary management IP, the temporary upgrade IP, and a node name prefix that gets prepended to every node name. Pick the prefix deliberately, because it shows up everywhere afterward and renaming is not casual. Size the appliance per the official sizing for your expected tenant count and request volume rather than guessing; undersizing here is felt later as sluggish provisioning.

# VCF Automation appliance, deployment inputs
admin_password:    '************'      # appliance admin credential, store in a vault
primary_ip:        10.20.30.40         # static management IP (has DNS A + PTR)
upgrade_temp_ip:   10.20.30.41         # second IP, used ONLY during upgrades
node_name_prefix:  vcfa-prod           # prepended to every node name, choose with care
size:                                  # set per the official sizing guidance for your tenant count and request volume
# Both IPs must resolve forward and reverse before you submit this.
Disclaimer: this deploys a fleet component on a production management domain. Snapshot or back up your management components first, run it in a maintenance window, verify capacity before you start, and validate health before you let any tenant near it. Treat it as a change with a rollback plan, not a quick task.
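Because two of these inputs are hard to change afterward, a quick sanity check before submitting them can help. The sketch below is a hypothetical pre-submit check in Python, not something the Fleet Management wizard provides. `validate_inputs` is an illustrative name, and the rule that the prefix should look like a DNS label is my assumption rather than a documented product constraint. It does not attempt to check the real password policy.

```python
import ipaddress
import re


def validate_inputs(primary_ip, upgrade_temp_ip, node_name_prefix, admin_password):
    """Return a list of problems found in the deployment inputs."""
    problems = []
    addrs = {}
    for label, value in (("primary_ip", primary_ip),
                         ("upgrade_temp_ip", upgrade_temp_ip)):
        try:
            addrs[label] = ipaddress.ip_address(value)
        except ValueError:
            problems.append(f"{label}: {value!r} is not a valid IP address")
    # The upgrade IP is a second, separate address, not a copy of the primary.
    if len(addrs) == 2 and addrs["primary_ip"] == addrs["upgrade_temp_ip"]:
        problems.append("primary_ip and upgrade_temp_ip must be different addresses")
    # Assumption: treat the prefix as a DNS label (lowercase letters, digits, hyphens).
    if not re.fullmatch(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?", node_name_prefix):
        problems.append(f"node_name_prefix: {node_name_prefix!r} is not a DNS-safe label")
    if not admin_password:
        problems.append("admin_password: must not be empty")
    return problems
```

With the example values above, `validate_inputs("10.20.30.40", "10.20.30.41", "vcfa-prod", "...")` returns an empty list. Reusing the primary address as the upgrade IP returns one problem.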

The moment it is up: what to do first

A finished deployment is not the finish line; it is the starting line. A healthy appliance means nothing to a user yet, because there is no provider configuration, no infrastructure, and no tenant. The first real task is setting up the provider organization: the top of the tenancy model from Part 1, where infrastructure attaches before any tenant exists.

[Figure: After deployment, in order. Validate health (in VCF Operations), provider org (top of tenancy), attach infra (cloud accounts), first tenant (scoped, small).]
Validate, then provider, then infrastructure, then a small first tenant. Do not skip ahead.
In practice: the first thing I check after a deploy is health in VCF Operations, not the VCF Automation UI. If Operations is happy with the new component, the appliance is genuinely up; if it is not, the UI loading is a false comfort that hides a service still settling.
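If you track this in a runbook or a script, the ordering can be enforced rather than remembered. A minimal sketch in Python follows. The step names are labels I chose for the four moves described here, and `next_step` is a hypothetical helper, not a VCF API.

```python
# The four post-deployment moves, in the order this article recommends.
STEPS = ["validate_health", "provider_org", "attach_infrastructure", "first_tenant"]


def next_step(completed):
    """Return the next step to perform, or None when all four are done.

    Raises ValueError if the completed list skips ahead or is out of order.
    """
    completed = list(completed)
    if completed != STEPS[:len(completed)]:
        raise ValueError(f"steps out of order: {completed}")
    return STEPS[len(completed)] if len(completed) < len(STEPS) else None
```

Passing `["provider_org"]` raises, because health validation has not been recorded first. That is the point of the gate.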

Worked example

In a Simple Model fleet you are looking at roughly seven appliances total: vCenter, SDDC Manager, NSX Manager, VCF Operations Manager, the Fleet Management appliance, a VCF Operations collector, and the single VCF Automation appliance. Note that VCF Automation is one of those seven, not a cluster. Plan its two IPs, its DNS records, and its sizing as one unit, and remember the temporary upgrade IP is real address-space you must hold in reserve even though it sits idle between upgrades. Forgetting to reserve it is how a future upgrade fails before it starts.
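One way to keep the idle upgrade IP from being reclaimed is to record both addresses as a single reservation unit. The sketch below is illustrative Python only. `plan_reservations`, the record fields, and the `-01` and `-upg` name suffixes are assumptions modeled on the example hostnames in this article, and it does not talk to any real IPAM product.

```python
import ipaddress


def plan_reservations(prefix, primary_ip, upgrade_ip, allocated):
    """Return IPAM-style reservation records, or raise if an address is taken.

    allocated is a collection of address strings already in use elsewhere.
    """
    records = [
        {"ip": primary_ip, "name": f"{prefix}-01", "state": "in-use",
         "note": "VCF Automation primary management IP"},
        {"ip": upgrade_ip, "name": f"{prefix}-upg", "state": "reserved",
         "note": "temporary upgrade IP; idle between upgrades, never reassign"},
    ]
    taken = {ipaddress.ip_address(a) for a in allocated}
    for rec in records:
        if ipaddress.ip_address(rec["ip"]) in taken:
            raise ValueError(f"{rec['ip']} is already allocated")
    return records
```

The second record is marked reserved with an explicit note, so whoever audits the subnet later sees why an address that answers no pings is still held.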

What stalls a deployment, and how to clear it

When a VCF Automation deployment goes sideways, it is almost always one of a small set of causes, and they are boring ones. Here is the map I use to go from symptom to fix without guessing.

| Symptom | Likely cause | Fix |
|---|---|---|
| Fails at network validation | DNS forward or reverse record missing or mismatched | Create A and PTR for both IPs, test nslookup both ways |
| Appliance up but UI unreachable | Services still starting, or time skew across the fleet | Wait for VCF Operations health green, confirm NTP is consistent |
| A later upgrade refuses to start | Temporary upgrade IP was reused or never reserved | Hold the second IP in IPAM permanently, never reassign it |
| Deploy blocked on capacity | Management domain has no headroom | Free or add capacity before retrying, do not force it |
| Node names look wrong everywhere | Node name prefix chosen carelessly at deploy time | Decide the prefix up front; renaming after the fact is not casual |

Notice the pattern: four of those five are settled before you ever open the wizard. DNS, IPs, capacity, and the prefix are all prerequisite-time decisions masquerading as deployment-time failures. Of those four, the one that bites latest is the temporary upgrade IP, because it sits idle and tempts someone to reclaim it for something else. Document it as reserved-and-in-use even though nothing pings it, or a future upgrade fails at the first gate and you spend the maintenance window hunting an address instead of upgrading.

Sizing and capacity, briefly

Because one VCF Automation instance serves the whole fleet, you size it for aggregate load, not for a single tenant. The inputs that drive the size are the number of tenants and projects you expect and the volume of requests and deployments those tenants will generate. Undersizing does not announce itself at deploy time. It shows up weeks later as sluggish provisioning, requests that queue behind each other, and a UI that feels heavy under load, and by then you have users who associate the platform with slowness.

My recommendation is to size for where you will be in a year, not where you launch. The first month always looks light because only the pilot tenant is on it. Plan for the estate you are actually building toward, confirm the management domain has the capacity to back that size, and leave headroom. Resizing later is possible, but it is a change with downtime and risk, and it is far cheaper to provision correctly once than to relitigate capacity after tenants depend on the platform being responsive.
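To make "size for a year out" concrete, a back-of-the-envelope projection is often enough. The sketch below is plain arithmetic in Python under assumptions I am making for illustration: linear tenant growth, a uniform per-tenant request rate, and a 25 percent headroom factor. None of these numbers come from official sizing guidance, and the result is only an input to that guidance, not a replacement for it.

```python
def projected_daily_requests(tenants_now, monthly_tenant_growth,
                             requests_per_tenant_per_day, months=12, headroom=0.25):
    """Project aggregate daily requests across the fleet after `months`.

    Assumes linear tenant growth and a uniform per-tenant request rate;
    every figure is a planning guess, not a product limit.
    Returns (projected tenant count, daily requests including headroom).
    """
    tenants_then = tenants_now + monthly_tenant_growth * months
    base = tenants_then * requests_per_tenant_per_day
    return tenants_then, round(base * (1 + headroom))
```

For a hypothetical launch with one pilot tenant, two new tenants a month, and 40 requests per tenant per day, `projected_daily_requests(1, 2, 40)` gives 25 tenants and 1250 requests a day, against 40 a day at launch. That gap is why the first month is a poor basis for the size you pick.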

What’s Next

Deploying VCF Automation is mostly a prerequisites exercise wearing a deployment costume. Get the management domain healthy, get VCF Operations and Fleet Management in place, reserve two IPs with full DNS, pick a node name prefix you can live with, and the appliance comes up cleanly. My recommendation: treat the validate step as mandatory and the provider setup as the real project. The deployment is an afternoon; the provider and tenant design you do next is what people actually experience. Do not let a green appliance fool you into thinking you have a platform. You have a foundation.

Next we look at licensing and editions, so you know what your fleet entitles you to before you scale tenants onto it. For the fleet context, see VCF 9 Multi-Instance Fleet Management. What tripped you up most on your deployment: DNS, IPs, or sizing? Tell me in the comments.

VCF Automation Series navigation:
Previous: Part 3, VM Apps vs All Apps.  Next: Part 5, licensing and editions (coming soon).  Up: VCF Automation Guide (pillar).

About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
