Dr. Pranay Jha

VCF 9 Certificate, Identity and Backup Failures and How to Fix Them (VCF 9 Series, Part 22)

Certificates, identity and backup all moved in VCF 9. Here are the failures that actually bite in production, with the symptom, the real cause and the fix for each.

VCF 9 Series · Part 22 of 36

TL;DR · Key Takeaways

  • In VCF 9, certificate and identity management moved out of the SDDC Manager UI into VCF Operations under Fleet Management. If you are still driving these from the old UI, your changes may not be where you think they are.
  • The single most common backup failure is a half-configured target: SDDC Manager and NSX use one SFTP setting, while Fleet Manager, VCF Identity Broker and VCF Automation use a completely separate one. Configure both or you have gaps you will only discover during a restore.
  • The replaced SDDC Manager certificate showing as VMCA after a Microsoft CA replacement is a known cosmetic display issue, not a failed replacement.
  • An embedded VCF Identity Broker ties fleet-wide SSO to the management vCenter. For production, run the appliance cluster instead.

Who this is for: VCF admins and architects running day-2 operations on a VCF 9.0 or 9.1 fleet.
Prerequisites: a deployed management domain, VCF Operations access with fleet admin rights, and an external SFTP target you can reach from the management network.

Certificates, identity, and backup are the three operational subsystems nobody thinks about until the day they go wrong, and in VCF 9 all three moved. The SDDC Manager UI is deprecated, certificate and identity workflows now live in VCF Operations, and backup configuration is split across two different screens that look nothing alike. The result is a predictable set of failures: replacements that look like they failed but did not, fleet components quietly running with no backups at all, and an identity layer with a single point of failure that only shows up during a vCenter outage. Here are the ones that actually come up in the field, with the symptom, the real cause, and the fix.

Figure: Certs, Identity and Backup in VCF 9 (where each function lives now, and the failure that bites)

  • Certificates: VCF Operations, Fleet Management > Certificates. Scopes: VCF Management and workload domains. Auto-renewal is new in 9.0. Consideration: a replaced cert still shows as VMCA. Cosmetic only. Validate the VCF adapter connection to confirm.
  • Identity: VCF Identity Broker (vIDB), fleet-wide SSO, embedded or appliance cluster mode. AD, Entra ID, Okta, Ping, any SAML 2.0 / OIDC. Consideration: embedded mode ties SSO to the management vCenter with no HA. Use the appliance cluster for production fleets.
  • Backup: two separate screens. SDDC Manager and NSX: Administration > SDDC Manager > Backups. Fleet Manager, vIDB, Automation: Fleet Management > Lifecycle > SFTP. Consideration: external SFTP is required to restore SDDC Manager. The default keeps backups on the appliance. Move them.

VCF 9.0 / 9.1 · SDDC Manager UI deprecated; certificate and identity workflows now run from VCF Operations.
Triage map: where certificates, identity and backup live in VCF 9, and the failure each one hides.

1. The replaced SDDC Manager certificate still shows as VMCA

Symptom. You run the replace-with-configured-CA workflow against SDDC Manager from Fleet Management > Certificates > VCF Instances, point it at a Microsoft CA, and it completes without error. But in the UI only the TLS certificate shows as a Microsoft CA certificate. The root and intermediate still show as VMCA, so it looks like the replacement only half worked.

Likely cause. This is a display behaviour, not a failure. For VCF Management components like SDDC Manager, only the newly generated TLS certificate changes its displayed type in the UI, and it carries the full chain. The original root and intermediate entries do not change their display type, which leads people to assume the replacement failed and re-run it, sometimes repeatedly.

Fix. Do nothing to the certificate. The replacement succeeded and the certificate is valid. To confirm, open Administration > Integrations, select the VMware Cloud Foundation adapter instance, choose Edit, and click Validate Connection. A successful validation is your proof. This behaviour is documented in Broadcom KB 433185, and it is worth knowing before you waste an evening re-issuing certificates that were fine.
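
If you want a second opinion alongside Validate Connection, you can look at the chain the endpoint actually presents on the wire. This is a minimal sketch and not part of the KB procedure. It assumes openssl is installed on a workstation that can reach SDDC Manager on port 443, and the host name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical host name; replace with your SDDC Manager FQDN.
SDDC_HOST="${SDDC_HOST:-sddc-manager.example.local}"

# Keep only the subject (s:) and issuer (i:) lines from
# `openssl s_client -showcerts` output read on stdin.
chain_summary() {
  grep -E '^[[:space:]]*([0-9]+[[:space:]]+s:|i:)'
}

# Fetch the chain the endpoint presents and print one subject/issuer pair
# per certificate. Assumes the service listens on 443.
show_presented_chain() {
  openssl s_client -connect "$1:443" -servername "$1" -showcerts </dev/null 2>/dev/null \
    | chain_summary
}

# Usage: show_presented_chain "$SDDC_HOST"
```

If the issuer of certificate 0 is your Microsoft CA, the endpoint is serving the new certificate regardless of what type the UI shows for the older root and intermediate entries.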

2. Certificate and identity changes do not appear, because you are in the wrong UI

Symptom. You make a certificate or identity change in the SDDC Manager interface and it does not show up where you expected, or you cannot find the workflow you used in VCF 5.x at all.

Likely cause. The SDDC Manager UI is deprecated in VCF 9.0. Certificate management moved to VCF Operations under Fleet Management > Certificates, and it now splits into two scopes: VCF Management components and workload domains. Identity moved too, fronted by the new VCF Identity Broker. The SDDC Manager UI is still present and many tasks still work from it, but some actions are not reflected immediately in VCF Operations and depend on sync schedules. That lag is what makes people think a change did not take.

Fix. Drive certificate and identity operations from VCF Operations, not the deprecated SDDC Manager UI. When you replace certificates, pick the right scope: VCF Management covers the platform components, while workload domain scope covers the vCenter and NSX of each VI domain. In practice, treat the SDDC Manager UI as read-only muscle memory and retrain the team on Fleet Management now, before an incident forces it.

3. Backups are still sitting on the SDDC Manager appliance

Symptom. A backup schedule exists and reports success, but when you go to restore SDDC Manager you have nothing usable, or the restore procedure refuses to proceed.

Likely cause. By default, SDDC Manager and NSX Manager backups are stored on the SDDC Manager appliance itself. An external SFTP server is a prerequisite for restoring SDDC Manager file-based backups. If the appliance is the thing that failed, a backup that only ever lived on that appliance is gone with it. This is the classic “we had backups” conversation that goes badly during an actual recovery.

Fix. Reconfigure the SFTP backup target for SDDC Manager and NSX Manager so backups land on an external server. Using external SFTP also decouples NSX backups from SDDC Manager, which is a resilience win on its own. Validate the target after configuring it, and confirm a real backup file appears on the SFTP server, not just a green status in the UI.

# VCF Operations > Administration > SDDC Manager > Backups
#   - Backup target: external SFTP (not the appliance default)
#   - Set schedule + retention, then take an on-demand backup
#   - Verify the encrypted backup file actually lands on the SFTP server

# Quick reachability / fingerprint check from the management network
ssh-keyscan -t rsa sftp.example.local | ssh-keygen -lf -   # print the host key fingerprint
sftp backupuser@sftp.example.local   # confirm write access to the target path
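
To check for a real file and not a green status, something like the following can run on the SFTP server itself. It is a sketch under assumptions: the backup directory path is hypothetical, the 24-hour window is an arbitrary choice, and `find -mmin` is a GNU/BSD extension, not POSIX.

```shell
#!/usr/bin/env bash
# Print non-empty files under DIR modified within the last HOURS hours.
recent_backups() {
  local dir="$1" hours="${2:-24}"
  find "$dir" -type f -size +0 -mmin "-$((hours * 60))" -print
}

# Succeed only if at least one non-empty, recent file exists.
has_recent_backup() {
  [ -n "$(recent_backups "$1" "${2:-24}")" ]
}

# Usage (hypothetical path):
# has_recent_backup /srv/sftp/vcf-backups 24 || echo "no fresh backup found" >&2
```

Wiring the last line into cron or your monitoring agent turns "a file actually landed" into something that alerts you.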

4. Fleet components have no backups, and nobody noticed

Symptom. You configured an SFTP target and assumed everything was covered, but VCF Operations fleet management, VCF Automation, or the VCF Identity Broker turns out to have no backup schedule when you need one.

Likely cause. Backup in VCF 9 is configured in two different places that do not share state. SDDC Manager and NSX are set under Administration > SDDC Manager > Backups. The fleet-level components, namely Fleet Manager, VCF Identity Broker and VCF Automation, are configured separately under Fleet Management > Lifecycle. Setting one does not set the other. This split is the most common operational gap I see on fresh VCF 9 builds, because the first SFTP screen feels like the finish line.

Fix. Configure an SFTP target in VCF Operations for the fleet components, then set a backup schedule per component. Note that the backup methods differ: SDDC Manager, NSX Manager, VCF Identity Broker and VCF Automation are file-based, while VCF Operations and its fleet management, logs and networks instances are image-based and depend on a backup tool compatible with vSphere Storage APIs – Data Protection (VADP). Image-based components are easy to forget precisely because they are not in the file-based SFTP flow at all.

Figure: Backup lives in two places (setting one screen does not cover the other)

  • Administration > SDDC Manager: SDDC Manager, NSX Manager. File-based, external SFTP required.
  • Fleet Management > Lifecycle: Fleet Manager, VCF Identity Broker, VCF Automation. File-based, per-component schedule.
  • Image-based (separate flow, needs a VADP tool): VCF Operations, Fleet Mgmt, Logs, Networks.

Configure both screens, then verify a file actually lands on the SFTP target.
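
The two SFTP screens can be turned into a small checklist, so that "I configured backup" always means all five file-based components. This is an illustrative sketch only: the component-to-screen mapping is copied from the text above, the function names are hypothetical, and the image-based components are left out on purpose because they do not use the SFTP flow.

```shell
#!/usr/bin/env bash
# Component|screen pairs for the file-based SFTP flow, as described above.
backup_screens() {
  cat <<'EOF'
SDDC Manager|Administration > SDDC Manager > Backups
NSX Manager|Administration > SDDC Manager > Backups
Fleet Manager|Fleet Management > Lifecycle
VCF Identity Broker|Fleet Management > Lifecycle
VCF Automation|Fleet Management > Lifecycle
EOF
}

# Arguments: names of components whose backup file you have verified.
# Prints every component NOT in that list, with the screen to fix it on.
unverified_backups() {
  backup_screens | while IFS='|' read -r name screen; do
    found=0
    for v in "$@"; do
      if [ "$v" = "$name" ]; then found=1; fi
    done
    if [ "$found" -eq 0 ]; then printf '%s -> %s\n' "$name" "$screen"; fi
  done
}

# Usage: unverified_backups "SDDC Manager" "NSX Manager"
```

Running the usage line after finishing only the first screen prints the three fleet components that are still uncovered.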

5. Losing the management vCenter takes down SSO for the whole fleet

Symptom. The management vCenter has a problem and suddenly nobody can log in to VCF Operations, vSphere Client, or NSX Manager across the fleet, turning a single-component issue into a platform-wide access outage.

Likely cause. The VCF Identity Broker can run in two modes. Embedded mode runs inside the management domain vCenter, which is convenient and fine for a lab, but it means your fleet-wide single sign-on now has the management vCenter as a hard dependency with no high availability. Appliance mode runs a standalone Identity Broker cluster built for HA and fleet scale.

Fix. For any production fleet, deploy the VCF Identity Broker as an appliance cluster, not embedded.

My take

Embedded mode is a lab convenience that quietly becomes a single point of failure in production, and retrofitting HA after you have integrated a corporate identity provider is more disruptive than getting it right at deployment. Back the broker up too, because it is file-based and it is the front door to everything else.

Figure: Identity Broker: embedded vs appliance (embedded is a lab convenience; production needs the appliance cluster)

  • Embedded (lab): runs inside the management vCenter. Fleet SSO depends on it, no HA.
  • Appliance cluster (production): standalone HA cluster, fleet scale. Back it up; it is file-based.

In embedded mode, losing the management vCenter takes fleet-wide SSO down with it.
Deploy the appliance cluster from the start; retrofitting HA after IdP integration is painful.

6. The SDDC Manager restore fails because it was attempted in place

Symptom. A restore of SDDC Manager errors out or never gets going, often after an attempt to restore onto the existing appliance.

Likely cause. SDDC Manager file-based restore is not an in-place operation. You must download and decrypt the encrypted backup file from the SFTP server, deploy a fresh SDDC Manager appliance, and then restore the backup to it. On top of that, the broader recovery has ordering rules: components must be restored in the right sequence to keep dependencies and system integrity intact.

Fix. Follow the documented restore flow. Retrieve and decrypt the latest backup from external SFTP, stand up a clean SDDC Manager appliance, restore the file-based backup, and sequence the other components rather than restoring everything at once. Practice this at least once in a non-production environment so the first time you decrypt a backup is not during a real outage.
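
For the rehearsal, a small wrapper around the retrieve-and-decrypt step keeps you from extracting an empty file or unpacking over an earlier attempt. Treat this as a sketch under loud assumptions: the cipher and digest in `decrypt_backup` are modelled on the decrypt command documented for earlier VCF releases and must be confirmed against the VCF 9 restore documentation before use. The function names and paths are hypothetical.

```shell
#!/usr/bin/env bash
# Refuse to proceed unless the backup file is non-empty and the work
# directory is empty (or does not exist yet). Creates the work directory.
restore_preflight() {
  local backup="$1" workdir="$2"
  if [ ! -s "$backup" ]; then
    echo "backup missing or empty: $backup" >&2
    return 1
  fi
  if [ -d "$workdir" ] && [ -n "$(ls -A "$workdir")" ]; then
    echo "work directory not empty: $workdir" >&2
    return 1
  fi
  mkdir -p "$workdir"
}

# ASSUMPTION: AES-256-CBC with a SHA-256 digest, as in the decrypt step
# documented for earlier VCF releases. Verify for your release.
# openssl prompts for the backup encryption passphrase.
decrypt_backup() {
  restore_preflight "$1" "$2" || return 1
  openssl enc -d -aes-256-cbc -md sha256 -in "$1" | tar -xz -C "$2"
}

# Usage (hypothetical paths):
# decrypt_backup ./vcf-backup-file.tar.gz ./restore-work
```

Only the preflight is safe to run blind. The decrypt step belongs in a non-production rehearsal first, which is the point of practising.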


Quick reference: symptom, cause, fix

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Replaced cert shows as VMCA | Cosmetic display for VCF Management; only TLS type updates | No action; Validate Connection on the VCF adapter (KB 433185) |
| Cert or identity change not appearing | Using deprecated SDDC Manager UI; sync lag | Use VCF Operations > Fleet Management; pick correct scope |
| Cannot restore SDDC Manager | Backups stored on the appliance by default | Reconfigure SFTP for SDDC Manager and NSX |
| Fleet components have no backups | Fleet backup is a separate screen from SDDC Manager/NSX | Configure schedule per fleet component under Fleet Mgmt |
| Fleet-wide login outage | Embedded Identity Broker tied to mgmt vCenter, no HA | Deploy the appliance cluster for production |
| SDDC Manager restore errors out | Restore attempted in place; backup not decrypted | Fresh appliance, decrypt SFTP backup, restore in sequence |
| Scheduled backups silently failing | SFTP storage exhausted against retention policy | Monitor SFTP capacity vs retention; alert on it |
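
The last row is simple arithmetic: free space on the target divided by the size of one backup tells you how many more backups fit before scheduled runs start failing. Here is a rough sketch to run on the SFTP server, assuming a hypothetical backup path and a backup size you have measured yourself. Neither value comes from VCF.

```shell
#!/usr/bin/env bash
# How many more backups of SIZE_KB fit into FREE_KB (integer division).
backups_that_fit() {
  echo $(( $1 / $2 ))
}

# Available space in KB on the filesystem holding DIR (POSIX df format).
free_kb() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Succeed only if at least NEEDED more backups of SIZE_KB fit under DIR.
capacity_ok() {
  local dir="$1" size_kb="$2" needed="$3"
  [ "$(backups_that_fit "$(free_kb "$dir")" "$size_kb")" -ge "$needed" ]
}

# Usage (hypothetical path; 2 GiB per backup; room for 7 more copies):
# capacity_ok /srv/sftp/vcf-backups 2097152 7 || echo "SFTP target is running out of room" >&2
```

Set NEEDED to the number of copies your retention policy will add before the oldest ones are pruned, so the alert fires before the target fills up.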

For the surrounding operational context, these are the related parts of this series worth reading alongside this one: VCF Operations monitoring and observability for how to actually see a failed backup before it matters, VCF 9 fleet lifecycle management for where these settings sit in the broader fleet model, and management domain bring-up for the point at which certificates and identity are first configured.

Disclaimer: Certificate replacement, identity reconfiguration and restore operations are production-impacting changes. Validate the target BOM and interoperability, take a fresh backup before any certificate or identity change, run prechecks, and test the full procedure in a non-production environment first. Always schedule a backup before a certificate update or an upgrade of any management component.

What I’d Do

If I were handed a fresh VCF 9 fleet tomorrow, the first day-2 task would be backup, configured in both places, validated by actually pulling a file off the SFTP target, not by trusting a green tick. Identity Broker goes in as an appliance cluster from the start. Certificates get replaced once, properly, with the team trained to confirm via Validate Connection instead of re-running workflows. None of this is hard. It just lives in different places than it used to, and the failures are all failures of assumption rather than failures of the platform. What is the first thing you check when a “successful” backup turns out to be missing?

About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
