Dr. Pranay Jha

The VCF 9 Automation Toolchain: REST API, SDK, PowerCLI, Terraform and Ansible (Automating VCF Series, Part 2)

VCF 9 has one blessed contract and five good ways to call it. Here is what PowerCLI, Terraform, Ansible, the Unified SDK and curl are each actually for, and how to pick.

Automating VCF Series · Part 2 of 30

TL;DR · Key Takeaways

  • VCF 9 does not have one blessed tool. It has one blessed contract, the Unified VCF REST API, and five good clients that all compile down to it.
  • VCF.PowerCLI 9.1.x (module VMware.Sdk.Vcf.SddcManager) is your imperative, interactive workhorse. The vmware/vcf Terraform provider (0.18.x) owns declarative desired state.
  • Ansible (vmware.vmware / community.vmware, or raw ansible.builtin.uri) layers VCF config next to OS and app config. The Unified SDK (pip install vcf-sdk) is for embedding VCF in real applications.
  • curl is not a toy here. It is the fastest way to see the raw request and response, and a fine CI smoke check.
  • The decision is not capability, it is fit: where the code lives, whether you need state, and whether the task is desired-state-shaped or a one-off action.

Who this is for: VMware admins, platform and DevOps engineers, and architects deciding how to drive VCF 9 programmatically.

Prerequisites: a reachable VCF 9.0 or 9.1 instance, a service account, and the basics from Part 1 on the API-first shift.

Two engineers, one VCF 9 instance, the same task: stand up a workload domain. One opens PowerShell and starts typing. The other writes a few blocks of HCL and opens a pull request. Both are correct, and that is the thing people miss about automating this platform. VCF 9 does not bless one tool over the others. It blesses one contract, then hands you five reasonable ways to call it. Most of the skill is knowing which one to pick for the job in front of you.

The one thing under all of it

Every tool in this series is a client of the same Unified VCF REST API. That is not a slogan, it is mechanical. The VCF.PowerCLI cmdlets in the VMware.Sdk.Vcf.SddcManager module are auto-generated from the API specification. The Terraform provider maps each resource to the same endpoints. An Ansible uri task calls those endpoints directly. The SDK wraps them in typed objects. curl is the endpoint, naked. Once you accept that, tool choice stops being a loyalty test and becomes an engineering decision.

Why this matters in practice

When the API is the thing you actually understand, every tool becomes legible. A cmdlet you have never run maps to an endpoint you can reason about. A Terraform error about a 400 on POST /v1/domains tells you exactly what to go check in the API reference. VCF.PowerCLI even gives you a discovery cmdlet to walk from API path to cmdlet: Get-VcfSddcManagerOperation -Path "*/v1/domains" -Method Get. The first thing I check on any new VCF automation task is which endpoint backs it, then I pick the tool.

[Figure: Five clients, one contract. VCF.PowerCLI, Terraform, Ansible, the Unified SDK and curl all compile down to the same Unified VCF REST API call (POST /v1/tokens, GET /v1/domains; one auth, one async model), which fronts SDDC Manager, vCenter, NSX and vSAN.]
Pick a tool for ergonomics and state, not capability. They all end at the same endpoint.

The five tools, by job

VCF.PowerCLI 9.1.x

Imperative, interactive, and the closest to how admins already think. You connect with Connect-VcfSddcManagerServer and work with objects: Get-SddcDomain, Get-SddcCluster, Get-SddcHost. Reach for it for day-2 operations and ad-hoc scripts. What I would not do is treat a pile of PowerCLI scripts as your record of desired state, because nothing reconciles them against reality.

Terraform: vmware/vcf 0.18.x

Declarative and stateful. Resources like vcf_domain describe what should exist, and the provider reconciles. Note the namespace: it is vmware/vcf, not a hashicorp one, and it is distinct from vmware/vcfa, the separate provider for VCF Automation catalog and IaaS work. Reach for it when infrastructure should live in Git with a plan and apply gate. Skip it for a single imperative action like commissioning one host, where the state file is just overhead.

Ansible

Procedural but idempotent, and the right call when VCF configuration sits next to OS and application config in the same playbooks. The official vmware.vmware collection and the older community.vmware (which depends on vcf-sdk >= 9.0) cover vSphere-level work, and for anything the modules do not reach yet, ansible.builtin.uri hits the REST API directly. The mistake teams make is wrapping non-idempotent REST calls in Ansible and assuming the playbook is now safe to re-run. It is not, unless you made it so.
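To make "unless you made it so" concrete, here is a minimal sketch of the check-then-act pattern, written in Python rather than as a playbook. In Ansible the same shape would be a uri GET followed by a uri POST guarded by a when condition. The function name ensure_domain and the in-memory stand-ins for the list and create calls are hypothetical, not part of any VCF tool.

```python
from typing import Callable, Iterable


def ensure_domain(
    name: str,
    list_domain_names: Callable[[], Iterable[str]],
    create_domain: Callable[[str], None],
) -> bool:
    """Create the domain only if it is absent. Returns True if a change was made."""
    if name in set(list_domain_names()):
        return False  # already present, so a re-run is a no-op
    create_domain(name)
    return True


# Hypothetical in-memory stand-ins for the list call and the create call.
existing = ["mgmt-domain"]
first_run = ensure_domain("wld-prod-01", lambda: existing, existing.append)
second_run = ensure_domain("wld-prod-01", lambda: existing, existing.append)
print(first_run, second_run, existing)
# -> True False ['mgmt-domain', 'wld-prod-01']
```

The second run changes nothing, which is the property a bare POST wrapped in a task does not give you for free.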

Unified VCF SDK (Python / Java)

Install with pip install vcf-sdk on Python 3.10 through 3.14. This is the tool when VCF is one part of a larger application: a provisioning service, an internal portal, a custom reconciliation loop. You get typed objects and real error handling instead of parsing JSON by hand. For a five-line task that a PowerCLI one-liner already covers, the SDK is too much ceremony.

curl

Underrated. When an API is misbehaving, curl shows you the unvarnished request and response with no abstraction in the way. It is also a fine CI smoke test: POST credentials to /v1/tokens, confirm a 200, and you know auth and reachability are healthy before the real pipeline runs.
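If your CI runner prefers a script over a shell one-liner, the same smoke check can be sketched with nothing but the Python standard library. This is an illustration, not an official client: build_token_request and smoke_check are hypothetical names, and it assumes the token response carries an accessToken field, as in the curl example later in this post.

```python
import json
import urllib.error
import urllib.request


def build_token_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build the POST /v1/tokens request without sending it."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/tokens",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def smoke_check(request, opener=urllib.request.urlopen, timeout=10) -> bool:
    """Return True only when the endpoint answers 200 with a non-empty accessToken."""
    try:
        with opener(request, timeout=timeout) as response:
            payload = json.loads(response.read().decode("utf-8"))
            return (
                response.status == 200
                and isinstance(payload, dict)
                and bool(payload.get("accessToken"))
            )
    except (urllib.error.URLError, ValueError, OSError):
        # Unreachable host, HTTP error status, or a body that is not JSON.
        return False
```

In a pipeline you would call smoke_check(build_token_request(...)) and exit non-zero on False. Unlike curl -k, this sketch verifies TLS certificates by default, so a lab with self-signed certificates would need its CA trusted on the runner.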

Tool                  Model         State       Best fit
----                  -----         -----       --------
VCF.PowerCLI          Imperative    None        Day-2 ops, ad-hoc scripting
Terraform vmware/vcf  Declarative   State file  Infra desired state in Git
Ansible               Procedural    Inventory   Config layered with OS / apps
Unified SDK           Programmatic  You own it  Embedding VCF in an app
curl                  Raw           None        Debugging, CI smoke checks

[Figure: Which tool, when. Start from the shape of the task (desired state, action, or check), not the tool you like. Lives in Git as state: Terraform vmware/vcf. One-off action or day-2: VCF.PowerCLI. Quick check or debug: curl. Next to OS or app config: Ansible. Inside an application: Unified SDK.]
The same instinct every time: classify the task, then the tool falls out.
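The classification can be written down as a few lines of Python, which is a handy way to make a team's rule explicit. Treat this as a sketch: the Task fields are hypothetical names for the questions in the diagram, and the order in which the questions are asked is my assumption, not something the platform dictates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of the shape of an automation task."""
    desired_state_in_git: bool = False
    inside_application: bool = False
    next_to_os_or_app_config: bool = False
    quick_check_or_debug: bool = False


def pick_tool(task: Task) -> str:
    """Classify the task first; the tool falls out of the answers."""
    if task.desired_state_in_git:
        return "Terraform vmware/vcf"
    if task.inside_application:
        return "Unified SDK"
    if task.next_to_os_or_app_config:
        return "Ansible"
    if task.quick_check_or_debug:
        return "curl"
    return "VCF.PowerCLI"  # one-off actions and day-2 work


print(pick_tool(Task(desired_state_in_git=True)))  # -> Terraform vmware/vcf
print(pick_tool(Task(quick_check_or_debug=True)))  # -> curl
print(pick_tool(Task()))                           # -> VCF.PowerCLI
```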

Same task, three tools

Listing workload domains is the cheapest task that touches auth, so it is a clean way to feel the ergonomics. Here it is three ways. Watch how the auth handling and output differ while the endpoint stays identical.

# curl: authenticate, then list domains
TOKEN=$(curl -sk -X POST https://sddc-manager.lab.local/v1/tokens \
  -H 'Content-Type: application/json' \
  -d '{"username":"svc-automation@vsphere.local","password":"********"}' \
  | jq -r '.accessToken')

curl -sk https://sddc-manager.lab.local/v1/domains \
  -H "Authorization: Bearer $TOKEN" | jq -r '.elements[].name'
# -> mgmt-domain
#    wld-prod-01
# Failure mode: an empty or "null" TOKEN means the auth call failed; check creds before the GET.

# VCF.PowerCLI 9.1.x: auth is handled by the connect cmdlet
Connect-VcfSddcManagerServer -Server sddc-manager.lab.local `
  -User svc-automation@vsphere.local -Password '********'

Get-SddcDomain | Select-Object name, status
# name         status
# ----         ------
# mgmt-domain  ACTIVE
# wld-prod-01  ACTIVE

# Ansible: raw REST via ansible.builtin.uri
- name: List VCF workload domains
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Get an access token
      ansible.builtin.uri:
        url: https://sddc-manager.lab.local/v1/tokens
        method: POST
        body_format: json
        body:
          username: svc-automation@vsphere.local
          password: "{{ vault_sddc_password }}"
        validate_certs: false
        status_code: 200
      register: token

    - name: List domains
      ansible.builtin.uri:
        url: https://sddc-manager.lab.local/v1/domains
        method: GET
        headers:
          Authorization: "Bearer {{ token.json.accessToken }}"
        validate_certs: false
      register: domains

Same endpoint, three feels. PowerCLI hides the token entirely. curl makes it explicit, which is exactly what you want when debugging. Ansible turns it into structured, re-runnable tasks but leaves you to manage the token in register variables. None is better in the abstract. They are better for different jobs.

In practice: for a brand-new VCF estate I prototype against curl first, because the raw response teaches me the object shapes faster than any wrapper. Then I rebuild the keepers in PowerCLI or Terraform once I know what the API actually returns.

How they fit together in a real pipeline

You do not pick one tool for the whole estate. You assign each one the work it is shaped for. Terraform owns the infrastructure that has a desired state: domains, clusters, network pools. PowerCLI handles day-2 actions that are not desired-state-shaped, like rotating a certificate or commissioning a single replacement host. The SDK lives inside whatever portal or service your developers actually use. Ansible glues VCF config to the OS and app layers. curl runs as a pre-flight check so the pipeline fails fast on a dead token instead of halfway through an apply.
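The fail-fast ordering is the part worth getting right, so here is a minimal sketch of it in Python. The stage names mirror the pipeline described above, and the lambdas are hypothetical stand-ins for whatever actually invokes curl, Terraform or PowerCLI in your CI system.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that reports failure."""
    completed: List[str] = []
    for name, action in stages:
        if not action():
            print(f"stopped at: {name}")
            break
        completed.append(name)
    return completed


# Hypothetical stand-ins: each lambda represents a real CI step.
healthy = run_pipeline([
    ("curl smoke check", lambda: True),
    ("terraform apply", lambda: True),
    ("powercli day-2", lambda: True),
])
dead_token = run_pipeline([
    ("curl smoke check", lambda: False),
    ("terraform apply", lambda: True),
])
print(healthy)     # -> ['curl smoke check', 'terraform apply', 'powercli day-2']
print(dead_token)  # -> []
```

With a dead token the apply stage never starts, which is the whole point of putting the cheap check first.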

[Figure: Tools in one pipeline, each doing the job it is shaped for. Git commit, then a CI curl smoke check (auth and reachability), then Terraform apply (desired state) against the VCF REST API (SDDC Manager and friends), with PowerCLI for day-2 work (certs, host swaps) and the SDK for app integration (portals, services).]
One pipeline, several tools, each assigned the work it does best.

Worked example

Take a modest 3-domain estate. Codified in Terraform that is roughly 40 to 60 resources across domains, clusters, and pools. A terraform plan against that completes in well under a minute and shows you drift before you touch production. The same drift check by hand, clicking through three vCenters and SDDC Manager, is an afternoon and still misses things. Codify the state, script the exceptions. The math favours it almost immediately.
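To show what a drift check is actually comparing, here is a toy version in Python. It is not how Terraform computes a plan, only the idea behind one: diff what you declared against what exists. The domain names wld-prod-02 and wld-test-01 and the clusters attribute are made up for the illustration.

```python
from typing import Any, Dict, List


def detect_drift(
    desired: Dict[str, Dict[str, Any]],
    actual: Dict[str, Dict[str, Any]],
) -> Dict[str, List[str]]:
    """Compare declared resources against observed ones, keyed by name."""
    return {
        "missing": sorted(desired.keys() - actual.keys()),
        "unmanaged": sorted(actual.keys() - desired.keys()),
        "changed": sorted(
            name
            for name in desired.keys() & actual.keys()
            if desired[name] != actual[name]
        ),
    }


desired = {
    "wld-prod-01": {"clusters": 2},
    "wld-prod-02": {"clusters": 1},
}
actual = {
    "wld-prod-01": {"clusters": 3},  # someone added a cluster by hand
    "wld-test-01": {"clusters": 1},  # exists, but nobody codified it
}
report = detect_drift(desired, actual)
print(report)
# -> {'missing': ['wld-prod-02'], 'unmanaged': ['wld-test-01'], 'changed': ['wld-prod-01']}
```

A human clicking through consoles has to hold both dictionaries in their head. A tool does the set arithmetic in milliseconds and does not get tired on the third vCenter.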

Disclaimer: the apply and day-2 examples here mutate platform state. Test against a lab or non-production instance first, use a scoped non-production service account, back up SDDC Manager before lifecycle work, and run terraform plan and read it before every apply.

Where teams get the toolchain wrong

Having five tools available is not the same as using five tools well. The failures I see in the field are rarely about a tool being weak. They are about two tools fighting over the same ground, or a script trusting something the API never promised.

Two tools writing the same resource

The classic break is Terraform and PowerCLI both mutating the same workload domain. You apply your HCL, then someone fixes something live with a quick Get and Set in PowerCLI, and the next terraform plan reports drift it did not cause and politely offers to revert the fix. Pick one owner per resource. If Terraform owns the domain, day-2 changes to that domain go through Terraform too. If you must touch it live, accept that the change is temporary and will be reconciled away on the next apply. The operational cost of skipping this rule is a slow erosion of trust in the plan, which is the one thing that made Terraform worth adopting.
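One cheap way to enforce "one owner per resource" is to keep the ownership claims in a file and lint them. The sketch below assumes a hypothetical list of (tool, resource) pairs; the resource identifiers are invented for the example and the check itself is plain Python, not a feature of Terraform or PowerCLI.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_contested(claims: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return every resource claimed by more than one tool, with the tools sorted."""
    owners = defaultdict(set)
    for tool, resource in claims:
        owners[resource].add(tool)
    return {
        resource: sorted(tools)
        for resource, tools in owners.items()
        if len(tools) > 1
    }


# Hypothetical ownership claims gathered from a team's repos and scripts.
claims = [
    ("terraform", "domain/wld-prod-01"),
    ("powercli", "domain/wld-prod-01"),
    ("terraform", "network-pool/np-01"),
    ("powercli", "certificate/sddc-manager"),
]
print(find_contested(claims))
# -> {'domain/wld-prod-01': ['powercli', 'terraform']}
```

Run as a CI check, a non-empty result is a conversation to have before the next apply, not after it.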

Scraping responses instead of reading the contract

The quieter failure is treating a raw curl response as a stable contract. Field names and nesting can shift between 9.0 and 9.1, and a script that greps a value out of unstructured JSON breaks silently on upgrade, usually at the worst time. Read the documented schema, pin to the fields the API actually promises, and let the SDK or PowerCLI deserialize for you where you can. A 400 you can read beats a wrong value you cannot see.
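If you do parse raw responses yourself, make the parser fail loudly when the shape changes. A minimal sketch in Python, assuming the elements and name fields used in the curl example above; SchemaError and domain_names are hypothetical names, and the displayName rename is invented purely to trigger the failure path.

```python
from typing import Any, List


class SchemaError(ValueError):
    """Raised when a response does not match the fields we rely on."""


def domain_names(payload: Any) -> List[str]:
    """Extract domain names, refusing to guess if the expected fields are absent."""
    if not isinstance(payload, dict) or not isinstance(payload.get("elements"), list):
        raise SchemaError("expected an object with an 'elements' list")
    names = []
    for index, element in enumerate(payload["elements"]):
        name = element.get("name") if isinstance(element, dict) else None
        if not isinstance(name, str):
            raise SchemaError(f"elements[{index}] has no string 'name'")
        names.append(name)
    return names


good = {"elements": [{"name": "mgmt-domain"}, {"name": "wld-prod-01"}]}
print(domain_names(good))
# -> ['mgmt-domain', 'wld-prod-01']

renamed = {"elements": [{"displayName": "mgmt-domain"}]}  # hypothetical field rename
try:
    domain_names(renamed)
except SchemaError as error:
    print(f"schema drift: {error}")
# -> schema drift: elements[0] has no string 'name'
```

The grep-style alternative would have returned an empty list here and carried on, which is exactly the wrong value you cannot see.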

My take: teams adopt tools faster than they adopt ownership rules. Decide which tool owns which resource before you write the second script, not after the first drift incident at 2am.

What I’d Do

Do not adopt all five tools at once. Pick two and get good. My recommendation for most teams: Terraform with the vmware/vcf provider for anything that represents desired state, and VCF.PowerCLI for the operational work that does not fit a state file. Add the Unified SDK only when you are genuinely embedding VCF inside an application, and keep curl around for debugging and CI checks. Reach for Ansible instead of PowerCLI when your VCF work is inseparable from OS and app configuration. Two tools you know cold beat five you half-remember, every time a pipeline runs at 2am.

For the wider platform picture, the VCF 9 API-first runbook is a useful companion, and VCF Automation in VCF 9 Explained covers the self-service side we will automate later in this series. Which two tools are you standardising on? Tell me in the comments.


About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
