Dr. Pranay Jha


Infrastructure as Code in VCF Automation: The vmware/vcfa Terraform Provider (VCF Automation 9 Series, Part 21)

VCF Automation 9 has no single Terraform provider. Here is how the vmware/vcfa, vra and kubernetes providers split the work, the real HCL for orgs, quota and namespaces, and where the provider still hands you back to the UI.

VCF Automation 9 Series · Part 21 of 41
TL;DR · Key Takeaways
  • There is no single Terraform provider for VCF Automation 9. The work is split across three: vmware/vcfa for provider and org-level constructs, vmware/vra for catalog and blueprints, and hashicorp/kubernetes for All Apps projects, VPCs and namespaces.
  • The vmware/vcfa provider (track 1.1.x against VCF 9.0.1 / 9.1) builds the org, region quota, org networking and content libraries cleanly. It does not yet model the full All Apps consumption layer.
  • Plan on two vcfa provider blocks: a System context for provider-side objects and a tenant-org context for consumption.
  • Known gotcha: vcfa_org_regional_networking fails to create against an Active/Active Tier-0, but imports cleanly after the UI creates it.
  • Treat Terraform as the foundation layer, not the whole landing zone. Some steps still belong to the UI, the API, or the Kubernetes path.
Who this is for: Platform and cloud admins who run VCF Automation and want the provider and tenant setup versioned in Git instead of clicked in the portal.
Prerequisites: A working VCF Automation 9 instance (deployed via Fleet Management), an All Apps organization target, VCF networking with a VPC and a vSphere Supervisor already configured, Terraform 1.6+, and a refresh token from an admin account.

The first thing teams ask me when they want VCF Automation under source control is which Terraform provider manages it. The honest answer breaks the question. There isn’t one. VCF Automation (the product formerly named Aria Automation, and vRealize Automation before that) is automated by three providers that each own a slice of the model, and the fastest way to waste an afternoon is to assume vmware/vcfa does everything. It does not. Knowing the split up front is the difference between a clean plan and an hour of guessing why a resource type does not exist.

Three providers, one platform

VCF 9 kept the two organization types from earlier Parts of this series: the VM Apps organization, which behaves like Aria Automation 8.x, and the new All Apps organization, which is Kubernetes-API based and requires an external Orchestrator. That split shows up directly in the tooling. The catalog and blueprint surface is still served by the older vmware/vra provider. The new provider-side and org-level constructs live in vmware/vcfa. And because an All Apps org exposes a Kubernetes interface, projects, VPCs and Supervisor namespaces are managed through the standard hashicorp/kubernetes provider, applying CRDs against the org endpoint.

| Provider | Source | What it manages | Context |
|---|---|---|---|
| VCF Automation | vmware/vcfa | Orgs, region quota, org and regional networking, content libraries, Supervisor namespaces | Provider (System) + tenant |
| Aria Automation | vmware/vra | Catalog items, blueprints, VM Apps org consumption | Tenant |
| Kubernetes | hashicorp/kubernetes | All Apps projects, VPCs, namespaces, VM and K8s manifests | Tenant |
[Figure: Who owns what across the three providers. One VCF Automation instance, three Terraform providers handing off credentials. vmware/vcfa (provider + tenant): organizations, region quota, org networking, content libraries, namespaces. vmware/vra (tenant scope): blueprints, catalog items, VM Apps consumption. hashicorp/kubernetes (All Apps only): projects, VPCs, namespaces (CRD), VM manifests.]
The three providers split by responsibility. The vcfa provider can pass its token to the others so a single apply spans all three.
In practice: I tell clients to start with one provider per problem. Use vmware/vcfa alone for the provider and org foundation, prove that it is repeatable, and only add the vra and kubernetes providers once the team understands the credential handoff. Mixing all three on day one all but guarantees a confusing first apply.

Wiring the provider and authentication

VCF Automation 9 lets you mint a refresh token straight from the UI. Open account preferences, go to the API token tab, create a token, and copy it once because the portal will not show it again. That token is what the provider uses with auth_type = api_token. For lab work you can also use integrated auth with a username and password, but a token scoped to a service identity is what I put in a pipeline.

The other decision that bites people is org context. Provider-side objects (an organization, its region quota, its networking) must be created from the System organization with a provider admin identity. Tenant consumption, like retrieving a kubeconfig for a namespace, runs against the tenant org. So a real configuration carries two vcfa provider blocks, one default and one aliased.

# versions.tf
terraform {
  required_providers {
    vcfa = {
      source  = "vmware/vcfa"
      version = "~> 1.1.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.38"
    }
  }
}

# provider.tf -- provider-side context
provider "vcfa" {
  url                  = var.vcfa_url
  org                  = "System"
  auth_type            = "api_token"
  api_token            = var.vcfa_refresh_token
  allow_unverified_ssl = false
}

# tenant context, used later for the namespace kubeconfig
provider "vcfa" {
  alias                = "tenant_blue"
  url                  = var.vcfa_url
  org                  = var.org_name
  auth_type            = "api_token"
  api_token            = var.tenant_refresh_token
  allow_unverified_ssl = false
}
[Figure: Two provider contexts, one instance. Pick the wrong org and the resource will not exist in scope. (1) System context: org, quota, networking, content libraries; provider admin token. (2) Tenant context: kubeconfig, namespaces, consumption; org admin token.]
Provider-side objects are created from System. Tenant consumption uses an aliased provider scoped to the org.
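
The provider blocks above reference four input variables that the snippet never declares. A minimal sketch of a matching variables.tf follows, assuming both refresh tokens are injected from the environment or a secret store rather than committed. The descriptions and the example URL are illustrative, not taken from a real environment.

```hcl
# variables.tf -- inputs consumed by the two vcfa provider blocks
variable "vcfa_url" {
  type        = string
  description = "Base URL of the VCF Automation instance, e.g. https://vcfa.example.com (hypothetical)"
}

variable "org_name" {
  type        = string
  description = "Tenant organization name used by the aliased provider"
}

variable "vcfa_refresh_token" {
  type        = string
  sensitive   = true # keeps the value out of plan and apply output
  description = "API token minted by a provider admin in the System org"
}

variable "tenant_refresh_token" {
  type        = string
  sensitive   = true
  description = "API token minted by an org admin in the tenant org"
}
```

Exporting TF_VAR_vcfa_refresh_token and TF_VAR_tenant_refresh_token in the pipeline keeps the tokens out of .tfvars files. Note that sensitive = true only redacts CLI output; the values can still end up in state.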

The foundation sequence

An All Apps landing zone is not one resource. It is a sequence of objects that turn an empty org into something a tenant can consume. The order matters because of dependencies: networking needs the org, the namespace needs quota and networking, the content library needs a storage class that the quota exposed. Build it in the wrong order and Terraform will tell you, but it is faster to know the shape first.

[Figure: From empty org to consumable landing zone. The dependency order the vcfa provider expects:]
  1. vcfa_org: tenant boundary
  2. vcfa_org_oidc: identity
  3. vcfa_org_region_quota: capacity envelope
  4. vcfa_org_networking: org network base
  5. vcfa_org_regional_*: connect to provider gateway
  6. vcfa_content_library: content source
  7. vcfa_supervisor_namespace: consumer scope
The tenant-facing foundation order. Provider-side objects like regions and the provider gateway already exist before this runs.
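
Terraform derives this order from references between resources, so in most cases the sequence does not need to be spelled out with depends_on. As a hedged sketch, the org networking step could look like the block below. The org_id and log_name arguments are what I understand the registry docs to describe for vcfa_org_networking, and the log_name value is a made-up placeholder, so check both against the provider version you pin.

```hcl
# Step 4 in the sequence: org networking.
# Referencing vcfa_org.lz.id creates an implicit dependency, so Terraform
# creates the org first without an explicit depends_on.
resource "vcfa_org_networking" "lz" {
  org_id   = vcfa_org.lz.id
  log_name = "tblue" # hypothetical short identifier used in network logs
}
```

Running terraform graph against the configuration is a quick way to confirm that the order Terraform computed matches the sequence above.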

Organization and quota

The organization is the first concrete object. After that, quota is the resource that actually matters for governance, because it is where self-service stops being delegated risk. The quota binds the org to a region, a Supervisor, a set of VM classes and a storage policy, with hard CPU, memory and storage ceilings. Most of the supporting values are looked up with data sources rather than hard-coded, so the same module moves between regions by changing variables.

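# main.tf
resource "vcfa_org" "lz" {
  name         = var.org_name
  display_name = var.org_display_name
  description  = "Created by Terraform"
  is_enabled   = true
}

# Lookups: supporting IDs come from data sources, not hard-coded values.
data "vcfa_region" "region" {
  name = var.region_name
}

# Needed by the supervisor lookup below; var.vcenter_name is an assumed input.
data "vcfa_vcenter" "vc" {
  name = var.vcenter_name
}

data "vcfa_supervisor" "sup" {
  name       = var.supervisor_name
  vcenter_id = data.vcfa_vcenter.vc.id
}

data "vcfa_region_zone" "zone" {
  region_id = data.vcfa_region.region.id
  name      = var.region_zone_name
}

# var.vm_class_name is an assumed input, e.g. "best-effort-small".
data "vcfa_region_vm_class" "small" {
  region_id = data.vcfa_region.region.id
  name      = var.vm_class_name
}

data "vcfa_region_storage_policy" "vsan" {
  region_id = data.vcfa_region.region.id
  name      = var.storage_policy_name
}

resource "vcfa_org_region_quota" "lz" {
  org_id              = vcfa_org.lz.id
  region_id           = data.vcfa_region.region.id
  supervisor_ids      = [data.vcfa_supervisor.sup.id]
  region_vm_class_ids = [data.vcfa_region_vm_class.small.id]

  zone_resource_allocations {
    region_zone_id = data.vcfa_region_zone.zone.id
    cpu_limit_mhz  = 20000
    # Reservation values are placeholders (no guaranteed capacity);
    # verify the argument list against the provider version you pin.
    cpu_reservation_mhz    = 0
    memory_limit_mib       = 65536
    memory_reservation_mib = 0
  }

  region_storage_policy {
    region_storage_policy_id = data.vcfa_region_storage_policy.vsan.id
    storage_limit_mib        = 524288
  }
}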
Worked example
Say a tenant gets one zone with cpu_limit_mhz = 20000 (about 20 GHz), memory_limit_mib = 65536 (64 GiB) and storage_limit_mib = 524288 (512 GiB). With a best-effort-small class at 2 vCPU and 4 GiB, memory is the first ceiling: 64 GiB / 4 GiB is 16 small VMs across all of the org's namespaces, and that limit arrives before the 512 GiB of storage unless each VM carries more than 32 GiB of disk. Set the numbers deliberately. A quota left wide open is a self-service platform with no brakes, and the first noisy tenant proves it.
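
The same arithmetic can live next to the quota so the ceiling shows up in plan output when someone changes a limit. This is a small sketch using only Terraform built-ins; the local and output names are illustrative, and the 4 GiB per-VM figure is the best-effort-small assumption from the example.

```hcl
locals {
  memory_limit_mib  = 65536  # matches memory_limit_mib in the quota
  storage_limit_mib = 524288 # matches storage_limit_mib in the quota
  vm_memory_mib     = 4096   # assumed: best-effort-small at 4 GiB

  # How many small VMs fit before memory runs out.
  max_small_vms = floor(local.memory_limit_mib / local.vm_memory_mib)

  # Storage available per VM if the memory ceiling is fully used, in GiB.
  storage_per_vm_gib = floor(local.storage_limit_mib / local.max_small_vms / 1024)
}

output "max_small_vms_by_memory" {
  value = local.max_small_vms # 16
}

output "storage_per_vm_gib_at_memory_ceiling" {
  value = local.storage_per_vm_gib # 32
}
```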

Where Terraform stops

This is the part the registry docs will not tell you, and it is the reason I am cautious about promising a single end-to-end module. The vmware/vcfa provider describes the foundation well, but it does not yet model the full VCF Automation 9.1 All Apps consumption layer. Two limits show up fast in real labs.

First, regional networking. The 9.1 UI talks in terms of external connections, but the provider still wants a provider_gateway_id. If that provider gateway is backed by an Active/Active Tier-0, the create fails:

Error: Unable to create Regional Networking Setting because the
Provider Gateway eu-north-1-t0 backing Tier 0 has an unsupported
HA mode, ACTIVE-ACTIVE.

VCF 9.1 supports Active/Active Tier-0 for this design and the UI creates the setting without complaint. So the workaround is to create the regional networking setting once in the UI, then import it into state and let Terraform manage it going forward. The import key is the org name joined to the generated setting name.

terraform import vcfa_org_regional_networking.tenant_blue \
  'tenant-blue.tenant-blueeu-north-1'
[Figure: Create vs import on an Active/Active Tier-0. When the provider cannot create it, let the UI create and Terraform adopt. (1) terraform apply: create fails on an A/A Tier-0. (2) UI create: setting realized, one-time. (3) terraform import: state adopts it, managed after.]
The import escape hatch keeps a UI-created object under Terraform management without rebuilding it.
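
Because the prerequisites already call for Terraform 1.6+, the same adoption can also be declared in configuration with an import block, which keeps the step reviewable in Git rather than living in someone's shell history. A sketch, reusing the resource address and import ID from the command above:

```hcl
# imports.tf -- declarative equivalent of the terraform import command
import {
  to = vcfa_org_regional_networking.tenant_blue
  id = "tenant-blue.tenant-blueeu-north-1"
}
```

The import block still needs a matching resource "vcfa_org_regional_networking" "tenant_blue" block. If you have not written one yet, terraform plan -generate-config-out=generated.tf asks Terraform to draft it from the imported object; treat the generated file as a starting point and review it before committing.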
Gotcha
Before you blame your HCL for a missing resource type, check whether the object even has a first-class resource. Projects and VPCs do not, in the current provider. The landing-zone pattern uses the default project and the default regional VPC, or drops to the kubernetes provider with a Project CRD (project.cci.vmware.com/v1alpha2). Re-verify the provider version each run, because this surface is moving.
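
For the Kubernetes path, here is a rough sketch of the credential handoff and a Project object. Assumptions to verify before use: the vcfa_kubeconfig data source and its host, token and insecure_skip_tls_verify attributes are what I understand the provider to expose; the project name is hypothetical; and the spec shown is deliberately minimal because the exact Project schema should be checked against the CRD in your org (for example with kubectl explain).

```hcl
# Tenant-scoped kubeconfig, read through the aliased provider from provider.tf.
data "vcfa_kubeconfig" "tenant" {
  provider = vcfa.tenant_blue
}

# Hand the vcfa session over to the kubernetes provider.
provider "kubernetes" {
  host     = data.vcfa_kubeconfig.tenant.host
  token    = data.vcfa_kubeconfig.tenant.token
  insecure = data.vcfa_kubeconfig.tenant.insecure_skip_tls_verify
}

# Project declared as a CRD instance; the name and spec are illustrative only.
resource "kubernetes_manifest" "payments_project" {
  manifest = {
    apiVersion = "project.cci.vmware.com/v1alpha2"
    kind       = "Project"
    metadata = {
      name = "payments" # hypothetical project name
    }
    spec = {
      description = "Created by Terraform"
    }
  }
}
```

kubernetes_manifest contacts the API server during plan, so the tenant org and its Kubernetes endpoint have to exist before this configuration is planned. That is one more reason to keep it in a separate root module from the provider-side foundation.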

Tenant consumption: a namespace

The namespace is where the foundation hands over to the consumer. It runs against the tenant-scoped provider and, in my testing, the resource still wants explicit storage and zone overrides even though the UI inherits them from the namespace class. That is a small friction worth knowing before a plan surprises you.

resource "vcfa_supervisor_namespace" "payments_dev" {
  provider     = vcfa.tenant_blue
  name_prefix  = "payments-dev"
  project_name = "default-project"
  class_name   = "small"
  region_name  = data.vcfa_region.region.name
  vpc_name     = "default-${var.region_name}"

  storage_classes_class_config_overrides {
    name  = var.storage_policy_name
    limit = "100Gi"
  }

  zones_class_config_overrides {
    name         = var.region_zone_name
    cpu_limit    = "4000M"
    memory_limit = "8192Mi"
  }
}

What works today

Here is the map I keep next to me when scoping a VCF Automation IaC effort. It is the difference between what you can declare cleanly and what still needs a UI step, an import, or the Kubernetes path. Re-check it against the provider version you actually pin.

| Object | Resource / approach | Status |
|---|---|---|
| Organization | vcfa_org | Clean |
| Identity (OIDC) | vcfa_org_oidc | Clean |
| Bootstrap access | vcfa_org_local_user | Works (prefer IdP groups in prod) |
| Region quota | vcfa_org_region_quota | Clean |
| Org networking | vcfa_org_networking | Clean |
| Regional networking | vcfa_org_regional_networking | Create fails on A/A Tier-0; import works |
| Shared subnet | vcfa_shared_subnet | Creates; no org-assign argument |
| Content library | vcfa_content_library | Clean |
| Project / VPC | kubernetes provider / defaults | No first-class resource |
| Namespace | vcfa_supervisor_namespace | Works (explicit overrides) |
| Blueprint | vra_blueprint (vra provider) | Clean |

This is the same model covered conceptually in Part 3 on VM Apps vs All Apps organizations, and the objects you are declaring map straight to projects and tenant organizations from earlier Parts. The blueprints you publish with the vra provider are the same VMware Cloud Templates from Part 12. If you also automate the underlying SDDC with Terraform, keep that work in the separate Automating VCF toolchain using vmware/vcf; do not mix it with the consumption-side vmware/vcfa here.

Disclaimer: the snippets here change live tenant boundaries, quota and networking. Run them against a lab first, review every terraform plan before apply, and protect state, because a refresh token and provider-side resources sit inside it.

My take: the most useful thing about putting the foundation in HCL is not the 25-second apply. It is that the design becomes reviewable. Which org, which region, what quota, which network, which content. Those are the questions I want answered in a design review anyway, and Terraform forces the answers into a diff.

What I’d Do

Use Terraform for the foundation, deliberately. The vmware/vcfa provider earns its place on org, quota, org networking and content libraries: those objects are declarative, repeatable and exactly what you want in a diff. Why recommend it there and not everywhere? Because regional networking, projects and VPCs are still partial, and pretending otherwise produces a brittle module that fails on the first Active/Active Tier-0. Validate three things before you commit: pin the provider version and re-check the resource list, confirm whether your provider gateway HA mode is supported for create, and decide which objects you will import rather than create. Where I would do it differently from the demos: I would not chase a single all-in-one module yet. Split the provider-side foundation from tenant consumption, accept the import escape hatch for networking, and let the UI or the API own the parts the provider has not caught up to. That is a stronger position than a clever module that breaks every upgrade.

Stand up the org-plus-quota module in a lab this week, run plan, and see how much of your current click-through it replaces before you scope the rest.


About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
