Dr. Pranay Jha


Multi-Tenancy at Scale in VCF Automation: Provider, Tenant and Project Design Patterns (VCF Automation 9 Series, Part 24)

How to design provider, tenant and project boundaries in VCF Automation 9.1 that hold up at scale: the six consumption patterns, quota envelopes, a Terraform landing zone, and the design mistakes that cost you later.

VCF Automation 9 Series · Part 24 of 41
TL;DR · Key Takeaways
  • You design three boundaries, not one: the provider organization (infrastructure), the tenant organization (the hard isolation line), and the project (the team envelope inside a tenant).
  • Decide whether a tenant is an organization or a project before you build anything. That single choice sets your isolation, identity, networking and chargeback ceiling.
  • Broadcom ships six multi-tenancy consumption patterns in the VCF 9 design library. Most enterprises land on VPC-per-line-of-business in one Supervisor zone; providers and regulated tenants go to dedicated VRFs or multi-region.
  • Quota is the line between self-service and delegated risk. An organization with no vcfa_org_region_quota is an unbounded blast radius.
  • The vmware/vcfa Terraform provider (~> 1.1.0) describes the organization, quota and networking foundation well, but has no first-class project or VPC resource yet. Plan for a mixed Terraform plus API model.
Who this is for: Cloud and platform architects, automation engineers and operators standing up VCF Automation for more than one consumer.
Prerequisites: You know the VCF Automation services (Automation Assembler, Automation Service Broker, VCF Operations Orchestrator) and the difference between a VM Apps and an All Apps organization. If not, read Parts 3, 6 and 7 first.

A platform team stands up VCF Automation, creates one organization, drops every business unit into it as a project, and calls the result multi-tenant. Eighteen months later finance wants chargeback per business unit, security wants hard network isolation between two of them, and one team’s runaway quota request is starving the rest. None of that is a product defect. It is a tenancy design that was never made on purpose. Multi-tenancy in VCF Automation (the product formerly known as Aria Automation, and before that vRealize Automation) is not a switch you flip. It is a set of boundary decisions you make early and then live with for years. This is the part of the series where we make those decisions deliberately.

The three boundaries you actually design

Forget the word tenant for a moment. VCF Automation gives you three nested boundaries, and a real design assigns meaning to each one. The provider organization (the System context) owns the infrastructure: regions, vSphere Supervisors, provider gateways, storage policies and the quota definitions that everything else draws from. The tenant organization is the hard wall. It gets its own portal, its own identity provider, its own networking scope and its own catalog. The project lives inside a tenant and scopes membership, roles and resource limits for one application team. In an All Apps organization the workload itself lands in a Supervisor Namespace, which sits under a project and inherits a class, a VPC and a slice of quota.

The four boundaries, top to bottom (provider owns infrastructure; the tenant org is the hard isolation line)
  1. Provider organization (System): regions, Supervisors, provider gateways, storage policies, quota definitions
  2. Tenant organization: own portal, IdP, networking scope, catalog. The hard wall between tenants
  3. Project: members, roles, resource limits for one application team
  4. Supervisor Namespace (All Apps): where workloads land, with a class, a VPC and a slice of quota
The nesting that every tenancy design assigns meaning to. The isolation strength increases the higher up you draw the boundary.

In an All Apps organization, total isolation between tenants is real and enforced below the product. The vSphere Namespace boundary works in concert with the NSX Project boundary so that two tenants cannot see each other’s compute or network, even though they share the same physical fabric. That is the difference between two organizations and two projects: organizations get that hard boundary, projects share it. A project is a soft envelope inside a trust domain you already accepted.

Pick the tenancy unit before you pick the pattern

When a tenant should be an organization

Make the tenant an organization when you need a hard boundary you can defend in an audit. Separate identity providers, separate networking, separate catalogs, and a clean chargeback line all point to an organization. Service providers carving up capacity for unrelated customers have no other honest option. Regulated business units that cannot share a network trust domain belong in their own organizations too. The cost is real: every organization is its own configuration surface, its own IdP integration, its own day-2 burden. You do not want fifty of them by accident.

When a tenant should be a project

Use a project when the consumers live inside one trust domain and the thing you actually want to vary is membership, roles and a resource cap. Application teams in the same enterprise division usually fit this model. They can share an identity provider, tolerate shared networking, and need lighter governance than a full organization. The win is that projects are cheap to create and scope, and a single organization administrator can run dozens of them. The trap is reaching for projects when you really needed an organization, then bolting on network isolation later with tags and firewall rules that nobody trusts.

In practice: The first thing I check on an existing deployment is how many organizations exist versus how many projects. One organization with thirty projects and three of them clearly wanting network isolation is the most common smell. It means the team used projects as a default and is now paying for a boundary they never designed. Fixing it after workloads land means a migration, not a config change.
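
The organization-versus-project test above can be written down as data, which makes the choice reviewable before anything is built. This is a minimal sketch in plain Terraform language with no provider calls. The consumer names and the three boolean flags are hypothetical, chosen only to mirror the criteria in this section.

```hcl
# Hypothetical inventory of consumers and their hard requirements.
variable "consumers" {
  type = map(object({
    own_idp             = bool # needs a separate identity provider
    network_isolation   = bool # cannot share a network trust domain
    separate_chargeback = bool # needs its own chargeback line
  }))
  default = {
    payments = { own_idp = true, network_isolation = true, separate_chargeback = true }
    web_team = { own_idp = false, network_isolation = false, separate_chargeback = false }
    data_eng = { own_idp = false, network_isolation = false, separate_chargeback = false }
  }
}

locals {
  # Any single hard requirement promotes a consumer to its own organization.
  needs_org = {
    for name, c in var.consumers :
    name => c.own_idp || c.network_isolation || c.separate_chargeback
  }

  organizations = sort([for name, hard in local.needs_org : name if hard])
  projects      = sort([for name, hard in local.needs_org : name if !hard])
}

output "tenancy_units" {
  value = {
    organizations = local.organizations
    projects      = local.projects
  }
}
```

With the sample defaults, payments lands in organizations and the other two land in projects. The useful part is the review it forces: every consumer has to declare its hard requirements before it gets a boundary.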

The six consumption patterns, and which one to reach for

The VCF 9 design library does not leave tenancy to taste. It documents six multi-tenancy consumption patterns, ordered roughly by how strongly they isolate the network. They differ on how VPCs, Supervisor zones, gateways and VRFs are shared or dedicated. Read them as a spectrum from cheap-and-shared to expensive-and-isolated, and pick the cheapest one that still satisfies your isolation requirement.

Pattern | Shape | Reach for it when
1 | Single VPC, multiple businesses, one Supervisor zone | Lightest isolation; teams that can share a network domain
2 | VPC per line of business, one Supervisor zone | The common enterprise default; network separation per LOB
3 | Single VPC across three Supervisor zones | One LOB spanning zones for availability, shared network
4 | Dedicated VRF per organization | Strong routing isolation; regulated or hostile-neighbor tenants
5 | Multi-region, dedicated plus shared gateways | Tenants across regions with common shared services
6 | Multi-org isolation, shared NSX across workload domains | Central NSX serving many orgs across distinct domains
Which pattern: a decision path (pick the cheapest pattern that still meets the isolation bar)
  • Separate routing or regulated tenant? Pattern 4 or 5 (VRF / multi-region)
  • Per-LOB network separation needed? Pattern 2 (VPC per LOB)
  • Can share one network domain? Pattern 1 or 3 (shared VPC)
Start at the isolation requirement and walk down. The expensive patterns exist for a reason; do not pay for them without one.

My recommendation: most enterprises should start at Pattern 2, a VPC per line of business inside a single Supervisor zone. It gives real network separation per consumer without the operational weight of dedicated VRFs or multiple regions, and it maps cleanly onto a one-organization-many-projects model when the LOBs share a trust domain. Go to Pattern 4 (dedicated VRF per organization) only when routing isolation is a stated requirement, for example regulated tenants or a provider with mutually distrustful customers. Reach for Pattern 5 when tenants genuinely span regions. Where I would not start: Pattern 1. Shared everything reads as cheap on day one and becomes the thing you unwind first when a tenant needs to be carved out.
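
One reading of that decision path, written as a Terraform conditional so the answer sits next to the inputs that produced it. This is a sketch, not something from the design library: the four variable names are hypothetical, Pattern 6 is left out because the path above does not route to it, and the ordering (regions first, then routing isolation) is my interpretation of the recommendation.

```hcl
variable "spans_regions" {
  description = "Tenants genuinely span more than one region"
  type        = bool
  default     = false
}

variable "needs_routing_isolation" {
  description = "Regulated tenant or a stated routing-isolation requirement"
  type        = bool
  default     = false
}

variable "needs_per_lob_network" {
  description = "Each line of business needs its own network separation"
  type        = bool
  default     = true
}

variable "spans_zones" {
  description = "One LOB spans several Supervisor zones on a shared network"
  type        = bool
  default     = false
}

locals {
  # Walk from the most expensive requirement down to the cheapest pattern.
  pattern = (
    var.spans_regions
    ? 5
    : (var.needs_routing_isolation
      ? 4
      : (var.needs_per_lob_network
        ? 2
    : (var.spans_zones ? 3 : 1)))
  )
}

output "consumption_pattern" {
  value = local.pattern
}
```

With the defaults shown, the output is 2, the VPC-per-LOB starting point. Flipping needs_routing_isolation to true moves it to 4.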

Quota is the line between self-service and delegated risk

An organization with no quota is not a tenant. It is an unbounded blast radius. The region connects an organization to underlying VCF capacity; the quota defines how much of that capacity the organization is allowed to consume. Self-service without quota is just delegated risk. Self-service with quota is a controlled consumption model. That distinction is the whole point of multi-tenancy at scale, and it is exactly the layer teams skip when they are racing to a demo.

How a capacity envelope flows down: region capacity to org quota to project to namespace
  • Region: physical VCF capacity
  • Org quota: cpu / mem / storage limits
  • Project: team share of the org
  • Namespace: class limits per workload
Capacity is not granted, it is partitioned. Each boundary can only hand down what it was given.
Worked example
A platform whose region holds 240 GHz of CPU and 1.5 TB of memory is carved into three tenant organizations. Quotas below are written as CPU / memory / storage. Tenant A (production, regulated) gets a 100 GHz / 640 GB / 8 TB quota and Pattern 4 with a dedicated VRF. Tenants B and C (internal dev and test) share Pattern 2 with 60 GHz / 384 GB / 4 TB each. That allocates 220 GHz and 1,408 GB in total, so the region keeps 20 GHz and 128 GB unassigned. Inside Tenant B, four projects each draw a 12 GHz / 64 GB slice, which uses 48 GHz and leaves 12 GHz of org headroom so one team’s burst does not starve the others. Lease policy on dev namespaces is 14 days; prod is non-expiring. That single sheet of numbers is the difference between a platform you can defend and one that pages you at 2 a.m.
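
The arithmetic in that example is easy to get wrong once a fourth tenant shows up, so it is worth keeping it in code. This is a minimal sketch using only Terraform locals and check blocks (Terraform 1.5 or later). The project names are placeholders, and 1.5 TB is taken as 1,536 GB, which is an assumption about the units. Check blocks report failures as warnings and do not stop an apply.

```hcl
locals {
  # Region capacity from the worked example: 240 GHz CPU, 1.5 TB memory.
  region = { cpu_ghz = 240, mem_gb = 1536 }

  # Organization quotas (CPU GHz, memory GB, storage TB).
  org_quota = {
    tenant_a = { cpu_ghz = 100, mem_gb = 640, storage_tb = 8 }
    tenant_b = { cpu_ghz = 60, mem_gb = 384, storage_tb = 4 }
    tenant_c = { cpu_ghz = 60, mem_gb = 384, storage_tb = 4 }
  }

  # Four equal project slices inside Tenant B (names are placeholders).
  tenant_b_projects = {
    for name in ["project-1", "project-2", "project-3", "project-4"] :
    name => { cpu_ghz = 12, mem_gb = 64 }
  }

  org_cpu_total = sum([for q in local.org_quota : q.cpu_ghz])
  org_mem_total = sum([for q in local.org_quota : q.mem_gb])

  tenant_b_cpu_used     = sum([for p in local.tenant_b_projects : p.cpu_ghz])
  tenant_b_cpu_headroom = local.org_quota.tenant_b.cpu_ghz - local.tenant_b_cpu_used
}

# Each boundary can only hand down what it was given.
check "org_quotas_fit_region" {
  assert {
    condition     = local.org_cpu_total <= local.region.cpu_ghz
    error_message = "Organization CPU quotas exceed region capacity."
  }
  assert {
    condition     = local.org_mem_total <= local.region.mem_gb
    error_message = "Organization memory quotas exceed region capacity."
  }
}

check "tenant_b_keeps_headroom" {
  assert {
    condition     = local.tenant_b_cpu_headroom > 0
    error_message = "Tenant B project slices leave no CPU headroom in the org quota."
  }
}

output "tenant_b_cpu_headroom_ghz" {
  value = local.tenant_b_cpu_headroom
}
```

With the numbers from the example, both checks pass and the headroom output is 12.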

Describe the landing zone in code

A tenancy design that lives only in a runbook drifts. The vmware/vcfa Terraform provider lets you express the foundation of an All Apps landing zone declaratively: the organization, identity, the bootstrap access path, and the region quota that bounds it. Here is the spine of that foundation, current to provider version 1.1.x. This is provider-side (System) context.

terraform {
  required_providers {
    vcfa = {
      source  = "vmware/vcfa"
      version = "~> 1.1.0"
    }
  }
}

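# Provider context = System (provider management).
# The provider "vcfa" block and the data sources referenced below
# (vcfa_region, vcfa_supervisor, vcfa_region_zone, vcfa_region_storage_policy)
# are assumed to be declared elsewhere in the module.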

resource "vcfa_org" "tenant_blue" {
  name         = "tenant-blue"
  display_name = "Blue Line of Business"
  description  = "Production tenant, Pattern 2 (VPC per LOB)"
  is_enabled   = true
}

# Capacity envelope: this is the line between
# self-service and delegated risk
resource "vcfa_org_region_quota" "tenant_blue" {
  org_id         = vcfa_org.tenant_blue.id
  region_id      = data.vcfa_region.region.id
  supervisor_ids = [data.vcfa_supervisor.supervisor.id]

  zone_resource_allocations {
    region_zone_id   = data.vcfa_region_zone.zone.id
    cpu_limit_mhz    = 60000
    memory_limit_mib = 393216
  }

  region_storage_policy {
    region_storage_policy_id = data.vcfa_region_storage_policy.vsan.id
    storage_limit_mib        = 4194304
  }
}

Expected result on terraform apply: the organization is created and a region quota of roughly 60 GHz CPU, 384 GB memory and 4 TB storage is attached. A provider-side org and quota apply cleanly in seconds. The failure mode worth knowing: regional networking does not always apply through Terraform. If the provider gateway is backed by an Active/Active Tier-0, the create fails with unsupported HA mode, ACTIVE-ACTIVE, even though the same configuration succeeds in the UI. The workaround is to create that piece in the UI and terraform import it back into state.
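
The import half of that workaround can itself live in the repo, using a Terraform import block (Terraform 1.5 or later). Treat this as a sketch under two assumptions: that the object you created in the UI corresponds to the vcfa_org_regional_networking resource type, and that you look up the import ID format in the provider documentation. The ID below is a placeholder, not a real value.

```hcl
# Adopt the UI-created networking object into Terraform state.
import {
  # Assumption: this is the resource type that matches the object created in the UI.
  to = vcfa_org_regional_networking.tenant_blue

  # Placeholder: replace with the import ID in the format the provider documents.
  id = "REPLACE_WITH_IMPORT_ID"
}
```

If no matching resource block exists yet, running terraform plan -generate-config-out=generated.tf asks Terraform to draft one from the imported object. Review the generated file before you apply it, and leave a comment next to the import block saying why this piece was built outside Terraform.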

Gotcha
The current vmware/vcfa provider has no first-class project or VPC resource. You can declare the organization, OIDC, local users, region quota, organization networking and content libraries, and you can create a Supervisor Namespace through the tenant-scoped vcfa_kubeconfig data source plus the Kubernetes provider. But projects and VPCs still come from the UI or API. Do not promise stakeholders a single clean Terraform module for the whole tenancy model in 2026. The honest design is Terraform for the foundation, API or UI for the rest, and a note in the repo that says exactly where the seam is.

Design patterns that hold up at scale

Identity and content, scoped down not up

Each organization should carry its own OIDC identity provider, mapped to that tenant’s groups, not a shared one you filter after the fact. VCF Automation 9.1 added two delegation features that change how you design at scale. Organization administrators can now delegate Supervisor Namespace creation to project administrators on a self-service basis, with granular governance over which regions, namespace classes, connectivity profiles, subnets and VPCs each project may use. And content libraries can now be scoped to specific projects, so a VM image is only visible to the users of one project rather than the whole organization. Both let you push consumption down to teams while keeping the guardrails at the top. Design for delegation from the start; retrofitting it means re-pinning every default.

Delegation with guardrails (9.1): org admin sets the rails; project admin self-serves inside them
  1. Org admin: defines allowed regions, classes, VPCs, subnets, project-scoped content
  2. Project admin: self-serves namespaces within the allowed set, no ticket required
  Guardrail set: governance policy
VCF Automation 9.1 lets the org admin set the rails once, then project admins self-serve inside them without a ticket queue.

What breaks first

Three things fail before anything else at scale. First, the default project. Every organization gets one, and teams dump workloads into it because it is already there, which destroys your per-team accounting on day one. Name and scope real projects immediately and treat the default as a holding pen you keep empty. Second, quota starvation inside a busy organization, when projects have no sub-limits and the loudest team consumes the envelope. Set project-level caps with deliberate org headroom, as in the worked example above. Third, network HA mismatches, the Active/Active Tier-0 case, where your Terraform foundation applies but the regional networking step does not. Validate the Tier-0 HA mode against your automation path before you commit to a fully declarative build.
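
The third failure is the one you can catch before the first apply. A small sketch, assuming you record the Tier-0 HA mode by hand as an input variable. The variable name and the two mode strings are my own labels for illustration, not values read from NSX or from the provider.

```hcl
# Hand-recorded HA mode of the Tier-0 behind the provider gateway.
variable "tier0_ha_mode" {
  description = "HA mode of the Tier-0 backing the provider gateway (recorded manually)"
  type        = string
  default     = "ACTIVE_STANDBY"

  validation {
    # Active/Active is the case where the declarative networking step fails.
    condition     = var.tier0_ha_mode != "ACTIVE_ACTIVE"
    error_message = "Active/Active Tier-0 detected. Create regional networking in the UI and import it into state."
  }
}
```

Setting the variable to ACTIVE_ACTIVE makes terraform plan stop with that message, which is a cheaper place to find out than halfway through a landing-zone build.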

Disclaimer
Changing the tenancy model on a live VCF Automation instance, splitting a project into its own organization, or re-scoping quota under running workloads is a migration with downtime risk, not a settings change. Test the sequence in a non-production instance, confirm your backups, and stage namespace moves during a window. The patterns above are design guidance; validate them against the current Broadcom VCF 9.x design library and your own support entitlement before applying to production.

What I’d Do

Decide the tenant unit first. If two consumers can never share a network trust domain or a chargeback line, they are organizations, full stop. Everything else starts as projects inside one organization and gets promoted only when a real isolation requirement appears. For the network shape, default to Pattern 2 (VPC per line of business in a single Supervisor zone) and reserve dedicated VRFs and multi-region for tenants that have earned them. Put quota on every organization and a sub-cap on every project, with headroom you chose on purpose. Express the foundation in the vmware/vcfa provider, and be honest in the repo about where Terraform stops and the API takes over. The teams that get multi-tenancy wrong are not the ones who picked the wrong pattern. They are the ones who never picked one. If you do nothing else after reading this, write down, for your platform, which boundary means tenant and what its quota is. That one page prevents the eighteen-month problem this post opened with.


Related reading on this site: Tenant Organizations (Part 7), Projects (Part 8), The vmware/vcfa Terraform Provider (Part 21), and VM Apps vs All Apps (Part 3).


About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
