Dr. Pranay Jha

NSX 9 Multi-Tenancy: Projects vs VPCs and the Design That Holds Up (NSX Series, Part 22)

NSX 9 gives you two layers of tenancy: Projects for infra teams and VPCs for self-service consumers. Here is how they differ, who owns the Tier-0, the centralized vs distributed Transit Gateway call, and the Federation limit that catches people out.

NSX Series · Part 22 of 30

TL;DR · Key Takeaways

  • NSX 9 has two tenancy layers, and they are not interchangeable. A Project is a tenant slice for an infra or platform team that wants real networking control. A VPC is a self-service, public-cloud-style consumption model that lives inside a Project.
  • Tenants never own the Tier-0 or the Edge cluster. Those stay in the default space and get allocated down. A Project can own Tier-1 gateways; a VPC abstracts routing away entirely behind a Transit Gateway.
  • The biggest VCF 9 design call is centralized vs distributed Transit Gateway. Distributed TGW gives you east-west routing between VPCs with no Edge VMs in the path, which changes your Edge sizing math completely.
  • Multi-tenancy is not supported with NSX Federation. If your design assumes both, one of them has to go. Find this out in the workshop, not in the build.
  • Everything here is Policy API and hierarchical API only. The Manager API is gone in NSX 9, so any automation you carry over from NSX-T 3.x has to be rewritten against /orgs/default/projects/....

Who this is for: Network and security architects, VCF platform teams, and consultants carving one NSX deployment into tenants.

Prerequisites: You should already understand NSX 9 Tier-0 and Tier-1 gateways, segments, and the distributed firewall. If not, read the routing and DFW parts of this series first.

A platform owner asked me last year to “give each business unit its own NSX.” Eight business units, one VCF instance, and a hard rule that the storage team must never see the trading team’s firewall rules. The instinct in the room was to build eight NSX Managers. That would have been a budget fire and an operational nightmare. NSX 9 already solves this with multi-tenancy built into the Policy data model, but only if you pick the right layer for each tenant. Get that choice wrong and you either strangle a developer team with infra-grade controls or hand an infra team a consumption model that cannot do what they need.

Two layers of tenancy, and why the distinction matters

NSX 9 isolates tenants inside a single deployment through the Policy hierarchy, not through separate appliances. There are two nested constructs. Projects came first (API-only in NSX 4.0.1, full UI in 4.1), and VPCs landed in 4.1.1 and have matured into a first-class capability in VCF 9. By NSX 9 this is not an edge feature you bolt on. It is how the data model is shaped.

[Figure: Where tenancy lives in the Policy data model. The provider owns the top; tenants own what hangs below the org. /infra (default space): Enterprise Admin; Tier-0 gateways, Edge clusters, transport zones, shared groups and segments live here. /orgs/default (multi-tenancy objects): Project "trading" (own Tier-1, segments, groups, DFW) containing VPC "payments-app" (self-service subnets, behind Transit GW); Project "storage" (isolated from trading by default) containing VPC "backup" at /orgs/default/projects/storage/vpcs/backup. The Tier-0 is allocated down to projects.]
Diagram 1: The default space stays with the provider. Projects and the VPCs inside them hang off /orgs/default.

What a Project actually is

A Project is a tenant that wants to do networking. Inside a Project, a Project Admin can create Tier-1 gateways, segments, groups, DFW and gateway firewall policies, DHCP, and VPN on the project Tier-1s. The objects live under /orgs/default/projects/<project-id>/infra and are invisible to other projects. This is the right fit when the tenant is itself a technical team that understands routing and wants to define its own topology. Think of an internal platform team, a managed-service customer, or a security zone with its own admins.

What a VPC actually is

A VPC is a self-contained private network inside a Project, built for people who do not want to think about NSX at all. Subnets, basic services, and a Transit Gateway uplink, consumed the way you would consume a VPC in AWS. The application owner asks for a subnet and gets one. They never touch a Tier-1, a BGP neighbor, or a transport zone. VPCs live under /orgs/default/projects/<project-id>/vpcs/<vpc-id> and can only exist inside a Project. You cannot create a VPC in the default space.
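The path layout described above can be sketched as a few string helpers. This is a minimal illustration, not an NSX SDK: the function names and the identifier character check are assumptions made for the example, and only the path shapes come from the text.

```python
import re

# Assumption: a conservative identifier pattern for illustration only.
# NSX's actual rules for object IDs are not covered here.
_ID_PATTERN = re.compile(r"^[A-Za-z0-9_.-]+$")


def _check_id(identifier: str) -> str:
    """Reject empty or oddly formed IDs before they end up in a path."""
    if not _ID_PATTERN.match(identifier):
        raise ValueError(f"invalid identifier: {identifier!r}")
    return identifier


def project_path(project_id: str) -> str:
    """Root of a Project under the default org."""
    return f"/orgs/default/projects/{_check_id(project_id)}"


def project_infra_path(project_id: str) -> str:
    """Where a Project's own Tier-1s, segments, groups and policies live."""
    return f"{project_path(project_id)}/infra"


def vpc_path(project_id: str, vpc_id: str) -> str:
    """A VPC always needs a parent Project, so project_id is mandatory."""
    return f"{project_path(project_id)}/vpcs/{_check_id(vpc_id)}"


print(project_infra_path("trading"))
print(vpc_path("storage", "backup"))
```

Making the Project ID a required argument of the hypothetical vpc_path helper mirrors the rule that a VPC cannot exist in the default space.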

Dimension | Project | VPC
Consumer | Infra / platform / network team | App owner, DevOps, self-service
Routing control | Owns its Tier-1 gateways | Abstracted behind Transit Gateway
Mental model | A scoped NSX of your own | A public-cloud VPC
Lives under | /orgs/default/projects/<id> | …/projects/<id>/vpcs/<id>
Can create Tier-0 / Edge cluster | No (allocated from default) | No
Best when | Tenant is a technical team | Tenant just wants subnets fast

In practice: The most common mistake I see is reaching for VPCs because they sound modern, then discovering the tenant actually needed VRF route filtering or a custom Tier-1 service. If a tenant has a network engineer who will ask about BGP, give them a Project. VPCs are for people who never want to learn what a Tier-1 is.

Who owns what: the provider and tenant split

This is the line that decides your whole design, so be precise about it. Tier-0 gateways and Edge clusters are always owned by the default space. You cannot create a Tier-0 or an Edge cluster inside a Project, full stop. What you can do is allocate an existing Tier-0 to a Project so the tenant’s Tier-1s and VPCs can hang off it for north-south. Several projects can share the same Tier-0 or a Tier-0 VRF, which is exactly why route filtering inside projects exists: to keep one tenant’s routes from leaking into another’s through the shared uplink.

A Project can own Tier-1 gateways, and it must, because a Project cannot borrow a Tier-1 from the default space. A VPC goes a step further and hides routing behind the Transit Gateway, so the VPC consumer never creates a gateway object at all. The provider keeps the parts that carry blast radius (the Edge, the BGP peering, the physical uplinks) and tenants get the parts that are safe to self-serve.

My take: Treat the Tier-0 and the Edge cluster as shared infrastructure with its own change control, separate from any tenant. The day a tenant admin can affect the Edge BGP session is the day your multi-tenancy isolation is a fiction.

The Transit Gateway decision: centralized vs distributed

The Transit Gateway (TGW) is the hub that connects VPCs to each other and to the outside world. In VCF 9 you choose how it is realized, and this is the single most consequential VPC design decision because it changes where packets are switched and how many Edge VMs you need.

[Figure: Centralized vs distributed Transit Gateway. Same logical hub, very different packet path. Centralized TGW: the TGW is realized on the Edge cluster, so east-west traffic between the VPC A and VPC B subnets hairpins through Edge VMs. Distributed TGW: east-west traffic between the VPC A and VPC B subnets is routed directly in the hypervisor with no Edge VM; an Edge is still needed for north-south and stateful services, not for east-west.]
Diagram 2: Centralized TGW pulls inter-VPC traffic through Edge VMs. Distributed TGW routes it in the hypervisor.

Centralized Transit Gateway

The centralized model realizes the TGW on the Edge cluster. Every inter-VPC flow and every flow needing stateful services rides through the Edge. It is the simpler model to reason about and the one you want when you need centralized services on that traffic. The cost is real: east-west between two VPCs on the same hosts hairpins out to an Edge VM and back, and that Edge throughput becomes your inter-VPC ceiling. Size for it.

Distributed Transit Gateway

The distributed model, added in VCF 9.0, routes inter-VPC traffic directly in the hypervisor data plane with no Edge VM in the east-west path. Two workloads in different VPCs on the same host talk through the local forwarding stack. You still need an Edge for north-south and for stateful services like the gateway firewall, but the inter-VPC volume no longer lands on it. For dense multi-VPC environments this is the difference between a two-node Edge cluster and a much larger one.

Worked example

Say you have 30 VPCs and your monitoring shows 40 Gbps of steady east-west between them, plus 10 Gbps north-south. With a centralized TGW, all 50 Gbps crosses the Edge, so two large Edge VMs at roughly 25 Gbps each leave you with zero headroom on day one. Move to a distributed TGW and only the 10 Gbps north-south needs the Edge, which a single appropriately sized Edge handles with room to spare. Same workload, half the Edge footprint on raw throughput, and less than half once you add the headroom the centralized design would need. That is the sizing conversation to have before you commit to centralized.
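The arithmetic in the worked example can be written down as a small calculator. This is a rough sketch under the example's own assumptions (40 Gbps east-west, 10 Gbps north-south, roughly 25 Gbps per large Edge VM); the function names and the headroom parameter are illustrative, and it counts throughput only, not redundancy or stateful-service overhead.

```python
import math


def edge_load_gbps(east_west: float, north_south: float, distributed_tgw: bool) -> float:
    """Traffic that has to cross the Edge for a given Transit Gateway mode."""
    # Distributed TGW keeps inter-VPC (east-west) traffic in the hypervisor.
    return north_south if distributed_tgw else east_west + north_south


def edges_needed(load_gbps: float, per_edge_gbps: float, headroom: float = 0.0) -> int:
    """Edge VMs required to carry the load while keeping a fractional reserve."""
    if per_edge_gbps <= 0:
        raise ValueError("per_edge_gbps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable = per_edge_gbps * (1 - headroom)
    return max(1, math.ceil(load_gbps / usable))


centralized = edge_load_gbps(40, 10, distributed_tgw=False)   # 50 Gbps on the Edge
distributed = edge_load_gbps(40, 10, distributed_tgw=True)    # 10 Gbps on the Edge

print(edges_needed(centralized, 25))                # 2, with zero headroom
print(edges_needed(distributed, 25))                # 1
print(edges_needed(centralized, 25, headroom=0.3))  # 3 once 30% is held in reserve
```

Plug in your own per-Edge throughput figure from testing rather than the 25 Gbps placeholder, and add whatever node count your availability design requires on top of the result.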

VPC connectivity profiles and the IP planning that bites

VPCs are configured through a Connectivity Profile, which is where the provider centralizes the address space and outbound behavior so tenants do not have to. There are three IP pools you need to keep straight, and mixing them up is where new VPC deployments go wrong.

[Figure: Three address pools in a VPC profile. Private stays in, public is routable out, NAT bridges them. Private subnets: routed only inside the VPC, not reachable from outside. Private TGW blocks: scoped to the transit gateway, not advertised northbound. External IP block: public, routed range that must not overlap; source for public subnets and for the outbound NAT IP (one per VPC, auto-reserved by NSX) that each VPC silently consumes.]
Diagram 3: Private subnets stay local, the external IP block is your routable public space, and default outbound NAT quietly draws a public IP for every VPC.

Private subnets are routed only within the VPC and never reachable from outside. The private Transit Gateway IP blocks are scoped to the TGW within its project and are not advertised to the northbound router. The external IP block is your public, routable space, and public subnets are carved from it, must not overlap, and must be routed on your physical fabric. Here is the part people miss: when Default Outbound NAT is set in the Connectivity Profile, NSX automatically reserves a subnet from the public block and pulls one outbound NAT IP for every VPC. Spin up 50 VPCs and you have silently consumed 50 public IPs before a single workload is exposed.

Gotcha

Size the external IP block for the number of VPCs you will ever run, not the number you launch with. Default outbound NAT takes one public IP per VPC automatically, so an undersized public block stops new VPC creation cold once it is exhausted, and the error does not point at the IP block. If you do not need internet egress per VPC, turn default outbound NAT off in the profile and hand out NAT deliberately.
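A quick way to sanity-check the external block before the workshop is to count addresses against the planned VPC total. The sketch below uses only Python's standard ipaddress module. The function name and the public_subnet_ips parameter are made up for the example, and it deliberately models just the "one NAT IP per VPC" rule from the text, not the exact way NSX carves its reserved subnet, so treat the result as an optimistic upper bound.

```python
import ipaddress


def external_block_budget(external_block: str, planned_vpcs: int,
                          public_subnet_ips: int = 0) -> dict:
    """Compare an external IP block's size with the addresses a design will draw.

    planned_vpcs: VPC count at worst case; each consumes one outbound NAT IP
                  when default outbound NAT is enabled in the profile.
    public_subnet_ips: addresses you expect to hand out as public subnets.
    """
    block = ipaddress.ip_network(external_block, strict=True)
    total = block.num_addresses
    needed = planned_vpcs + public_subnet_ips
    return {
        "total": total,
        "needed": needed,
        "remaining": total - needed,
        "fits": needed <= total,
    }


# 203.0.113.0/24 is a documentation range, used here purely as a placeholder.
print(external_block_budget("203.0.113.0/26", planned_vpcs=50))
print(external_block_budget("203.0.113.0/27", planned_vpcs=50))
```

In this sketch a /26 (64 addresses) still fits 50 VPCs with 14 addresses left for everything else, while a /27 (32 addresses) runs out well before the 50th VPC.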

The limits that change the design

Two constraints reshape architectures, so surface them early.

Gotcha

Multi-tenancy is not supported in an NSX Federation environment. If your target design is multi-site with a Global Manager and per-tenant Projects, that combination does not exist today. You pick Federation for stretched multi-site, or you pick Projects and VPCs for tenancy on a single deployment. I have seen this assumption survive an entire design phase because nobody validated it. Validate it on day one.

The second is quotas. Each Project can have quotas that cap how many objects of a given type its users can create: segments, Tier-1s, NAT rules, and so on. This is your guardrail against one noisy tenant exhausting shared scale. Project users can monitor quota status and see alarms as they approach a limit. Set quotas when you create the Project, because retrofitting them after a tenant has sprawled is a negotiation, not a config change.
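If you export quota usage into your own tooling, the "approaching a limit" check is simple to reproduce. The following is a minimal sketch with a hypothetical data shape: the QuotaUsage class, the object-type labels, and the 80 percent threshold are all assumptions for illustration, not NSX's quota schema or alarm thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaUsage:
    object_type: str  # illustrative label, e.g. "segment", "tier1", "nat_rule"
    used: int
    limit: int


def approaching_limit(usages, threshold_pct: int = 80):
    """Return the object types whose usage is at or above threshold_pct of the quota."""
    flagged = []
    for usage in usages:
        if usage.limit <= 0:
            raise ValueError(f"quota limit must be positive for {usage.object_type}")
        # Integer comparison avoids floating-point edge cases at the threshold.
        if usage.used * 100 >= usage.limit * threshold_pct:
            flagged.append(usage.object_type)
    return flagged


trading = [
    QuotaUsage("segment", used=45, limit=50),
    QuotaUsage("tier1", used=2, limit=10),
    QuotaUsage("nat_rule", used=80, limit=100),
]
print(approaching_limit(trading))  # ['segment', 'nat_rule']
```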

One more operational note that follows from NSX 9 removing the Manager API: everything tenant-facing is Policy API and hierarchical API. The hierarchical API lets you push an entire Project intent (gateways, segments, groups, and firewall rules) in a single tree-structured call, which is how you should be templating tenant onboarding. The Terraform provider exposes the same model through resources like nsxt_policy_project, so a new tenant becomes a module, not a runbook.

# Create a Project with the Policy API (conceptual; a direct PATCH on the Project path)
PATCH https://<nsx-mgr>/policy/api/v1/orgs/default/projects/trading
{
  "resource_type": "Project",
  "id": "trading",
  "tier_0s": [ "/infra/tier-0s/edge-t0-a" ],
  "site_infos": [ ... ]
}

# Then a VPC under that Project
PATCH .../orgs/default/projects/trading/vpcs/payments-app
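
For templating, the conceptual Project PATCH above can be assembled with nothing but the Python standard library. This is a sketch that only builds the request object and never sends it. The function name and the hostname are placeholders, authentication and TLS handling are left out, and site_infos is omitted because the original example elides its contents.

```python
import json
import urllib.request


def build_project_request(nsx_mgr: str, project_id: str,
                          tier0_paths: list[str]) -> urllib.request.Request:
    """Build (but do not send) the PATCH that creates or updates a Project."""
    body = {
        "resource_type": "Project",
        "id": project_id,
        # Tier-0s are provider-owned objects in the default space, referenced by path.
        "tier_0s": tier0_paths,
    }
    url = f"https://{nsx_mgr}/policy/api/v1/orgs/default/projects/{project_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# "nsx.example.test" is a placeholder hostname, not a real manager.
req = build_project_request("nsx.example.test", "trading", ["/infra/tier-0s/edge-t0-a"])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Keeping the build step separate from the send step makes the payload easy to review and diff per tenant before anything touches the manager.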

How I decide: Project, VPC, or both

[Figure: Choosing the right tenancy construct. Does the tenant need to be isolated from others? No: use the default space. Yes: will the tenant manage its own routing and firewall? Yes: use a Project (own Tier-1, DFW, route filtering). No, it wants self-service: use a VPC in a Project, behind the Transit GW. Both: a Project for the team, VPCs for its apps.]
Diagram 4: A simple gate. Isolation decides Project vs default; control vs self-service decides Project vs VPC.
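
The gate in Diagram 4 is small enough to encode for a tenant intake form or a design checklist. The sketch below is one possible reading of it; the enum, the function name, and the three yes/no inputs are illustrative choices rather than anything NSX defines.

```python
from enum import Enum


class Construct(Enum):
    DEFAULT_SPACE = "default space"
    PROJECT = "Project"
    VPC_IN_PROJECT = "VPC inside a Project"
    PROJECT_WITH_VPCS = "Project for the team, VPCs for its apps"


def choose_construct(needs_isolation: bool,
                     manages_own_networking: bool,
                     has_self_service_app_teams: bool) -> Construct:
    """Walk the decision gate: isolation first, then control vs self-service."""
    if not needs_isolation:
        return Construct.DEFAULT_SPACE
    if manages_own_networking and has_self_service_app_teams:
        return Construct.PROJECT_WITH_VPCS
    if manages_own_networking:
        return Construct.PROJECT
    return Construct.VPC_IN_PROJECT


print(choose_construct(True, True, True).value)
print(choose_construct(True, False, True).value)
```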

My default pattern for a real tenant is a Project that owns its Tier-1s and DFW, with VPCs created inside it for the application teams who just want subnets. That gives the tenant’s network owner real control at the Project level and gives their developers a cloud-like experience at the VPC level, all under one set of quotas and one shared, provider-owned Tier-0. Read the NSX in VCF 9 overview for how this slots into the wider platform, and the VKS networking part for how Kubernetes consumes these VPCs. The tenant isolation itself is enforced by the distributed firewall, which each Project controls within its own scope.

Disclaimer: Validate the supported configuration maximums and the Project, VPC, and quota interoperability against your exact VCF 9.x build before committing a tenancy model to production. Confirm external IP block sizing, the Federation constraint, and Edge sizing for your chosen Transit Gateway mode in a non-production deployment first.

What I’d Do

Start by sorting your tenants into two buckets: teams that will manage networking, and teams that just want to ship apps. The first bucket gets Projects, the second gets VPCs inside a shared or per-team Project. Keep the Tier-0 and Edge firmly with the provider, default to a distributed Transit Gateway unless you have a specific reason to centralize, size the external IP block for your worst-case VPC count, and decide the Federation question before anyone draws a topology. Do that and NSX 9 multi-tenancy holds up under real load instead of becoming the thing you quietly work around. Next up in the series: NSX Federation and multi-site, which is the path you take when tenancy is not your problem but geography is.





About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
