Dr. Pranay Jha

Cloud Zones, Regions and Placement in VCF Automation (VCF Automation Series, Part 10)

A cloud zone is where “deploy a VM” becomes “this VM, on that cluster.” This part covers regions and zones, their many-to-many relationship with projects, placement policies, and the capability and compute tags that steer where work lands.

VCF Automation Series · Part 10 of 30

TL;DR · Key Takeaways

  • A cloud zone is the compute target for a deployment: one or more clusters within a region, with resource pools, placement policy, and tags.
  • Cloud zones are region-specific and have a many-to-many relationship with projects. A project can hold up to 3000 zones; a single deployment can span at most 100.
  • Placement policy chooses the host inside a zone: Default spreads by availability, Spread by memory targets the most free memory.
  • Capability tags on zones and compute tags on resources filter placement to only what matches. Tags are how you express intent.
  • VCF Automation 9.1 adds Infrastructure Policies to govern placement across zones dynamically, for license optimization or compliance.

Who this is for: admins turning connected infrastructure into placement targets.
Prerequisites: validated cloud accounts from Part 9, with inventory collected.

A cloud zone is where “deploy a VM” becomes “this VM, on that cluster.” It is the platform’s answer to the question every deployment eventually asks: out of everything you connected, where exactly does this run? Get zones right and placement is a policy the platform enforces. Leave them vague and every deployment becomes a gamble on which cluster it lands on, which is fine until the one time it lands somewhere it should not.

Regions, zones, and projects

A cloud zone is one or more clusters that provide compute, memory, and storage, and it lives inside a region. The region is the geographic or logical grouping; the zone is the actual capacity. You assign zones to projects, and the relationship is many-to-many: a project can draw from several zones, and a zone can serve several projects. That flexibility is useful and also a rope to hang yourself with, because a zone shared across projects is shared capacity, and shared capacity needs limits and tags or it becomes a free-for-all.
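To make the many-to-many relationship concrete, here is a minimal Python sketch that inverts a project-to-zones mapping and lists the zones serving more than one project. The project and zone names are invented for illustration, and the data structure is an assumption, not how VCF Automation stores assignments.

```python
from collections import defaultdict

# Hypothetical project -> zones assignments (illustrative names only).
project_zones = {
    "dev": {"zone-a", "zone-b"},
    "prod": {"zone-b", "zone-c"},
}


def zones_to_projects(assignments):
    """Invert project -> zones into zone -> projects."""
    inverted = defaultdict(set)
    for project, zones in assignments.items():
        for zone in zones:
            inverted[zone].add(project)
    return dict(inverted)


def shared_zones(assignments):
    """Return zones attached to more than one project (shared capacity)."""
    return {
        zone: sorted(projects)
        for zone, projects in zones_to_projects(assignments).items()
        if len(projects) > 1
    }


print(shared_zones(project_zones))  # {'zone-b': ['dev', 'prod']}
```

Any zone this reports is shared capacity, which is where limits and tags matter most.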

Two numbers are worth committing to memory. A project can hold up to 3000 cloud zones, which is more than any sane design needs and tells you the limit is not your constraint. The one that does bite is the cap of 100 cloud zones in a single cloud template deployment. If a template tries to span more zones than that, it fails, so very large multi-zone deployments need to be designed within that ceiling rather than discovering it at apply time.

[Figure: Region, zones, projects. A region contains Zone A and Zone B, each made of one or more clusters. Both zones attach many-to-many to Project dev and Project prod. Up to 3000 zones per project, max 100 per deployment.]
Zones live in a region and attach to projects many-to-many. The 100-zones-per-deployment cap is the one that bites.
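The two limits can be checked at design time rather than at deployment. The sketch below is a plain Python pre-flight check using the numbers quoted above; the function name and its inputs are hypothetical and nothing here calls the product.

```python
MAX_ZONES_PER_PROJECT = 3000     # limit quoted in this article
MAX_ZONES_PER_DEPLOYMENT = 100   # limit quoted in this article


def check_zone_limits(project_zone_count, deployment_zone_count):
    """Return a list of limit violations; an empty list means the design fits."""
    problems = []
    if project_zone_count > MAX_ZONES_PER_PROJECT:
        problems.append(
            f"project has {project_zone_count} zones, "
            f"limit is {MAX_ZONES_PER_PROJECT}"
        )
    if deployment_zone_count > MAX_ZONES_PER_DEPLOYMENT:
        problems.append(
            f"deployment spans {deployment_zone_count} zones, "
            f"limit is {MAX_ZONES_PER_DEPLOYMENT}"
        )
    return problems


# A project with 250 zones is fine; a template spanning 140 of them is not.
for problem in check_zone_limits(250, 140):
    print(problem)
```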

Placement policy: choosing the host

Once a deployment is bound to a zone, placement policy decides which host inside it gets the workload. Default distributes compute across clusters and hosts based on availability, which is the sensible everyday choice. Spread by memory provisions to the cluster or host with the most free memory, which suits memory-heavy workloads you want to keep from crowding. The policy is set on the zone, so it applies to everything that lands there, and choosing it deliberately is the difference between balanced clusters and a slow drift toward one hot host.

Placement policy | What it does | Reach for it when
Default | Distributes by availability across clusters and hosts | General workloads, balanced clusters
Spread by memory | Targets the host or cluster with the most free memory | Memory-heavy workloads you want isolated

# Cloud zone configuration (example)
cloud_zone:
  name:            zone-prod-a
  region:          region-a
  placement_policy: default          # or 'spread_by_memory'
  resource_pools:  [rp-prod]         # which compute the zone offers
  capability_tags:                   # what this zone CAN satisfy
    - env:prod
    - compliance:pci
# A deployment lands here only if every tag it requires is among these capability tags.
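
As a rough illustration of how the two policies differ, here is a toy Python model of host selection. It is not the platform's algorithm. In particular, modeling “availability” for Default as “fewest running VMs” is an assumption made only so the sketch has something to compare, and the host names and numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free_memory_gb: int
    vm_count: int


def pick_host(hosts, policy="default"):
    """Pick a host under a toy model of the two placement policies."""
    if not hosts:
        raise ValueError("no hosts to choose from")
    if policy == "spread_by_memory":
        # Target the host with the most free memory.
        return max(hosts, key=lambda h: h.free_memory_gb)
    if policy == "default":
        # ASSUMPTION: "availability" approximated as the fewest running VMs.
        return min(hosts, key=lambda h: h.vm_count)
    raise ValueError(f"unknown policy: {policy}")


hosts = [
    Host("esx-01", free_memory_gb=64, vm_count=4),
    Host("esx-02", free_memory_gb=128, vm_count=9),
    Host("esx-03", free_memory_gb=32, vm_count=2),
]

print(pick_host(hosts, "default").name)           # esx-03
print(pick_host(hosts, "spread_by_memory").name)  # esx-02
```

The same three hosts yield two different answers, which is why the policy is worth choosing on purpose.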

Tags: how intent becomes placement

This is the lever that ties everything together. You put capability tags on a cloud zone to declare what it can satisfy, and compute tags on resources to filter placement to only matching capacity. Combined with the project constraints from Part 8, tags are how a deployment that says “I need PCI capacity” actually lands on the only zone tagged for it, and nowhere else. The pattern is consistent across the platform: tag the capability, then express the requirement, and let matching do the work instead of trusting a human to pick the right zone from a dropdown.

[Figure: Tags match requirement to capacity. A deployment that needs compute tag pci matches the zone tagged compliance:pci, where the workload lands; the zone tagged env:dev does not match and is filtered out.]
Capability tags on zones plus compute tags on requirements turn placement into matching, not guessing.
In practice: a deployment that cannot place is almost always a tag mismatch, not a capacity shortage. The requirement asks for a tag no zone carries, or a zone lost a tag in a redesign. Before you add hardware, diff the requirement’s tags against the zone’s capability tags.
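
The diff suggested above is a set difference. A minimal Python sketch, assuming tags are plain key:value strings and that a zone qualifies only when it carries every required tag (the zone names and tag values are illustrative):

```python
def explain_placement(required_tags, zones):
    """Map each zone to the sorted list of required tags it is missing."""
    return {
        zone_name: sorted(set(required_tags) - set(capability_tags))
        for zone_name, capability_tags in zones.items()
    }


zones = {
    "zone-prod-a": {"env:prod", "compliance:pci"},
    "zone-dev": {"env:dev"},
}
required = {"env:prod", "compliance:pci"}

report = explain_placement(required, zones)
eligible = [name for name, missing in report.items() if not missing]

print(report)    # {'zone-prod-a': [], 'zone-dev': ['compliance:pci', 'env:prod']}
print(eligible)  # ['zone-prod-a']
```

An empty eligible list with non-empty missing lists points to a tag problem, not a hardware one.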

Infrastructure Policies in 9.1

VCF Automation 9.1 adds Infrastructure Policies, which let administrators govern VM placement across zones dynamically rather than only through static tags and per-zone policy. The two motivations called out are license optimization, keeping workloads on the hosts where licensing is most efficient, and regulatory compliance, keeping regulated workloads on cleared capacity. If you are on 9.1, treat these as the higher-level governance layer above the zone-by-zone controls: tags and placement policy decide the mechanics, and Infrastructure Policies steer the fleet-wide intent. On 9.0 you achieve the same outcomes with disciplined tagging and constraints; 9.1 makes the intent first-class.

Worked example

You run two zones in region-a: zone-prod-a tagged env:prod and compliance:pci with Default placement, and zone-prod-b tagged env:prod plus a tag of its own (say workload:analytics, a name chosen here for illustration) with Spread by memory for a memory-hungry analytics workload. Both attach to the prod project. A standard app asks only for env:prod, so it can deploy to either zone and is placed by whichever policy that zone carries; the analytics app also requires the tag that only zone-prod-b carries, so it always lands there, on the host with the most free memory; and a PCI workload requires compliance:pci, so it can only ever land on zone-prod-a. Three different placement outcomes from two zones, all driven by tags and policy, none requiring anyone to choose a cluster by hand.
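
The worked example can be replayed as a small Python sketch. The zone names and the env:prod and compliance:pci tags come from the example. The workload:analytics tag is an assumption, standing in for whatever tag only zone-prod-b carries, and the matching rule (a zone must hold every required tag) is a simplification.

```python
# Zones from the worked example. "workload:analytics" is a hypothetical tag.
ZONES = {
    "zone-prod-a": {
        "tags": {"env:prod", "compliance:pci"},
        "policy": "default",
    },
    "zone-prod-b": {
        "tags": {"env:prod", "workload:analytics"},
        "policy": "spread_by_memory",
    },
}


def eligible_zones(required_tags):
    """Return the names of zones that carry every required tag."""
    return sorted(
        name for name, zone in ZONES.items()
        if set(required_tags) <= zone["tags"]
    )


workloads = {
    "standard-app": {"env:prod"},
    "analytics-app": {"env:prod", "workload:analytics"},
    "pci-app": {"env:prod", "compliance:pci"},
}

for name, tags in workloads.items():
    print(name, "->", eligible_zones(tags))
# standard-app -> ['zone-prod-a', 'zone-prod-b']
# analytics-app -> ['zone-prod-b']
# pci-app -> ['zone-prod-a']
```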

Placement policy, DRS, and resource pools

Placement policy and vSphere share the job, and most placement surprises come from forgetting where one ends and the other begins. The policy you set on a cloud zone decides initial placement, where a workload first lands. Once it is running, vSphere DRS keeps balancing it by its own rules. The two are not enemies, but they can pull against each other. A zone set to Spread by memory pointed at a cluster with aggressive DRS will place on the freest-memory host and then watch DRS move it minutes later. That is not a fault; it means your policy chose the door and DRS chose the room.

[Figure: Two ways to choose a host. Default spreads evenly, giving roughly equal load; Spread by memory lands the workload on the host with the most free memory.]
Default keeps hosts even; Spread by memory steers the workload to the host with the most headroom.

Resource pools are the other lever people underuse. A cloud zone is scoped to the resource pools you assign it, so the pool is your boundary between what the zone can touch and what it cannot, even within one cluster. If a single cluster serves both a production zone and a development zone, separate resource pools keep them from stealing each other’s reservations. Assign the pool deliberately, because a zone pointed at the cluster root sees everything, which is rarely what you want on shared hardware. The pool is where you draw the line that the zone then enforces.
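
One way to audit those boundaries is to look for zones that share a pool on the same cluster, or that have no pool at all. The Python sketch below works on a hypothetical inventory dictionary. Treating an empty pool set as “pointed at the cluster root” is an assumption of this sketch, not a documented representation.

```python
from itertools import combinations

# Hypothetical inventory. An empty pool set stands for "cluster root" here.
zones = {
    "zone-prod": {"cluster": "cluster-01", "pools": {"rp-prod"}},
    "zone-dev": {"cluster": "cluster-01", "pools": {"rp-dev"}},
    "zone-test": {"cluster": "cluster-01", "pools": set()},
}


def boundary_problems(zones):
    """List zones scoped to a cluster root or sharing a pool with another zone."""
    problems = []
    ordered = sorted(zones.items())
    for name, zone in ordered:
        if not zone["pools"]:
            problems.append(f"{name}: scoped to the root of {zone['cluster']}")
    for (a, zone_a), (b, zone_b) in combinations(ordered, 2):
        if zone_a["cluster"] != zone_b["cluster"]:
            continue
        shared = zone_a["pools"] & zone_b["pools"]
        if shared:
            problems.append(
                f"{a} and {b} share {sorted(shared)} on {zone_a['cluster']}"
            )
    return problems


print(boundary_problems(zones))
# ['zone-test: scoped to the root of cluster-01']
```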

And mind capacity, not just configuration. A zone that is perfectly tagged and policied still cannot place a workload it has no room for, and that failure reads like a placement error rather than a capacity one. When a deployment will not land and the tags match, check the free capacity in the zone’s resource pools before you suspect the policy. The order I work it is always the same: tags match, capacity present, then policy. The large majority of placement mysteries are solved by the first two checks, and reaching for the third before them is how an afternoon disappears.
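
That triage order translates directly into a short check. This is a minimal Python sketch with a hypothetical zone record; it reduces capacity to a single free-memory figure, which is a simplification, and it does not query any real inventory.

```python
def diagnose(required_tags, needed_memory_gb, zone):
    """Apply the triage order: tags first, then capacity, then policy."""
    missing = set(required_tags) - set(zone["capability_tags"])
    if missing:
        return f"tag mismatch: zone lacks {sorted(missing)}"
    if zone["free_memory_gb"] < needed_memory_gb:
        return (
            f"capacity shortage: need {needed_memory_gb} GB, "
            f"zone has {zone['free_memory_gb']} GB free"
        )
    return "tags and capacity look fine: review the placement policy"


# Hypothetical zone record (values are illustrative).
zone = {
    "capability_tags": {"env:prod", "compliance:pci"},
    "free_memory_gb": 16,
}

print(diagnose({"env:prod", "tier:gold"}, 8, zone))
# tag mismatch: zone lacks ['tier:gold']
print(diagnose({"env:prod"}, 32, zone))
# capacity shortage: need 32 GB, zone has 16 GB free
print(diagnose({"env:prod"}, 8, zone))
# tags and capacity look fine: review the placement policy
```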

My Take

Cloud zones are where placement stops being luck and becomes design. My recommendation: keep zones meaningful, one per real capacity boundary rather than a sprawl you cannot reason about; set placement policy on purpose, Default for general workloads and Spread by memory where it earns its keep; and lean hard on capability and compute tags so every deployment lands by matching, not by a person picking from a list. Mind the 100-zones-per-deployment ceiling when you design anything large. And if you are on 9.1, adopt Infrastructure Policies for the fleet-wide intent while keeping tags and policy as the mechanics underneath. Do this and where a workload runs becomes a property of your design, not a question someone answers nervously at deploy time.

Next we wire the building blocks templates reference: flavor mappings, image mappings, and network and storage profiles. For the constraints that pair with zone tags, revisit Part 8 on projects. Which placement policy do you default to, and have tags saved you from a bad landing? Tell me in the comments.


About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.

Discover more from Dr. Pranay Jha

Subscribe now to keep reading and get access to the full archive.

Continue reading