Dr. Pranay Jha


Approval, Lease and Day-2 Policies in VCF Automation: Governing the Catalog (VCF Automation 9 Series, Part 16)

A self-service catalog without policies is unsupervised spending. Here is how approval, lease and day-2 action policies work in VCF Automation 9, how they combine when several match, and the one missing policy that fills your cluster with zombie VMs.

VCF Automation 9 Series · Part 16 of 41
TL;DR · Key Takeaways
  • Three policy types do the governing on the consumption side: approval (review before a request runs), lease (a deployment has a finite life), and day-2 actions (which actions a user may run on a live deployment).
  • Policies are scoped to the whole organization or selected projects, then narrowed with criteria. No matching policy means no constraint, which is exactly how clusters fill with deployments that never expire.
  • When several policies match one request, they combine: approvers are the union, the lease is the minimum term, and the most restrictive behavior wins. Plan for the combination, not the single policy.
  • Every policy is an object on the same Policy API (/policy/api/policies) keyed by a typeId. In VCF 9.1 the vmware/vcfa Terraform provider can manage them as policy-as-code.
  • My default on day one: a lease policy on every non-production project before a single catalog item ships. It is the cheapest insurance you will buy.

Who this is for: Cloud and platform admins and organization administrators who own governance on a VCF Automation catalog, plus project admins who get handed approver duty.

Prerequisites: A VCF Automation 9.x instance with at least one project and a published catalog item. Organization administrator rights to create policies. The catalog and sharing work from the earlier Parts already in place.

A self-service catalog without policies is not self-service. It is unsupervised spending with a nicer UI. The moment you share a template to a project, any member can deploy it, keep it forever, and run any action on it, including the destructive ones, unless something says otherwise. Policies are that something. They are the difference between a catalog you can hand to a tenant and a catalog you have to babysit.

This Part covers the three governance policies on the consumption side of VCF Automation 9 (the product formerly known as VMware Aria Automation, and before that vRealize Automation): approval, lease, and day-2 actions. I am writing against VCF 9.1, the current release. The mechanics are consistent: every policy has a type, a scope, optional criteria, and an enforcement behavior, and they all live on one Policy API. Get the model right and governance becomes declarative. Get it wrong and you find out at quota.

Where the three policies act in a deployment’s life

The three policies do not compete; they act at different moments. Approval acts before anything is provisioned. Lease acts across the whole life of the deployment and ends it. Day-2 actions govern what happens in between, while the deployment is live. Picture the timeline and the placement of each policy becomes obvious.

[Figure: When each policy acts. Approval before, lease across the whole life, day-2 in between. Steps: (1) Request: user requests a catalog item; (2) Approval: hold for review, if a policy matches; (3) Provision: deployment created, live; (4) Day-2: allowed actions only; (5) Lease end: expire and reclaim. The lease clock runs from step 3.]
Approval gates the request; the lease clock starts at provision and ends the deployment; day-2 governs the middle.

Approval policies

An approval policy holds a request, either a new deployment or a day-2 action, until a designated approver reviews it. You decide the scope, who approves, and what triggers it. Approvers can be named users or groups, or a role such as Project Administrator that is resolved automatically within the policy scope. That role option is the one I reach for, because naming individuals turns every personnel change into a policy edit.

Scope, role and criteria

Every consumption policy shares the same three controls. Scope is the blast radius: the whole organization or selected projects. Role decides which actor the policy watches. Criteria narrow further by attributes such as project name, the catalog item, or request cost, so you can require approval only above a threshold. A good approval policy is specific: approval on production-tier requests over a set size, not approval on everything, because blanket approval trains people to rubber-stamp.

What an approval actually carries

An approval policy defines an approval mode (any one approver, or all of them), an auto-decision if no one acts in time, and an expiry on that auto-decision. Set the auto-decision to reject for anything that costs real money. An approval that silently auto-approves after a timeout is not governance; it is a delay.
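To make the interaction between approval mode and auto-decision concrete, here is a minimal sketch in plain shell. It is a simplified model, not the product's evaluation logic: the function name resolve_at_timeout and the vote labels are hypothetical, and it assumes that any recorded rejection ends the request.

```shell
#!/bin/sh
# Simplified model only; VCF Automation evaluates approvals internally.
# resolve_at_timeout MODE AUTO VOTE...
#   MODE: "any" or "all"
#   AUTO: the policy's auto-decision, "approve" or "reject"
#   VOTE: "approve", "reject", or "pending" (one per approver)
resolve_at_timeout() {
  mode=$1; auto=$2; shift 2
  approvals=0; total=0
  for v in "$@"; do
    total=$(( total + 1 ))
    case "$v" in
      reject)  echo rejected; return ;;   # assumed: one rejection ends it
      approve) approvals=$(( approvals + 1 )) ;;
    esac
  done
  if [ "$mode" = "any" ] && [ "$approvals" -ge 1 ]; then
    echo approved; return
  fi
  if [ "$mode" = "all" ] && [ "$total" -gt 0 ] && [ "$approvals" -eq "$total" ]; then
    echo approved; return
  fi
  # Review did not complete in time: the auto-decision applies.
  if [ "$auto" = "approve" ]; then echo approved; else echo rejected; fi
}

resolve_at_timeout all reject  approve pending   # rejected
resolve_at_timeout all approve approve pending   # approved
resolve_at_timeout any reject  approve pending   # approved
```

The first two calls differ only in the auto-decision: the same stalled review fails safe with reject and leaks through with approve.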

Lease policies

A lease policy sets the maximum time a deployment can exist. When the lease expires, the deployment is destroyed and its resources are reclaimed, with no admin intervention. Three numbers define it: the maximum lease term a user can pick, a total maximum that caps repeated extensions, and a grace period before destruction. The lease is the single most effective control against sprawl, and it is the one most often skipped because nothing forces you to set it.

Gotcha · No lease policy means no expiry

If no lease policy matches a deployment, the deployment never expires. There is no hidden default. On every brownfield environment I review, the cluster that is mysteriously full of powered-on dev VMs from eight months ago is a project that was shared a catalog and never given a lease policy. The fix is one policy. The cost of skipping it is real capacity.

Worked example · A sane dev lease

For a non-production project I set a maximum lease term of 14 days, a total maximum of 30 days, and a 3-day grace. A developer who deploys gets up to 14 days, can extend, but the deployment cannot live past 30 days total. At expiry it enters a 3-day grace where it is marked for destruction before resources are reclaimed.

The math that matters: with 25 active developers each running 2 short-lived stacks, steady-state inventory is about 25 × 2 = 50 deployments, and the 30-day cap means none of them is older than 30 days plus the 3-day grace. Without the cap, that pile is unbounded. That is the difference between predictable capacity and a quarterly cleanup project.
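The lease numbers in this example can be checked with a few lines of shell arithmetic. This is an illustrative sketch, assuming an extension is limited by both the per-lease maximum and whatever remains under the total maximum; the function name allowed_extension is hypothetical and nothing here calls VCF Automation.

```shell
#!/bin/sh
# Illustrative lease arithmetic only; values mirror the worked example.
LEASE_TERM_MAX=14     # longest single lease a user can pick (days)
LEASE_TOTAL_MAX=30    # cap on total lifetime across extensions (days)
LEASE_GRACE=3         # days between expiry and reclaim

# allowed_extension AGE_DAYS: days that could still be added to a deployment
# that has already lived AGE_DAYS, under the two caps above (assumed model).
allowed_extension() {
  remaining=$(( LEASE_TOTAL_MAX - $1 ))
  if [ "$remaining" -lt 0 ]; then remaining=0; fi
  if [ "$remaining" -gt "$LEASE_TERM_MAX" ]; then remaining=$LEASE_TERM_MAX; fi
  echo "$remaining"
}

allowed_extension 0                          # 14
allowed_extension 14                         # 14
allowed_extension 28                         # 2
echo $(( LEASE_TOTAL_MAX + LEASE_GRACE ))    # 33 (latest day before reclaim)
echo $(( 25 * 2 ))                           # 50 (developers x stacks)
```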

Day-2 action policies

Day-2 action policies control which actions a group of users can run on a live deployment: power operations, resize, snapshot, reconfigure, delete. You build an allowlist of permitted actions per scope and role, which keeps users from triggering destructive or costly changes. The framing I use with teams: a deployment is not finished at provision time, it is finished when it is reclaimed, and everything in between is a day-2 action someone could run. Decide who can run which, deliberately.

The common mistake is leaving delete and resize open to all project members. Scope the expensive and irreversible actions to admins, leave power and reboot to everyone, and pair the policy with an approval policy on the actions that change cost. Day-2 and approval compose well: the day-2 policy decides what is offered, the approval policy decides whether running it needs a sign-off.
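The allowlist split described above can be sketched as a small lookup. This is a toy model of a single matched policy, not the product's implementation: the role names, the action names, and the action_allowed function are all shorthand chosen for illustration.

```shell
#!/bin/sh
# Illustrative allowlist; action names are shorthand, not product identifiers.
MEMBER_ACTIONS="power-on power-off reboot"
ADMIN_ACTIONS="$MEMBER_ACTIONS resize snapshot delete"

# action_allowed ROLE ACTION: returns 0 if the role's allowlist contains
# the action, 1 otherwise.
action_allowed() {
  case "$1" in
    admin)  list=$ADMIN_ACTIONS ;;
    member) list=$MEMBER_ACTIONS ;;
    *)      return 1 ;;          # unknown role: deny in this sketch
  esac
  case " $list " in
    *" $2 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

action_allowed member reboot && echo "member reboot: allowed"
action_allowed member delete || echo "member delete: denied"
action_allowed admin  delete && echo "admin delete: allowed"
```

Keeping the member list as a strict subset of the admin list makes the review question simple: which actions appear only on the admin line, and are those the expensive or irreversible ones?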

Policy | What it controls | API typeId
Approval | Holds deploy and day-2 requests for review | com.vmware.policy.approval
Lease | Maximum lifetime before auto-destroy | com.vmware.policy.deployment.lease
Day-2 actions | Which actions users may run on a deployment | com.vmware.policy.deployment.action
Deployment limit | Caps concurrent deployments in scope | com.vmware.policy.deployment.limit
Content sharing | What a project may request (see Part 14) | com.vmware.policy.catalog.entitlement

How policies combine when several match

This is the part that surprises people, and it is where governance designs quietly go wrong. A single request can match more than one policy, for example an org-wide policy plus a project-specific one. VCF Automation does not pick a winner and ignore the rest. It combines them, and the combination leans restrictive.

For approvals, every matching policy is enforced and the approvers become the union of all of them. The auto-decision rejects if any matched policy says reject. The expiry collapses to the minimum number of days across the matched policies. For leases, the effective term is the shortest one that applies. The rule of thumb: assume the strictest matching policy is the one that governs, because in practice it is.

[Figure: Two policies match: how they combine. Union of approvers, minimum lease, most restrictive wins. Org policy (approver: Finance, lease 30d) and Project policy (approver: Manager, lease 14d) combine by union / min / most restrictive. Effective: approvers Finance + Manager; lease 14 days (the shorter).]
Both approvers are required and the shorter lease wins. Stacking policies tightens, it does not loosen.
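The three combination rules can be written down as three small functions. This is a sketch of the rules as described in this Part, useful for reasoning about a policy set on paper; the function names are hypothetical, and the real evaluation happens inside VCF Automation.

```shell
#!/bin/sh
# Simplified model of the combination rules; not the product's code.

# min_lease DAYS...: the effective lease is the shortest matching term.
min_lease() {
  min=$1
  for d in "$@"; do
    if [ "$d" -lt "$min" ]; then min=$d; fi
  done
  echo "$min"
}

# union_approvers NAME...: every distinct approver across matching policies,
# printed as a sorted, comma-separated list.
union_approvers() {
  printf '%s\n' "$@" | sort -u | paste -sd, -
}

# auto_decision DECISION...: reject if any matching policy says reject.
auto_decision() {
  for a in "$@"; do
    if [ "$a" = "reject" ]; then echo reject; return; fi
  done
  echo approve
}

min_lease 30 14                             # 14
union_approvers Finance Manager Finance     # Finance,Manager
auto_decision approve reject                # reject
```

Feeding in the org and project policies from the figure gives the same effective result: both approvers, a 14-day lease.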

My take: Build governance in two layers. A thin org-wide baseline (every deployment gets some lease, expensive requests get some approval), then project-level policies that tighten. Because the combination is always more restrictive, the baseline can never be undone by a project, only made stricter. That is a property worth designing around.

Policy scope and the silent gap

Scope narrows in three stages: organization, then projects, then criteria. The danger lives at the bottom of that funnel. If a deployment falls through every stage without a match, there is no policy, and no policy means no constraint. Approval that does not match is no approval. A lease that does not match is an immortal deployment. When you audit governance, do not only check that policies exist; check that nothing important slips past all of them.

[Figure: The scope funnel and the gap. What matches nothing is governed by nothing. Organization scope, then selected projects, then criteria; no match = no constraint.]
A request that matches no policy at any stage runs ungoverned. Audit for the gap, not just the policies.
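One way to audit for the gap is to compare the list of projects against the list of projects that some lease policy covers. The sketch below assumes there is no organization-wide lease policy and that both lists have been exported by hand; the project names and the ungoverned_projects function are made up for illustration.

```shell
#!/bin/sh
# Hypothetical inventory: in practice both lists would come from your own
# exports of projects and lease-policy scopes.
ALL_PROJECTS="dev-web dev-data qa-perf prod-core"
LEASED_PROJECTS="dev-web prod-core"

# ungoverned_projects: print each project that no lease policy covers.
ungoverned_projects() {
  for p in $ALL_PROJECTS; do
    case " $LEASED_PROJECTS " in
      *" $p "*) ;;             # covered by at least one lease policy
      *) echo "$p" ;;          # the silent gap
    esac
  done
}

ungoverned_projects
# prints:
# dev-data
# qa-perf
```

This only checks project scope. Criteria can still exclude a request inside a covered project, so deploying as an ordinary member remains the definitive test.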

Doing it as code

Clicking policies into the UI is fine for the first project. For consistency across instances, drive them through the Policy API or, in VCF 9.1, manage them as policy-as-code with the vmware/vcfa Terraform provider, which added support for approval, day-2, IaaS, and lease policies. Below is a lease policy created directly on the API. Every policy type uses the same endpoint and shape; only the typeId and the definition change. Token acquisition gets its own Part later in the series; here a bearer token is assumed.

# Create a lease policy scoped to one project.
# Note: -k skips TLS certificate verification; use it only in a lab.
curl -s -k -X POST \
  'https://automation.example.local/policy/api/policies' \
  -H 'Authorization: Bearer ACCESS_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
        "name": "lease-dev-30d-cap",
        "typeId": "com.vmware.policy.deployment.lease",
        "enforcementType": "HARD",
        "projectId": "PROJECT_ID",
        "definition": {
          "leaseGrace": 3,
          "leaseTermMax": 14,
          "leaseTotalTermMax": 30
        }
      }'

# Expected: HTTP 201 with the policy id.
# Swap typeId + definition to create approval or day-2 policies
# on the same endpoint. Common failure: a 201 that still does
# nothing because the scope/criteria never match real requests.

Disclaimer: Lease policies destroy deployments and reclaim resources at expiry. Test new lease and day-2 policies in a non-production project, confirm the scope and criteria match real requests, and warn tenants before you apply a lease to existing deployments. A misscoped lease can reclaim something a team still needs.

A first-week governance baseline

When I stand up governance on a new VCF Automation instance, I do not try to model every rule on day one. I lay a thin baseline and tighten later. The first pass is deliberately small, because an over-engineered policy set is as hard to reason about as none at all, and the combination logic means I can always make it stricter per project without unpicking the baseline.

Concretely, week one is four moves. First, an organization-wide lease policy so nothing is immortal, generous enough that it never surprises a team but tight enough to catch the forgotten stack. Second, a day-2 action policy that scopes delete, resize, and snapshot to administrators while leaving power and reboot open to all project members. Third, an approval policy that triggers only on high-cost or production-tier requests, with the auto-decision set to reject so a stalled approval fails safe instead of leaking spend. Fourth, a short audit pass that deploys a throwaway item as an ordinary project member and confirms the lease attaches and the restricted day-2 actions are hidden.

That audit step is the one teams skip and the one that catches the silent gap. A policy that exists but never matches looks identical to working governance until the day a deployment slips through every scope and runs unbounded. Deploying as a real user, not an admin, is the only test that proves the policies bind to the people they are meant to bind to. Once that baseline holds, layering project-specific approval thresholds and shorter leases is a small, safe change rather than a redesign.


The Bottom Line

If you do one thing, set a lease policy on every non-production project before the catalog opens, because the absence of a lease is the failure that costs the most and shows up the latest. Add a thin org-wide approval policy for high-cost requests, scope day-2 actions so delete and resize are not open to everyone, and let project-level policies tighten from there. I would not lean on approval policies as the primary control, because approval fatigue makes them theater within a month; leases and day-2 allowlists do the quiet work without a human in the loop. Validate first that your scopes and criteria actually match real requests, since a policy that matches nothing is the most expensive kind: it looks like governance and enforces nothing.

Pick your busiest non-production project this week, add a lease policy, and request a deployment to watch the lease clock start. If anything in your estate has lived past a sensible lease, you have just found your first cleanup, and your governance baseline.


About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
