Dr. Pranay Jha

VMware • Cloud • AI • Enterprise Architecture


NSX 9 Security Groups, Tags and Dynamic Membership: Policy That Follows the Workload (NSX Series, Part 13)

Groups and tags are what make NSX firewall rules describe workloads instead of IP addresses. Static vs dynamic membership, the tag and scope model, and effective members.

NSX Series · Part 13 of 30

TL;DR · Key Takeaways

  • A group collects workloads by static criteria (IP, segment, NSX object) or dynamic criteria (VM name, OS, or security tag). Dynamic groups update themselves as workloads change.
  • Tags are the recommended way to drive grouping in VCF. A tag is a value with an optional scope, so you can express things like scope=app, tag=web.
  • Tag a VM and it automatically joins the matching group and inherits its firewall policy. This is what makes security follow the workload instead of chasing IP addresses.
  • Always check effective members: the realized list of what a group actually contains. A rule is only as correct as the membership behind its group.
  • Build a deliberate tagging taxonomy early. Tag sprawl, with no scopes and no convention, is how a clean model turns into an unmaintainable one.

Who this is for: security and platform teams building maintainable micro-segmentation on NSX 9.
Prerequisites: a working Distributed Firewall (Part 12), which is what consumes these groups.

Here is the difference between micro-segmentation that lasts and micro-segmentation that rots. The version that rots is built on IP addresses: you write a rule allowing 10.0.2.0/24 to reach 10.0.5.40, and it is correct on the day you write it and slowly wrong forever after, as VMs are rebuilt, re-addressed, and moved. The version that lasts describes workloads by what they are, not where they are. You allow the group “web” to reach the group “app,” and when a new web server is born and tagged, it joins the group and inherits the policy with no rule change at all. Groups and tags are the machinery that makes that possible, and they are the difference between a firewall you maintain and one that maintains itself.

Static vs dynamic membership

A group is a named collection of workloads, and how a workload gets into it is the whole story. Static membership means you list the members explicitly: these IP addresses, these segments, these specific NSX objects. It is precise and predictable, and it is the right choice for things that genuinely do not change, such as a fixed set of physical appliance IPs. Dynamic membership means you define criteria, and any workload that matches is included automatically and continuously. The main criteria you can match on are the VM name, the VM operating system, and security tags, combined with the logical operators AND and OR. Dynamic is where the power is, because the group reflects the live state of your estate without anyone editing a list.

[Figure: Two ways into a group. STATIC (you list members): 10.0.9.10, segment: db-net. Precise, but you maintain the list by hand. DYNAMIC (you set criteria): tag equals app=web. Any VM tagged web joins automatically. Rebuild it, move it, scale it out: still a member. No list to maintain.]
Static lists members; dynamic matches them. Dynamic is how a group keeps up with a changing estate.
Membership | Criteria | Use it for
Static | IP addresses, segments, specific NSX objects. | Fixed things: physical appliances, set ranges.
Dynamic: tag | Security tag (with optional scope). | The default. Role, app tier, environment.
Dynamic: VM name | Name matches a pattern. | When naming is strict and consistent.
Dynamic: OS | VM operating system name. | OS-wide policy, for example patch sources.
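
To make the two membership styles concrete, here is a minimal sketch of what the group definitions might look like as Policy API request bodies. The field names (expression, Condition, IPAddressExpression, and the "scope|tag" value format) follow the Policy API group schema as I know it from earlier NSX releases, so treat them as assumptions and check them against the NSX 9 API reference. The group names are hypothetical, and the sketch only builds the payloads; it does not call NSX.

```python
import json


def tag_condition(scope, tag, member_type="VirtualMachine"):
    """Build one dynamic-membership criterion that matches a scoped tag."""
    return {
        "resource_type": "Condition",
        "member_type": member_type,
        "key": "Tag",
        "operator": "EQUALS",
        # Assumption: a scoped tag is written as "scope|tag" in the value.
        "value": f"{scope}|{tag}",
    }


def dynamic_group(name, scope, tag):
    """A group whose members are whatever currently carries the tag."""
    return {"display_name": name, "expression": [tag_condition(scope, tag)]}


def static_ip_group(name, ip_addresses):
    """A group whose members are an explicit, hand-maintained IP list."""
    return {
        "display_name": name,
        "expression": [
            {
                "resource_type": "IPAddressExpression",
                "ip_addresses": list(ip_addresses),
            }
        ],
    }


# Hypothetical groups: one dynamic (by tag), one static (by IP).
web = dynamic_group("web", "app", "web")
appliances = static_ip_group("physical-appliances", ["10.0.9.10"])

print(json.dumps(web, indent=2))
print(json.dumps(appliances, indent=2))
```

Notice that the dynamic body never names a VM. That is the point: the static body has to be edited every time the list changes, while the dynamic one does not.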

Tags and scope

A security tag is a label you attach to a workload, and the smart part is the optional scope, which turns a flat label into a key-value pair. Instead of a pile of unrelated tags, you build a small, deliberate taxonomy: scope app with values like web, app, and db; scope env with values like prod, test, and dev; scope os with values like windows and linux. A single VM carries several of these at once, so a production web server is tagged app=web, env=prod, os=linux, and your groups slice the estate along whichever axis a rule needs. This is the part that makes the whole model scale: scopes give your tags structure, and structure is what keeps a few hundred workloads from becoming a tag swamp.

[Figure: Scoped tags: structure, not a label pile. web-vm-01 carries app = web, env = prod, os = linux, and belongs to group "all web tiers" and group "all production". One VM, many memberships. Each rule slices along the scope it actually cares about.]
Scoped tags turn labels into a taxonomy. One workload belongs to several groups, each along a different axis.
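
A small in-memory model shows how one workload ends up in several groups at once. This is plain Python, not NSX code: the Tag and VM classes, the members helper, and the two extra VM names are all hypothetical, made up purely to illustrate slicing an inventory along one scope at a time.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tag:
    scope: str
    tag: str


@dataclass
class VM:
    name: str
    tags: set = field(default_factory=set)


def members(inventory, scope, tag):
    """Return the names of VMs carrying the given scoped tag."""
    wanted = Tag(scope, tag)
    return sorted(vm.name for vm in inventory if wanted in vm.tags)


inventory = [
    VM("web-vm-01", {Tag("app", "web"), Tag("env", "prod"), Tag("os", "linux")}),
    VM("web-vm-02", {Tag("app", "web"), Tag("env", "test"), Tag("os", "linux")}),
    VM("db-vm-01", {Tag("app", "db"), Tag("env", "prod"), Tag("os", "linux")}),
]

all_web = members(inventory, "app", "web")
all_prod = members(inventory, "env", "prod")

print(all_web)   # ['web-vm-01', 'web-vm-02']
print(all_prod)  # ['db-vm-01', 'web-vm-01']
```

web-vm-01 shows up in both lists, once along the app axis and once along the env axis, without anyone adding it to either.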

The flywheel: tag, group, policy

Put the pieces together and you get a self-sustaining loop that is genuinely satisfying to operate. You write a DFW rule once, in terms of groups: allow group web to reach group app on 8443. The group web is dynamic, matching the tag app=web. From then on, the lifecycle takes care of itself. A new web server is provisioned, your automation tags it app=web, it joins the group, and the existing rule applies to it the instant it appears, with no human touching the firewall. Decommission a server and it drops out of the group and the policy with it. The firewall stops being a list of addresses you chase and becomes an expression of intent that the platform keeps true. This is the payoff of everything in Parts 12 and 13 together.

[Figure: Tag once, policy applies itself. 1. New VM tagged (app = web). 2. Joins group (dynamic, by tag). 3. DFW rule applies (no rule change). 4. Secured (automatically). The human action is tagging at provisioning. The firewall does the rest, every time.]
The tag is the only manual step. Group membership and policy follow on their own.
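
The loop is easy to simulate. The sketch below is a toy model rather than how the DFW evaluates traffic: the rule is written once in terms of tag criteria, the inventory is a plain dictionary, and the default-deny fallback is an assumption of the sketch. All VM names are hypothetical.

```python
# One rule, written once, in terms of groups (tag criteria), never addresses.
RULES = [
    {
        "source": ("app", "web"),
        "destination": ("app", "app"),
        "port": 8443,
        "action": "ALLOW",
    },
]


def is_allowed(inventory, src, dst, port):
    """Evaluate the rules against the live tags of src and dst."""
    src_tags = inventory.get(src, set())
    dst_tags = inventory.get(dst, set())
    for rule in RULES:
        if (
            rule["port"] == port
            and rule["source"] in src_tags
            and rule["destination"] in dst_tags
        ):
            return rule["action"] == "ALLOW"
    return False  # assumption for this sketch: anything unmatched is denied


inventory = {
    "web-01": {("app", "web")},
    "app-01": {("app", "app")},
}

before = is_allowed(inventory, "web-02", "app-01", 8443)

# Provisioning automation builds web-02 and tags it. RULES is untouched.
inventory["web-02"] = {("app", "web")}
after_tagging = is_allowed(inventory, "web-02", "app-01", 8443)

# Decommissioning removes the VM, and its access goes with it.
del inventory["web-02"]
after_removal = is_allowed(inventory, "web-02", "app-01", 8443)

print(before, after_tagging, after_removal)  # False True False
```

The only thing that changed between the three checks was the inventory. The rule list was never edited.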

Effective members: trust but verify

A dynamic group is only as correct as the membership it actually computes, and the place to check that is the effective members view, which shows you the realized list of VMs, IP addresses, segments, ports, and interfaces the group currently contains. When a rule does not behave, this is the first thing I look at, because the usual cause is not the rule, it is a membership that does not match what you assumed: a VM that was never tagged, a tag with a typo, a scope mismatch. The UI shows effective members, and you can pull the same data from the Policy API to feed checks into automation. Verify membership before you blame the rule.

# Apply a tag to a VM's logical port via the Policy API (NSX 9)
# Tags carry a tag value and an optional scope.
POST /policy/api/v1/infra/...
{ "tags": [ { "scope": "app", "tag": "web" } ] }

# Read what a group actually contains (effective members)
GET /policy/api/v1/infra/domains/default/groups/<group-id>/members/virtual-machines

# If a VM is missing here, the tag or scope is wrong, not the rule.

In practice: the most common “the firewall is broken” ticket I get is a group whose effective members do not include the VM everyone assumed was tagged. Ninety seconds in the effective-members view, not an hour in the rule base, is where that one gets solved.
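
If you want that check in automation, the comparison itself is a few lines. This sketch assumes you have already fetched the effective-members response and parsed it as JSON, and that it has a results list whose entries carry a display_name. That response shape is my assumption from typical Policy API list responses, so confirm it against your NSX 9 manager. The VM names are hypothetical.

```python
def missing_members(expected_names, members_response):
    """Return the VMs you expected in the group that NSX did not realize."""
    actual = {
        vm.get("display_name")
        for vm in members_response.get("results", [])
    }
    return sorted(set(expected_names) - actual)


# Hypothetical parsed response from the effective-members call.
response = {
    "results": [
        {"display_name": "web-vm-01"},
        {"display_name": "web-vm-02"},
    ],
    "result_count": 2,
}

expected = ["web-vm-01", "web-vm-02", "web-vm-03"]
missing = missing_members(expected, response)

print(missing)  # ['web-vm-03']
```

Anything in that list is a VM to go and look at: check its tag spelling and its scope before touching a rule.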

Tagging at scale, without the swamp

The model only stays clean if the tagging does. Two disciplines keep it healthy. First, agree a small taxonomy of scopes before anyone tags a workload: a handful of scopes like app, env, tenant, and os, each with a defined set of allowed values. Treat additions to it as a deliberate decision rather than a free-for-all. Second, tag at provisioning, automatically, as part of how a VM is built, so the tag exists from the first second and no one has to remember to add it later. Tags applied by hand after the fact are the ones that get forgotten, mistyped, and left stale. When teams skip both disciplines, you end up with hundreds of unscoped, inconsistent tags and groups nobody trusts, which is worse than no tags at all because it looks like security while quietly being noise.

A starter tagging taxonomy

Theory is easy; the hard part is committing to a small set of scopes and sticking to them. This is the starter taxonomy I hand teams to argue over. It is deliberately tiny, because four well-chosen scopes cover the overwhelming majority of real segmentation needs, and every scope you add is one more thing every workload has to be tagged for correctly. Start here, extend only with a clear reason.

Scope | Example values | What it lets you express
app | web, app, db | The application tier of a workload.
env | prod, test, dev | Keep production and development apart.
tenant | finance, hr, retail | Which business unit or customer owns it.
os | windows, linux | OS-wide rules like patch or AV sources.
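
A taxonomy is only useful if something enforces it. Here is a minimal sketch of a validator you could run in a provisioning pipeline before a tag ever reaches NSX. The allowed values come from the table above. The validate_tags function and the choice of app and env as required scopes are my assumptions for illustration.

```python
# Allowed scopes and values, taken from the starter taxonomy.
TAXONOMY = {
    "app": {"web", "app", "db"},
    "env": {"prod", "test", "dev"},
    "tenant": {"finance", "hr", "retail"},
    "os": {"windows", "linux"},
}


def validate_tags(tags, required_scopes=("app", "env")):
    """Return a list of problems with a VM's (scope, value) tag pairs."""
    problems = []
    seen = set()
    for scope, value in tags:
        if not scope:
            problems.append(f"unscoped tag: {value}")
        elif scope not in TAXONOMY:
            problems.append(f"unknown scope: {scope}")
        elif value not in TAXONOMY[scope]:
            problems.append(f"value '{value}' not allowed for scope '{scope}'")
        seen.add(scope)
    for scope in required_scopes:
        if scope not in seen:
            problems.append(f"missing required scope: {scope}")
    return problems


good = validate_tags([("app", "web"), ("env", "prod"), ("os", "linux")])
bad = validate_tags([("app", "wbe"), ("", "prod")])

print(good)  # []
print(bad)
```

The second call catches exactly the failures described above: a typo in a value, a tag with no scope, and a required scope that was never set.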

Putting groups to work in rules

Groups become powerful at the moment you stop thinking of them as lists and start thinking of them as the vocabulary of your security policy. A clean three-tier app rule reads almost like a sentence: allow the web group to reach the app group on the app port, allow the app group to reach the db group on the database port, and deny everything else between them. Every one of those nouns is a dynamic group, so the rules never mention an IP address and never need editing as the app scales. The same groups also make excellent Applied-To values, which closes the loop with Part 12: scope a rule’s enforcement to exactly the group it concerns, and a mistake can only ever touch those workloads.
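
Written as data, the three-tier policy might look like the sketch below. The field names (source_groups, destination_groups, services, action, scope for Applied-To, and the literal "ANY") are how I remember the Policy API security policy schema, so verify them against the NSX 9 API reference. The two service paths are hypothetical placeholders for services you would define yourself, and I have used DROP for the deny rule as an assumption.

```python
import json

GROUPS = "/infra/domains/default/groups"


def group_path(group_id):
    return f"{GROUPS}/{group_id}"


def rule(name, sequence, sources, destinations, services, action):
    """Build one rule whose Applied-To is exactly the groups it mentions."""
    applied_to = sorted(set(sources) | set(destinations))
    return {
        "display_name": name,
        "sequence_number": sequence,
        "source_groups": [group_path(g) for g in sources],
        "destination_groups": [group_path(g) for g in destinations],
        "services": list(services),
        "action": action,
        "scope": [group_path(g) for g in applied_to],
    }


tiers = ["web", "app", "db"]

policy = {
    "display_name": "three-tier-app",
    "category": "Application",
    "rules": [
        # Hypothetical service paths; define these services in your own estate.
        rule("web-to-app", 10, ["web"], ["app"], ["/infra/services/app-port"], "ALLOW"),
        rule("app-to-db", 20, ["app"], ["db"], ["/infra/services/db-port"], "ALLOW"),
        rule("deny-rest", 30, tiers, tiers, ["ANY"], "DROP"),
    ],
}

print(json.dumps(policy["rules"][0], indent=2))
```

There is not a single IP address anywhere in that structure, which is exactly why it does not need editing when the app scales.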

There is a subtlety worth knowing. Nesting groups, where a group's members are other groups, is convenient but can hide what a policy really resolves to, so I keep nesting shallow and lean on the effective-members view whenever a nested group is involved. And combine criteria deliberately: a group matching env=prod AND app=web is a precise target, while one matching env=prod OR app=web is almost certainly broader than you intended. The logical operator between criteria is as consequential as the criteria themselves, and reading it wrong is a quiet way to over-scope a group.
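
The difference between the two operators is easiest to see on a tiny inventory. This is a plain Python illustration with hypothetical VM names, not NSX code.

```python
inventory = {
    "web-prod-01": {"app": "web", "env": "prod"},
    "web-test-01": {"app": "web", "env": "test"},
    "db-prod-01": {"app": "db", "env": "prod"},
    "db-dev-01": {"app": "db", "env": "dev"},
}


def has(tags, scope, value):
    return tags.get(scope) == value


# env=prod AND app=web: only workloads that satisfy both criteria.
and_members = sorted(
    name for name, tags in inventory.items()
    if has(tags, "env", "prod") and has(tags, "app", "web")
)

# env=prod OR app=web: anything in production plus any web server anywhere.
or_members = sorted(
    name for name, tags in inventory.items()
    if has(tags, "env", "prod") or has(tags, "app", "web")
)

print(and_members)  # ['web-prod-01']
print(or_members)   # ['db-prod-01', 'web-prod-01', 'web-test-01']
```

One operator, and the group goes from a single VM to three quarters of the estate, including a production database and a test web server that a "production web" rule should never touch.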

My take: the best segmentation projects I have run spent more time on the tagging taxonomy than on the rules. Get the nouns right, a small set of scoped tags everyone agrees on, and the rules almost write themselves and stay correct for years. Get them wrong and no amount of clever rule-writing saves you.

What I’d Do

Build security on dynamic, tag-based groups by default, and reserve static membership for the genuinely fixed things. Design a small, scoped tagging taxonomy on day one and write it down, because it is far harder to impose structure on a thousand existing tags than to start with twenty good ones. Tag at provisioning through automation so the label is never an afterthought. And make the effective-members view a reflex: check it before you trust a rule and before you debug one. Do that and the firewall from Part 12 becomes a living expression of how your applications actually talk, one that keeps itself accurate as the estate changes underneath it. Next up is Part 14: the Gateway Firewall and perimeter policy, where security moves from the distributed kernel to the edge of the network. Is your tagging done at provisioning, or bolted on by hand afterwards?


Dynamic membership is only as good as your tag hygiene

Dynamic security groups are the feature that makes NSX policy follow the workload, and they are only as reliable as the tags underneath them. A group defined by a tag expression includes exactly the workloads carrying those tags, which is powerful right up until a workload loses a tag and silently drops out of the group. When that happens, the rules that referenced the group simply stop applying to that workload, with no error and no alarm, and you have a security gap that nothing announces. The most dangerous failures here are the quiet ones.

The defence is tag hygiene treated as a first-class operational concern. Apply tags at provisioning time and automate it, so membership never depends on someone remembering to tag a new VM by hand. Keep the tag taxonomy deliberate and scoped, the way the micro-segmentation part of this series lays out, so groups compose cleanly and a rule reads like a sentence rather than a guess. And lean away from IP-based grouping wherever you can, because the entire value of dynamic membership is that it survives the re-IP, the vMotion and the scale-out that break address-based rules. Tags are the contract your policy depends on; keep the contract clean and the policy keeps working without you watching it.
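
Because NSX will not announce a drop-out, one option is to announce it yourself by comparing membership snapshots. The sketch below assumes you periodically capture a group's effective member names and also have the list of VMs that still exist. The function name and the data are hypothetical.

```python
def silent_dropouts(previous_members, current_members, running_vms):
    """VMs that left the group but still exist: the suspicious case.

    A VM that left the group because it was decommissioned is expected.
    A VM that left the group while still running has probably lost a tag.
    """
    dropped = set(previous_members) - set(current_members)
    return sorted(dropped & set(running_vms))


yesterday = ["web-01", "web-02", "web-03"]
today = ["web-01"]
running = ["web-01", "web-02", "app-01"]  # web-03 was decommissioned

suspects = silent_dropouts(yesterday, today, running)

print(suspects)  # ['web-02']
```

web-03 left the group for a good reason and is ignored. web-02 is still running but no longer protected, which is the gap worth an alert.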

Combine criteria with care, or membership becomes unreadable

Security groups can be defined with rich, combined criteria (tags, segments, operating system, name patterns, and even nested groups), and that expressiveness is genuinely useful. It is also a trap if you reach for every dimension at once. A group whose membership depends on a deep combination of tags, nested groups, and string matches can become something nobody can reason about, where you cannot say with confidence which workloads are in it without testing. A group you cannot reason about is a security control you cannot trust.

The discipline is to keep group definitions legible. Lean on a clean, deliberate tag taxonomy as the primary mechanism, use additional criteria only when they genuinely add clarity, and avoid stacking nesting on nesting just because the product allows it. A group definition that a colleague can read and immediately understand which workloads it includes is worth far more than a clever one that technically captures the right set but only its author can decode. Effective membership you can predict at a glance is the whole point of dynamic grouping; do not trade that away for expressiveness you do not need.
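
One cheap guard for legibility is to measure how deep the nesting goes and flag anything past a limit you choose. This sketch works on a hypothetical dictionary that maps each group to the groups nested inside it. The group names and the limit of one level are assumptions for illustration, not a product rule.

```python
def nesting_depth(group, nested, _seen=frozenset()):
    """Depth of nesting under a group: 0 means it contains no other groups."""
    if group in _seen:
        raise ValueError(f"group nesting loops back to {group}")
    children = nested.get(group, [])
    if not children:
        return 0
    return 1 + max(
        nesting_depth(child, nested, _seen | {group}) for child in children
    )


# Hypothetical layout: which groups contain which other groups.
nested = {
    "prod-frontend": ["web", "prod"],
    "everything-prod": ["prod-frontend", "db"],
}

MAX_DEPTH = 1  # assumption: one level of nesting is the house limit

too_deep = sorted(
    group for group in nested if nesting_depth(group, nested) > MAX_DEPTH
)

print(nesting_depth("web", nested))            # 0
print(nesting_depth("prod-frontend", nested))  # 1
print(too_deep)                                # ['everything-prod']
```

A group that trips the limit is not necessarily wrong, but it is one where I would open the effective-members view before trusting any rule that uses it.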


About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
