TL;DR · Key Takeaways
- NAT, DHCP and the DNS forwarder are stateful gateway services. They run on the service router on the Edge, so the gateway that hosts them needs an Edge cluster.
- SNAT hides internal sources behind an external IP for outbound traffic; DNAT publishes an internal service on an external IP for inbound traffic. Both are stateful and need active-standby.
- The trap: stateful SNAT/DNAT do not work on an active-active (ECMP) Tier-0. There you use reflexive (stateless) NAT, or you put the NAT on a Tier-1 instead.
- DHCP runs as a local server (NSX hands out leases) or a relay (NSX forwards to your existing DHCP). Pick relay when you already have IPAM you trust.
- The DNS forwarder gives clients a local listener IP and forwards queries upstream, with per-domain conditional forwarding when you need it.
Up to now the routing has been mostly distributed and almost free. These services are different. NAT, DHCP, and the DNS forwarder are the first features in this series that genuinely need the service router, which means they pull traffic to the Edge and they care about your HA mode in ways the distributed router never did. That is not a reason to avoid them; they are bread-and-butter services. It is a reason to understand where they run before you turn them on, because the most common surprise here is a NAT rule that quietly does nothing when the gateway it sits on cannot support it.
NAT: SNAT, DNAT, and the active-active trap
NAT does two everyday jobs. SNAT, source NAT, rewrites the source address of outbound packets so a pool of internal VMs appears to the outside world as one or a few external addresses. DNAT, destination NAT, rewrites the destination of inbound packets so an external address maps to an internal service, which is how you publish a web server that lives on a private overlay segment. You also get No-SNAT and No-DNAT rules, which are exceptions: a way to say “translate everything except this range,” typically so internal routed traffic between your own subnets is not needlessly NATed.
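To make the SNAT and No-SNAT relationship concrete, here is a minimal Python sketch of how an exception rule shields internal routed traffic from translation. It is an illustration only, not the NSX API: the `NatRule` class, the `"SNAT"`/`"NO_SNAT"` labels, the addresses, and the assumption that a lower priority value is evaluated first are all mine.

```python
from __future__ import annotations

from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class NatRule:
    action: str                # "SNAT" or "NO_SNAT" (hypothetical labels)
    source: str                # CIDR matched against the packet source
    destination: str           # CIDR matched against the packet destination
    translated: Optional[str]  # external IP for SNAT, None for NO_SNAT
    priority: int              # assumption: lower value is evaluated first


def translate_source(src: str, dst: str, rules: list[NatRule]) -> str:
    """Return the source address the packet leaves with (first match wins)."""
    for rule in sorted(rules, key=lambda r: r.priority):
        src_match = ip_address(src) in ip_network(rule.source)
        dst_match = ip_address(dst) in ip_network(rule.destination)
        if src_match and dst_match:
            if rule.action == "NO_SNAT":
                return src  # exception: leave the packet untouched
            return rule.translated or src
    return src  # no rule matched, nothing is translated


# Example rule set: exempt internal-to-internal traffic, SNAT the rest.
RULES = [
    NatRule("NO_SNAT", "10.0.0.0/8", "10.0.0.0/8", None, priority=10),
    NatRule("SNAT", "10.10.0.0/16", "0.0.0.0/0", "203.0.113.10", priority=20),
]

print(translate_source("10.10.1.5", "10.20.1.5", RULES))     # internal peer
print(translate_source("10.10.1.5", "198.51.100.7", RULES))  # external peer
```

The exception rule only works because it is evaluated before the broad SNAT rule, which is why ordering deserves as much attention as the rules themselves.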
Now the trap, and it is a good one because it ties directly to the Tier-0 HA decision from Part 9. Stateful SNAT and DNAT track connection state, and that state has to live in one place, so they are only supported on a gateway running in active-standby. If your Tier-0 is active-active for ECMP throughput, stateful NAT is off the table there, because two Edges forwarding independently cannot share that state and asymmetric paths would break it. Your options are to use reflexive NAT, a stateless translation that survives asymmetric paths, or, far more commonly, to do the NAT on a Tier-1 in active-standby and keep the Tier-0 active-active for raw throughput. That second pattern is what I reach for almost every time.
| NAT type | Does | Stateful? | HA mode |
|---|---|---|---|
| SNAT | Rewrites source on outbound. | Yes | Active-standby only. |
| DNAT | Rewrites destination on inbound. | Yes | Active-standby only. |
| Reflexive NAT | Stateless 1:1 translation. | No | For active-active Tier-0. |
| No-SNAT / No-DNAT | Exception: skip NAT for a range. | n/a | Either. |
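The table reduces to a rule you can check mechanically before a change goes in. The Python sketch below assumes a hypothetical inventory format (the `Gateway` class, the HA mode strings, and the action labels are invented for illustration and are not NSX object names) and flags stateful NAT placed on an active-active gateway.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Per the table: SNAT and DNAT are stateful and need active-standby.
STATEFUL_ACTIONS = {"SNAT", "DNAT"}


@dataclass
class Gateway:
    name: str
    ha_mode: str  # "ACTIVE_ACTIVE" or "ACTIVE_STANDBY" (hypothetical labels)
    nat_actions: list[str] = field(default_factory=list)


def nat_rule_supported(action: str, ha_mode: str) -> bool:
    """Stateful NAT needs active-standby; other rule types are not flagged."""
    if action in STATEFUL_ACTIONS:
        return ha_mode == "ACTIVE_STANDBY"
    return True  # REFLEXIVE, NO_SNAT, NO_DNAT


def unsupported_nat(gateways: list[Gateway]) -> list[tuple[str, str]]:
    """List (gateway, action) pairs that would quietly do nothing."""
    return [
        (gw.name, action)
        for gw in gateways
        for action in gw.nat_actions
        if not nat_rule_supported(action, gw.ha_mode)
    ]


GATEWAYS = [
    Gateway("t0-shared", "ACTIVE_ACTIVE", ["SNAT", "REFLEXIVE"]),
    Gateway("t1-tenant-a", "ACTIVE_STANDBY", ["SNAT", "DNAT", "NO_SNAT"]),
]

print(unsupported_nat(GATEWAYS))
```

Run against an exported inventory, a check like this catches the trap at review time rather than after someone notices a rule with zero hits.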
DHCP: local server or relay
NSX can hand out IP addresses two ways, and the choice usually comes down to where you want IPAM to live. A local DHCP server means NSX itself owns the scope and leases addresses to VMs on a segment; it is self-contained and quick to stand up, ideal for overlay segments that have no business talking to your enterprise DHCP. A DHCP relay means NSX does not lease anything itself but forwards DHCP requests from the segment to your existing external DHCP servers, which keeps a single source of truth for addressing and is what I recommend whenever a team already runs DHCP and IPAM they trust. Both attach to a gateway or segment, and the local server, being stateful, lives on the SR like NAT does.
The DNS forwarder
The DNS forwarder gives a gateway a local listener IP that VMs point at for name resolution. It receives client queries and forwards them to upstream DNS servers, using its forwarder IP as the source toward those upstreams. The useful part is conditional forwarding: you can send queries for a specific domain to one set of resolvers and everything else to another, which is how you keep internal zones resolving against internal DNS while general lookups go to a public or corporate resolver. It is a small service, but it removes a dependency: the VMs talk to a local NSX address instead of reaching across the network to a distant resolver for every query. It also gives you a clean place to steer resolution per domain.
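Conditional forwarding is easiest to reason about as a lookup from query name to upstream set. The Python sketch below is a simplified model, not the forwarder's actual implementation: the zone names and resolver addresses are placeholders, and the choice that the most specific matching zone wins is an assumption I made for the illustration.

```python
from __future__ import annotations


def pick_upstreams(
    qname: str,
    zones: dict[str, list[str]],
    default: list[str],
) -> list[str]:
    """Pick resolvers for a query: most specific matching zone, else default."""
    name = qname.rstrip(".").lower()
    best_key = None
    best_len = -1
    for zone in zones:
        z = zone.rstrip(".").lower()
        # Match the zone itself or any name beneath it, on a label boundary.
        if name == z or name.endswith("." + z):
            if len(z) > best_len:
                best_key, best_len = zone, len(z)
    return zones[best_key] if best_key is not None else default


# Placeholder zones and resolver addresses for illustration.
ZONES = {
    "corp.example": ["10.0.0.53"],
    "lab.corp.example": ["10.9.0.53"],
}
DEFAULT = ["192.0.2.53"]

print(pick_upstreams("db1.lab.corp.example", ZONES, DEFAULT))
print(pick_upstreams("intranet.corp.example.", ZONES, DEFAULT))
print(pick_upstreams("www.example.org", ZONES, DEFAULT))
```

Matching on a label boundary matters: a name such as `notcorp.example` must fall through to the default resolvers rather than being treated as part of `corp.example`.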
They all live on the service router
Tie these three back to Part 10 and the picture is consistent: NAT, the local DHCP server, and the DNS forwarder are stateful services, so they run on the service router, which means the gateway hosting them must have an Edge cluster attached. That is the legitimate reason to attach an Edge cluster to a Tier-1, the one I told you to demand a justification for. If a Tier-1 runs any of these services, the SR and the Edge cluster are not optional; they are the point. The discipline is simply to attach the Edge cluster to the gateways that actually run services and leave the pure east-west Tier-1s distributed, so you spend Edge capacity only where a service earns it.
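That discipline can be expressed as a small audit. The Python sketch below assumes a hypothetical inventory dictionary (the gateway names, the `services` and `edge_cluster` keys, and the service labels are all invented for illustration) and only knows about the three services discussed in this part, so other reasons for an Edge cluster would show up as a prompt to justify it rather than as an error.

```python
from __future__ import annotations

# The three stateful services this part places on the service router.
SR_SERVICES = frozenset({"NAT", "DHCP_SERVER", "DNS_FORWARDER"})


def audit_edge_clusters(gateways: dict[str, dict]) -> dict[str, str]:
    """Compare each gateway's services with whether it has an Edge cluster."""
    findings = {}
    for name, gw in gateways.items():
        needs_sr = bool(SR_SERVICES & set(gw["services"]))
        if needs_sr and not gw["edge_cluster"]:
            findings[name] = "needs an Edge cluster"
        elif gw["edge_cluster"] and not needs_sr:
            findings[name] = "Edge cluster attached, ask for a justification"
        else:
            findings[name] = "ok"
    return findings


# Hypothetical inventory for illustration.
INVENTORY = {
    "t1-tenant-a": {"services": ["NAT", "DNS_FORWARDER"], "edge_cluster": True},
    "t1-tenant-b": {"services": ["DHCP_SERVER"], "edge_cluster": False},
    "t1-east-west": {"services": [], "edge_cluster": True},
    "t1-app": {"services": [], "edge_cluster": False},
}

for gateway, finding in audit_edge_clusters(INVENTORY).items():
    print(f"{gateway}: {finding}")
```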
When something here misbehaves, the Edge is where you confirm it. A couple of CLI checks tell you whether the service router is actually doing the translation or forwarding you configured.
# On the Edge node hosting the SR
get firewall # NAT and FW are processed here
get nat rules # confirm SNAT/DNAT rules are present and hit
get dns-forwarder # forwarder status and upstreams
get dhcp lease # active leases if running a local DHCP server
# If a NAT rule shows zero hits, re-check the gateway HA mode
# and that traffic is actually routed through this SR.
Local DHCP or relay, decided
The DHCP choice is less about technology and more about who owns addressing in your organisation. If a networking or platform team already runs DHCP and IPAM as the authoritative source, relay keeps that authority intact and avoids two systems disagreeing about who owns a lease. If a segment is genuinely self-contained, a lab overlay or a tenant network that should never touch enterprise addressing, the local server is simpler and removes an external dependency. The table below is the quick version I use when a team cannot decide.
| Consideration | Local DHCP server | DHCP relay |
|---|---|---|
| Source of truth | NSX owns the scope. | Your existing DHCP/IPAM. |
| External dependency | None. | Reachability to the DHCP servers. |
| Best for | Isolated overlays, labs, tenants. | Estates with trusted enterprise IPAM. |
| Runs on | The SR (stateful, needs an Edge). | Forwarding only; lighter footprint. |
Where to place these services
There is a design question that sits underneath all three services: which gateway tier should host them, Tier-0 or Tier-1? My default is to push services down to the Tier-1 wherever I can. Putting NAT, DHCP, and DNS on per-tenant Tier-1 gateways keeps each tenant’s services isolated, lets you run the Tier-0 active-active for throughput, and contains the blast radius of a service change to a single tenant rather than the shared north-south gateway. Reserve Tier-0 services for the genuinely shared, edge-of-network cases, for example a NAT that has to apply to everything leaving the estate. This is the same instinct as the routing parts: keep the shared Tier-0 lean and fast, and let the per-tenant Tier-1s carry the stateful, opinionated work.
The mistakes I see cluster around three things. Stateful NAT placed on an active-active gateway, which silently does nothing, is the big one. Overlapping DHCP scopes, where a new local server hands out addresses that collide with an existing range, is the second, and it produces intermittent, hard-to-trace connectivity loss. The third is forgetting that the DNS forwarder, NAT, and local DHCP all consume the SR, so a Tier-1 that picked up three services has quietly become an Edge-resource consumer that belongs in your sizing math. None of these are hard to avoid once you know to look for them, which is the whole point of understanding where these services run before you switch them on.
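The overlapping-scope mistake is cheap to catch before a new local server goes live. Here is a minimal Python sketch using the standard `ipaddress` module; it assumes each scope can be described as a CIDR block, and the scope names and ranges are made up for the example.

```python
from __future__ import annotations

from ipaddress import ip_network
from itertools import combinations


def overlapping_scopes(scopes: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of scope names whose address ranges overlap."""
    nets = {name: ip_network(cidr) for name, cidr in scopes.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


# Hypothetical scopes: one enterprise range and two NSX local servers.
SCOPES = {
    "enterprise-dhcp": "10.20.0.0/16",
    "nsx-lab": "10.20.30.0/24",
    "nsx-tenant-a": "172.16.5.0/24",
}

print(overlapping_scopes(SCOPES))
```

Real scopes are often address ranges rather than whole subnets, so treat a CIDR-level overlap as a prompt to compare the actual pools, not as proof of a collision.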
What I’d Do
Put stateful NAT on an active-standby Tier-1 and keep the Tier-0 active-active for throughput, which sidesteps the most common NAT-does-nothing surprise entirely. Use No-SNAT rules to keep your own internal routed traffic un-NATed rather than building elaborate exceptions later. Reach for DHCP relay when you already run trusted IPAM and a local DHCP server only for genuinely isolated overlays. Stand up the DNS forwarder to give clients a local listener and use conditional forwarding to keep internal zones internal. And remember the through-line of this part: every one of these services lives on the SR, so it is the honest reason a gateway needs an Edge cluster. Next up is Part 12: the Distributed Firewall, where micro-segmentation finally begins. Is your stateful NAT on a gateway that can actually support it?
These services pin workloads to the Edge
NAT, DHCP and the DNS forwarder share a property that shapes your design: they run as stateful services on the gateway service router, which lives on the Edge. The moment you enable any of them, the traffic that depends on them has to traverse the Edge, so a segment that was happily forwarding east-west on the hosts now has an Edge dependency for its address assignment or its name resolution or its outbound translation. That is fine when you plan for it and a surprise when you do not, because it quietly turns the Edge into part of the critical path for functions that feel like they should be everywhere.
The design implications follow directly. Size the Edge for the load these services add, not just for raw forwarding, because a busy DHCP scope or a heavily used NAT rule set consumes real Edge resources. Place the services deliberately, and monitor them as the dependencies they are, because a DNS forwarder or a DHCP server on the Edge failing is an application outage even though no workload moved. The mental model that keeps you out of trouble is to treat every gateway service as something you are pinning to the Edge on purpose, with a capacity cost and a failure mode you have accounted for, rather than a convenient checkbox you flipped without thinking about where it actually runs.
The DNS forwarder is a dependency worth watching
Of the gateway services, the DNS forwarder deserves special attention because of how much depends on it and how quietly it fails. Everything behind it relies on it for name resolution, so a forwarder that is mis-scoped, points at an unreachable upstream, or simply runs on an Edge that is having a bad day takes down name resolution for a whole swathe of workloads at once. The applications do not report a DNS problem; they report timeouts and half-broken behaviour that sends you looking in the wrong place.
Treat the forwarder as the shared dependency it is. Monitor it actively, confirm its upstreams are reachable and sensible, and design its placement and redundancy with the same care you would give any service that an entire tier leans on. When a swathe of workloads behind a gateway suddenly behaves strangely and nothing obvious changed, the DNS forwarder is worth checking early, because a name-resolution failure wears a lot of disguises before anyone names it.
References
- Configure an NSX NAT (Broadcom TechDocs, NSX 9.0)
- Tier-1 Gateway Services (Broadcom TechDocs, NSX 9.0)
- NSX 9 Tier-1 Gateways and East-West Routing (NSX Series, Part 10)



