TL;DR · Key Takeaways
- NSX performance is a design-time decision. The Edge form factor you pick sets your north-south ceiling, and you cannot tune your way past a ceiling you chose too low.
- The distributed firewall scales the opposite way to the Edge. It runs in every hypervisor, so east-west firewall capacity grows as you add hosts. There is no central firewall chokepoint to saturate.
- Bare-metal Edges are in a different league for raw throughput. A bare-metal Edge with 4×100 Gbps interfaces can push up to around 388 Gbps north-south, and an 8-node cluster up to roughly 3 Tbps. VM Edges are right for most estates, bare-metal for the heavy hitters.
- Both VM and bare-metal Edges use Intel DPDK for fast-path packet processing, and EDP Standard is the default host-switch mode in NSX 9. These accelerations are why software networking keeps up with the wire.
- Size for the failure case and for services, not just steady state. An Edge running NAT, load balancing, and VPN has a lower effective ceiling than a bare forwarder, and N+1 means an upgrade or failure does not brown you out.
A bare-metal NSX Edge with four 100 Gbps interfaces can move about 388 Gbps of north-south traffic, roughly 97 percent of line rate, and a cluster of eight of them tops out near 3 Tbps. Those numbers sound like bragging until you realize the more useful fact hiding inside them: NSX performance is almost never a tuning problem in production. It is a sizing decision you made, well or badly, months earlier. The teams that call me about throughput rarely have a knob left to turn. They have a form factor that was wrong for the workload, chosen before anyone measured the workload.
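The arithmetic behind those headline figures is worth seeing once. This is a minimal sketch that uses only the numbers quoted above; the cluster total assumes per-node throughput adds up linearly across the eight nodes, which is a simplification.

```python
# Figures quoted in the text; everything below them is derived arithmetic.
NICS_PER_NODE = 4
NIC_SPEED_GBPS = 100
NODE_THROUGHPUT_GBPS = 388   # approximate per-node north-south figure
CLUSTER_NODES = 8

# Aggregate interface capacity of one bare-metal Edge.
line_rate_gbps = NICS_PER_NODE * NIC_SPEED_GBPS

# Fraction of that capacity the node actually forwards.
line_rate_fraction = NODE_THROUGHPUT_GBPS / line_rate_gbps

# Cluster total, assuming throughput scales linearly with node count.
cluster_tbps = NODE_THROUGHPUT_GBPS * CLUSTER_NODES / 1000

print(f"{line_rate_fraction:.0%} of line rate, about {cluster_tbps:.1f} Tbps per cluster")
```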
East-west and north-south scale in opposite directions
The single most important idea in NSX sizing is that your two main traffic directions have completely different scaling models, and conflating them is how people both over-build and under-build at the same time. East-west traffic, the lateral flows between workloads, is filtered by the distributed firewall, which runs inside every hypervisor on every host. Add a host, and you add DFW capacity. There is no central appliance for east-west enforcement to saturate, which is the whole architectural advantage of the distributed firewall. North-south traffic, the flows entering and leaving through gateways, runs through the Edge nodes. That path is centralized by design, so it has a ceiling set by how many and what kind of Edges you deployed.
So the rule of thumb writes itself: east-west scales with your compute, north-south scales with your Edge. When you size NSX, you spend almost no anxiety on distributed firewall throughput, because it grows for free as the cluster grows, and almost all of it on the Edge, because that is the part with a hard ceiling and a queue behind it. I have watched teams agonize over DFW performance, which takes care of itself, while under-provisioning the Edge, which does not.
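A toy model makes the asymmetry concrete. The per-host and per-Edge figures here are hypothetical placeholders, not NSX specifications, and real capacity will not be perfectly linear; the point is only which input each direction depends on.

```python
def east_west_capacity_gbps(hosts: int, per_host_dfw_gbps: float) -> float:
    """Distributed firewall capacity grows with the number of hosts."""
    return hosts * per_host_dfw_gbps


def north_south_capacity_gbps(edges: int, per_edge_gbps: float) -> float:
    """Gateway capacity depends only on the Edge nodes deployed."""
    return edges * per_edge_gbps


PER_HOST_DFW_GBPS = 20.0   # hypothetical placeholder
PER_EDGE_GBPS = 15.0       # hypothetical placeholder

# Double the hosts from 16 to 32 while leaving the two Edges untouched.
before = (east_west_capacity_gbps(16, PER_HOST_DFW_GBPS),
          north_south_capacity_gbps(2, PER_EDGE_GBPS))
after = (east_west_capacity_gbps(32, PER_HOST_DFW_GBPS),
         north_south_capacity_gbps(2, PER_EDGE_GBPS))

print("east-west:", before[0], "->", after[0])
print("north-south:", before[1], "->", after[1])
```

East-west capacity doubles with the cluster; north-south does not move until you change the Edge tier.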
Edge form factor sets the ceiling
VM Edges
VM Edge nodes are the right answer for most environments. They are flexible, they live on your existing hosts, and they come in form factors sized by CPU and memory. With DPDK fast-path acceleration they handle the north-south needs of the typical enterprise comfortably. Their limit shows up at the high end of bandwidth and when you stack stateful services on them, because every service you run on an Edge consumes cycles that would otherwise go to forwarding.
Bare-metal Edges
Bare-metal Edges are a different class of machine. With high-speed interfaces they reach throughput a VM Edge cannot approach: up to around 388 Gbps north-south on a single node with 4×100 Gbps NICs, and roughly 3 Tbps across an eight-node cluster. They also carry higher upper limits for stateful services like load balancing, VPN, and NAT, and they are the recommended choice for overlay-to-VLAN Layer 2 bridging at high data rates, where VM Edge bridge instances become a bottleneck. The general guidance I follow: if sustained north-south demand is comfortably above 10 Gbps and heading toward tens of Gbps, or you are running heavy Edge services, bare-metal earns its cost. Below that, VM Edges are simpler and cheaper and entirely adequate.
| Dimension | VM Edge | Bare-metal Edge |
|---|---|---|
| Raw N-S throughput | Good, into low tens of Gbps | Up to ~388 Gbps per node |
| Stateful service ceiling | Lower | Higher (LB, VPN, NAT) |
| L2 bridging at scale | Can bottleneck | Recommended |
| Fast path | Intel DPDK | Intel DPDK |
| Pick it when | Most estates, under ~10 Gbps sustained | Tens of Gbps or heavy services |
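The table reduces to a short decision rule. The sketch below encodes it as a starting heuristic, not an official sizing tool: the `Workload` type, the field names, and the function are hypothetical, and the 10 Gbps threshold is the rough sustained figure from the guidance above, not a hard product limit.

```python
from dataclasses import dataclass

# Rough sustained-demand threshold taken from the guidance above.
VM_COMFORT_GBPS = 10.0


@dataclass
class Workload:
    sustained_ns_gbps: float            # measured sustained north-south demand
    heavy_stateful_services: bool = False
    high_rate_l2_bridging: bool = False


def suggest_form_factor(w: Workload) -> str:
    """Return 'vm' or 'bare-metal' as a first-pass suggestion."""
    # Heavy services or high-rate bridging favour bare-metal at any bandwidth.
    if w.heavy_stateful_services or w.high_rate_l2_bridging:
        return "bare-metal"
    # Sustained demand above the VM comfort zone also favours bare-metal.
    if w.sustained_ns_gbps > VM_COMFORT_GBPS:
        return "bare-metal"
    return "vm"


print(suggest_form_factor(Workload(sustained_ns_gbps=6.0)))
print(suggest_form_factor(Workload(sustained_ns_gbps=40.0)))
print(suggest_form_factor(Workload(sustained_ns_gbps=4.0, heavy_stateful_services=True)))
```

Note that the input is sustained demand, not a short peak; a brief peak above the threshold can still be a VM Edge cluster problem rather than a bare-metal one.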
EDP and DPDK: why software keeps up with the wire
The reason any of these numbers are possible is the fast-path data plane. Both VM and bare-metal Edges use Intel DPDK, which bypasses the normal kernel networking stack and processes packets in user space on dedicated cores, so forwarding does not pay the per-packet overhead that would otherwise cap a software router at a fraction of line rate. On the host side, EDP Standard (Enhanced Data Path) is the default host-switch mode for new NSX 9 installs, and it applies the same philosophy to host transport: a poll-mode, optimized data path that lifts throughput substantially over the legacy mode. I dug into how EDP delivers its 3x boost in a separate explainer.
The design consequence is that EDP wants dedicated CPU cores to do its job. That is the trade: you give the data path cores, and in return software networking runs at hardware-ish speeds. Size your hosts and Edges with that core budget in mind, because starving the fast path of cores is a quiet way to leave most of your throughput on the table while every spec sheet says you should have more.
Worked example
Take a peak north-south demand measured at 22 Gbps, with NAT and a modest load balancer in the path. A single large VM Edge that forwards at, say, 18 to 20 Gbps bare will not hold 22 Gbps once services eat into it, so two active-active ECMP VM Edges look right at first. But apply the upgrade lesson from Part 20: take one Edge down for a serial upgrade and you are at one node against a 22 Gbps peak, which browns out. So the real answer is three VM Edges (two to carry the peak plus one spare, N+1), giving headroom at peak and during a node upgrade, or a pair of bare-metal Edges if you expect this to grow past the VM comfort zone. Notice the deciding factor was not the steady-state number. It was the services overhead and the failure case.
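The same reasoning can be written as a few lines of arithmetic. This is a sketch under stated assumptions: the 20 percent service overhead is a hypothetical figure chosen for illustration (measure your own), and the function name and parameters are mine, not an NSX sizing API.

```python
import math


def edges_needed(peak_gbps: float,
                 bare_forwarding_gbps: float,
                 service_overhead: float,
                 spare_nodes: int = 1) -> int:
    """Nodes required to carry the peak with `spare_nodes` out of service."""
    # Services such as NAT and load balancing reduce per-node forwarding capacity.
    effective_gbps = bare_forwarding_gbps * (1 - service_overhead)
    # Nodes that must be up to carry the peak.
    active = math.ceil(peak_gbps / effective_gbps)
    # Add spares so an upgrade or failure does not drop below `active`.
    return active + spare_nodes


PEAK_GBPS = 22.0          # measured peak from the worked example
BARE_GBPS = 18.0          # conservative end of the 18 to 20 Gbps range
SERVICE_OVERHEAD = 0.20   # hypothetical: fraction of capacity lost to services

total = edges_needed(PEAK_GBPS, BARE_GBPS, SERVICE_OVERHEAD)
print(f"VM Edges to deploy: {total}")

# Sanity check: capacity that survives with one node down.
surviving_gbps = (total - 1) * BARE_GBPS * (1 - SERVICE_OVERHEAD)
print(f"Capacity with one node down: {surviving_gbps:.1f} Gbps")
```

With these inputs the function lands on three nodes, and the two that remain during an upgrade still cover the 22 Gbps peak. Setting `spare_nodes=0` reproduces the tempting two-node answer that browns out.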
What I’d Do
Measure before you size, and size the Edge, not the firewall. The distributed firewall scales with your hosts, so east-west capacity mostly takes care of itself; put your engineering attention on the north-south path, where the Edge sets a hard ceiling. Pick VM Edges for the common case and bare-metal when sustained demand climbs into tens of Gbps or you are running heavy stateful services or high-rate L2 bridging. Give EDP and DPDK the dedicated cores they need, because starved fast paths quietly waste the throughput you paid for. And size against three layers, not one: measured peak, plus the overhead of the services you actually run, plus N+1 so a failure or an upgrade does not become an incident. Do that and NSX performance stops being a thing you firefight and becomes a thing you decided correctly once. Have you sized for your steady state, or for your worst Tuesday with a node down?
DFW scale is about the rule base, not the throughput
East-west firewall throughput scales with your hosts, so you almost never worry about it, but the distributed firewall has a second scale dimension that does need attention: the size and shape of the rule base itself. Thousands of rules across many sections do not slow down forwarding, but they do cost management and realization time, they make review and audit harder, and they raise the odds that a stale or overly permissive rule is hiding in the pile. The DFW scales beautifully as a data plane and degrades as a management problem, and the management problem is the one that bites mature estates.
The discipline that keeps it healthy is the same lean-rule-base hygiene the security Parts return to. Use the firewall rule analysis capability to find duplicates, shadowed rules and overly permissive entries, retire what no longer matches traffic, and resist the temptation to let the rule base grow monotonically because deleting feels risky. Sizing the distributed firewall, in other words, is less about capacity planning and more about housekeeping, and a team that treats its rule base as something to curate rather than accumulate keeps both its security posture and its operability in good shape.
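To make "shadowed" and "overly permissive" concrete, here is a toy checker. It is an illustration of the idea only, not how NSX's rule analysis works internally: the `Rule` type is hypothetical, groups and services are treated as opaque labels rather than resolved memberships, and rules are evaluated top-down with first match winning.

```python
from dataclasses import dataclass

ANY = frozenset({"any"})


@dataclass(frozen=True)
class Rule:
    name: str
    src: frozenset
    dst: frozenset
    svc: frozenset
    action: str


def covers(a: frozenset, b: frozenset) -> bool:
    """True if field `a` matches everything field `b` matches."""
    return a == ANY or (b != ANY and b <= a)


def find_shadowed(rules):
    """Return (shadowed_rule, earlier_rule) pairs; exact duplicates count too."""
    findings = []
    for i, later in enumerate(rules):
        for earlier in rules[:i]:
            if all(covers(getattr(earlier, f), getattr(later, f))
                   for f in ("src", "dst", "svc")):
                findings.append((later.name, earlier.name))
                break  # the first covering rule is enough to flag it
    return findings


def find_overly_permissive(rules):
    """Flag allow rules with 'any' in at least two of the three match fields."""
    return [r.name for r in rules
            if r.action == "allow"
            and sum(f == ANY for f in (r.src, r.dst, r.svc)) >= 2]


rules = [
    Rule("web-to-db", frozenset({"web"}), frozenset({"db"}), frozenset({"tcp/3306"}), "allow"),
    Rule("any-to-db", ANY, frozenset({"db"}), ANY, "allow"),
    Rule("app-to-db", frozenset({"app"}), frozenset({"db"}), frozenset({"tcp/3306"}), "allow"),
    Rule("web-to-db-copy", frozenset({"web"}), frozenset({"db"}), frozenset({"tcp/3306"}), "allow"),
]

print(find_shadowed(rules))
print(find_overly_permissive(rules))
```

Here `app-to-db` can never match because `any-to-db` sits above it, `web-to-db-copy` duplicates the first rule, and `any-to-db` itself is the broad entry a review should question.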
EDP wants cores, and starving it wastes throughput
Enhanced Data Path delivers its performance by running an optimized, poll-mode data path on dedicated CPU cores, and that dedication is the deal you are signing up for. Give the data path the cores it needs and software networking runs at near-hardware speeds; starve it by under-provisioning the host or letting other workloads contend for those cores and you quietly leave a large fraction of your throughput on the table, while every spec sheet insists you should have more. The frustrating part is that this kind of shortfall does not announce itself as an error; it just shows up as performance that never reaches what the sizing promised.
So treat the EDP core budget as a first-class part of host sizing, not an afterthought. Account for the cores the data path reserves when you plan host capacity, keep them genuinely dedicated, and verify the host is actually running the EDP mode you intended rather than silently falling back. The throughput numbers in the rest of this Part assume the fast path has what it needs; honor that assumption in the host design and the numbers hold, ignore it and they quietly do not. Fast software networking is not free; it is paid for in cores, and the bill comes due whether or not you budgeted for it.
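A core budget is easy to write down before you buy hosts. The sketch below is illustrative only: the core counts are hypothetical examples, not NSX defaults or recommendations, and the function names are mine.

```python
def cores_left_for_workloads(total_cores: int,
                             datapath_cores: int,
                             system_cores: int = 2) -> int:
    """Cores remaining once the fast path and the hypervisor take their share."""
    reserved = datapath_cores + system_cores
    if reserved >= total_cores:
        raise ValueError("reservations leave no cores for workloads")
    return total_cores - reserved


def fits(total_cores: int,
         datapath_cores: int,
         workload_cores_needed: int,
         system_cores: int = 2) -> bool:
    """True if the planned workload fits without borrowing data path cores."""
    available = cores_left_for_workloads(total_cores, datapath_cores, system_cores)
    return available >= workload_cores_needed


# Hypothetical host: 64 cores, 8 dedicated to the data path, 2 for the system.
print(cores_left_for_workloads(64, 8))
print(fits(64, 8, workload_cores_needed=50))
print(fits(64, 8, workload_cores_needed=56))
```

If the last check fails, the honest options are a bigger host or fewer workloads per host, not quietly sharing the data path cores.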
Right-size; do not over-build
The flip side of under-provisioning is the quieter waste of over-building, and it is worth naming because the fear of a brownout pushes teams toward it. Bare-metal Edges everywhere, sized for throughput the workload will never approach, is money and operational complexity spent on headroom that sits idle. The goal is not maximum capacity; it is capacity matched to the measured need with sensible margin, which for most estates means VM Edges sized for the real peak and bare-metal reserved for the genuinely heavy north-south or service-intensive cases.
The way you avoid both failure modes is the same: measure first. A real peak from the existing environment, or from a representative test that includes the services you will actually run, tells you what to size for far better than instinct or vendor maximums. Size to that number plus a deliberate margin for growth and failure, and you neither brown out under load nor pay for a fleet of idle capacity. Sizing is an evidence problem, and the evidence is cheap to gather compared to the cost of getting the answer wrong in either direction.
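Turning measurements into a sizing target takes only a few lines. This is a minimal sketch: the sample values are invented for illustration, the 25 percent growth margin is an assumed placeholder, and the percentile uses the simple nearest-rank method.

```python
import math


def percentile(samples, pct: float) -> float:
    """Nearest-rank percentile of a list of throughput samples."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def sizing_target_gbps(samples, growth_margin: float = 0.25) -> float:
    """Measured peak plus a deliberate margin for growth."""
    return max(samples) * (1 + growth_margin)


# Invented north-south samples in Gbps, e.g. one per measurement interval.
samples_gbps = [6.0, 7.5, 9.0, 12.0, 22.0, 8.0, 10.5, 11.0, 7.0, 9.5]

median_gbps = percentile(samples_gbps, 50)
peak_gbps = max(samples_gbps)
target_gbps = sizing_target_gbps(samples_gbps)

print(f"median {median_gbps} Gbps, peak {peak_gbps} Gbps, size for {target_gbps} Gbps")
```

The gap between the median and the peak is the whole argument: size to the typical interval and you brown out on the busy one. The failure margin (N+1) is applied on top of this target when you count Edge nodes.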
References
- VMware: NSX Bare Metal Edge Performance
- VMware NSX TechZone: Bare Metal Edge Performance Resource
- Broadcom TechDocs: NSX Advanced Network Management (VCF 9)



