Dr. Pranay Jha

VMware • Cloud • AI • Enterprise Architecture


NSX 9 VPN: IPSec and L2VPN, Policy-Based vs Route-Based (NSX Series, Part 17)

NSX runs site-to-site IPSec and L2VPN on the Edge. Policy-based vs route-based IPSec, when each fits, and how L2VPN stretches Layer 2 across sites for migration and DR.

NSX Series · Part 17 of 30

TL;DR · Key Takeaways

  • NSX runs site-to-site IPSec VPN and L2VPN on the Edge service router, on a Tier-0 or Tier-1 gateway. Like all SR services, they need an Edge cluster.
  • Policy-based IPSec defines the interesting traffic as local and remote subnets. Route-based IPSec builds a virtual tunnel interface (VTI) and routes over it, with static routes or BGP.
  • Route-based is the better default: it scales, supports dynamic routing, and survives subnet changes without editing the tunnel. Reach for policy-based only when the far end demands it.
  • L2VPN stretches Layer 2 across sites, extending a segment over the WAN. It rides on a route-based IPSec tunnel and is the tool for cross-site migration and DR.
  • In an NSX Project (multi-tenancy), a Tier-1 supports one IPSec and one L2VPN service, and project Tier-1s use static routes, not VTI BGP.
Who this is for: network architects connecting NSX 9 to remote sites and clouds.
Prerequisites: Tier-0/Tier-1 gateways with an Edge cluster (Parts 7, 9, 10), since VPN runs on the SR.

VPN is the part of NSX that reaches outside the data center, to a branch, a partner, another cloud, or a second site. It is also where two worlds meet: your NSX configuration on one end and someone else’s firewall or router on the other, and that far end is frequently not yours to change. That constraint shapes every VPN design, because you are negotiating a shared secret and a set of parameters with a device you do not control. NSX gives you flexible, capable VPN on the Edge, but the wins here come from choosing the right model and matching parameters cleanly with the far end, not from anything exotic. So this part is about the two IPSec models, when to use each, and what L2VPN is genuinely for.

IPSec on the Edge

An NSX IPSec VPN service runs on a Tier-0 or Tier-1 gateway and builds an encrypted, authenticated tunnel across an untrusted network to a remote endpoint. Because it is a stateful service, it lives on the service router, which means the gateway needs an Edge cluster, the same rule that governed NAT and the gateway firewall. The session is defined by a stack of profiles that both ends must agree on: an IKE profile for the key-exchange parameters, an IPSec profile for the tunnel encryption, and a DPD profile for dead-peer detection. The single most common reason a tunnel will not come up is a mismatch in one of these between your end and the far end, so the practical work of building an IPSec VPN is largely the discipline of agreeing every parameter with the other side before you start.
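The agreement step described above can be sketched as a simple comparison of two parameter sheets. This is a minimal illustration, not NSX tooling: the field names and the sample values are hypothetical and do not correspond to NSX API attributes, and the real list of parameters to agree on comes from your own IKE, IPSec, and DPD profiles.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class VpnParameters:
    """One side's settings. Field names are illustrative, not NSX API names."""
    ike_version: str
    ike_encryption: str
    ike_digest: str
    dh_group: int
    ipsec_encryption: str
    pfs_enabled: bool
    sa_lifetime_seconds: int
    dpd_interval_seconds: int


def find_mismatches(local: VpnParameters, remote: VpnParameters) -> dict:
    """Return {field: (local_value, remote_value)} for every field that differs."""
    remote_fields = asdict(remote)
    return {
        name: (value, remote_fields[name])
        for name, value in asdict(local).items()
        if value != remote_fields[name]
    }


# Sample values are assumptions chosen to show a single disagreement.
nsx_side = VpnParameters("IKEv2", "AES-256", "SHA2-256", 14, "AES-GCM-256", True, 3600, 60)
far_end = VpnParameters("IKEv2", "AES-256", "SHA2-256", 19, "AES-GCM-256", True, 3600, 60)

for field, (ours, theirs) in find_mismatches(nsx_side, far_end).items():
    print(f"{field}: local={ours} remote={theirs}")
```

With these sample values the only difference reported is the DH group, which is the kind of disagreement that is far cheaper to catch on paper than in a failed negotiation.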

Figure: A site-to-site IPSec tunnel. The NSX Edge (T0/T1) with its local segments connects through an encrypted tunnel over the internet or WAN to a remote site router that is not yours to change. IKE, IPSec, and DPD profiles must match on both ends; a mismatch is the usual reason a tunnel will not establish.
The Edge terminates the tunnel; the far end is usually a device you do not own. Agree every parameter first.

Policy-based vs route-based

The fork that shapes the whole design is how the tunnel decides which traffic to carry. Policy-based IPSec defines the interesting traffic explicitly, as a set of local and remote subnet pairs, and the tunnel encrypts traffic matching those selectors. It is simple and it interoperates with almost anything, but it is rigid: every time a subnet on either side changes, you edit the policy, and large numbers of subnet pairs get unwieldy. Route-based IPSec instead creates a virtual tunnel interface (VTI), a logical interface you route traffic into, and then you decide what crosses the tunnel using ordinary routing, static routes or BGP. That indirection is the whole advantage: the tunnel does not care about subnets, the routing table does, so you add or remove networks by changing routes, not by editing the VPN, and you can run dynamic routing across the tunnel for resilience.
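The difference between the two models can be sketched in a few lines. This is a conceptual illustration only, not how the Edge implements either model: the selector pair reuses the example subnets 10.0.0.0/24 and 192.168.0.0/24, while the interface names `vti-1` and `uplink`, the default route, and the extra network 192.168.50.0/24 are assumptions made for the example.

```python
import ipaddress

# Policy-based: explicit (local, remote) subnet pairs act as traffic selectors.
SELECTORS = [
    (ipaddress.ip_network("10.0.0.0/24"), ipaddress.ip_network("192.168.0.0/24")),
]

# Route-based: ordinary routes point at a tunnel interface (hypothetical names).
ROUTES = {
    ipaddress.ip_network("192.168.0.0/24"): "vti-1",
    ipaddress.ip_network("0.0.0.0/0"): "uplink",
}


def policy_based_encrypts(src: str, dst: str) -> bool:
    """True if the (src, dst) pair matches one of the configured selector pairs."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    return any(s in local and d in remote for local, remote in SELECTORS)


def route_based_next_hop(dst: str) -> str:
    """Longest-prefix match over the route table; returns the egress interface."""
    d = ipaddress.ip_address(dst)
    best = max((net for net in ROUTES if d in net), key=lambda net: net.prefixlen)
    return ROUTES[best]


# A new remote network appears behind the far end.
print(policy_based_encrypts("10.0.0.5", "192.168.0.9"))   # matches the selector pair
print(policy_based_encrypts("10.0.0.5", "192.168.50.9"))  # no selector covers it yet

# Route-based: add a route, leave the tunnel definition alone.
ROUTES[ipaddress.ip_network("192.168.50.0/24")] = "vti-1"
print(route_based_next_hop("192.168.50.9"))
```

In the policy-based case the new network stays outside the tunnel until the selector list is edited on both ends; in the route-based case a single route change is enough.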

Figure: Two ways to decide what the tunnel carries. Policy-based (subnet selectors): local 10.0.0.0/24 to remote 192.168.0.0/24; edit the policy whenever a subnet changes; simple, interoperable, rigid. Route-based (VTI + routing): a VTI (tunnel interface) with static routes or BGP; add networks by changing routes, not the tunnel; scalable, supports dynamic routing; the default.
Route-based decouples the tunnel from the subnets. That indirection is why it scales and survives change.

| Dimension | Policy-based | Route-based |
| --- | --- | --- |
| Defines traffic by | Local/remote subnet selectors. | Routes into a VTI. |
| Dynamic routing | No. | Yes, BGP over the tunnel. |
| Adding networks | Edit the VPN policy. | Add a route; tunnel unchanged. |
| Best for | Simple, fixed, when the far end requires it. | Almost everything. The default. |

In practice: default to route-based and only fall back to policy-based when the device on the far end cannot do route-based or insists on selectors. The other thing that bites every IPSec build is MTU: encryption adds overhead, so account for the reduced effective MTU through the tunnel or you get the same large-packets-hang symptom from the overlay parts, just over the WAN.
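To make the MTU point concrete, here is a rough planning sketch of ESP tunnel-mode overhead over IPv4. The byte counts are commonly cited values for the listed cipher suites and are assumptions to confirm against what you actually negotiate; the suite names and function names are illustrative, and nothing here queries NSX.

```python
OUTER_IPV4 = 20   # new outer IPv4 header added in tunnel mode
NAT_T_UDP = 8     # UDP header, only when NAT traversal encapsulation is used
ESP_HEADER = 8    # SPI (4 bytes) + sequence number (4 bytes)
ESP_TRAILER = 2   # pad length (1 byte) + next header (1 byte)

# (IV bytes, ICV bytes, padding alignment) -- assumed typical values per suite.
CIPHERS = {
    "aes-gcm": (8, 16, 4),
    "aes-cbc-sha256": (16, 16, 16),
}


def encrypted_size(inner_len: int, cipher: str, nat_t: bool = False) -> int:
    """Size on the wire of one inner packet after ESP tunnel-mode encapsulation."""
    iv, icv, block = CIPHERS[cipher]
    padding = (-(inner_len + ESP_TRAILER)) % block  # pad payload + trailer to the block
    return (OUTER_IPV4 + (NAT_T_UDP if nat_t else 0) + ESP_HEADER + iv
            + inner_len + padding + ESP_TRAILER + icv)


def max_inner_packet(path_mtu: int, cipher: str, nat_t: bool = False) -> int:
    """Largest inner packet that still fits in path_mtu once encrypted."""
    for inner_len in range(path_mtu, 0, -1):
        if encrypted_size(inner_len, cipher, nat_t) <= path_mtu:
            return inner_len
    raise ValueError("path MTU too small for this cipher suite")


def tcp_mss(inner_mtu: int) -> int:
    """MSS for IPv4 and TCP headers without options (20 + 20 bytes)."""
    return inner_mtu - 40


for suite in CIPHERS:
    for nat_t in (False, True):
        inner = max_inner_packet(1500, suite, nat_t)
        print(f"{suite} nat_t={nat_t}: inner MTU {inner}, MSS {tcp_mss(inner)}")
```

Under these assumptions a 1500-byte path leaves 1446 bytes for the inner packet with AES-GCM and 1438 with AES-CBC plus SHA-256, and NAT traversal costs a little more. Treat the results as a starting point for testing, not as a substitute for it.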

L2VPN: stretching Layer 2

IPSec connects networks that keep their own addressing; L2VPN does something stronger and more specialized: it extends a single Layer 2 segment across two sites, so a VM at the remote site sits on the same broadcast domain and subnet as VMs in your data center, as if the wire were stretched over the WAN. It is built on a route-based IPSec tunnel, with an L2VPN server at one end and a client at the other. The reason this matters is migration and disaster recovery. When you move workloads between sites and cannot re-IP them, or you need a DR site where VMs come up with their production addresses intact, stretching Layer 2 buys you that continuity. It is a powerful tool and a deliberately temporary one: stretched Layer 2 across a WAN is something you use to get through a migration or a failover, not a steady-state design you want to live on forever.

Figure: One segment, stretched across two sites. The data center (L2VPN server) hosts vm 10.0.5.11 on segment web-net; an L2VPN tunnel connects it to the remote site (L2VPN client), where vm 10.0.5.12 sits on the same subnet as the data center. No re-IP.
L2VPN keeps VMs on their original subnet across sites. Use it to cross a migration or a failover, then retire it.
Disclaimer: VPN changes affect live connectivity to remote sites. Agree IKE/IPSec/DPD parameters with the far end in advance, account for tunnel MTU, and remember that in an NSX Project a Tier-1 allows only one IPSec and one L2VPN service and does not support VTI BGP. Stage in a maintenance window and validate against the current VCF 9 BOM.

Which VPN for which job

The choice between the three options, policy-based IPSec, route-based IPSec, and L2VPN, falls out cleanly once you name the actual requirement. Most steady-state connectivity, a branch, a partner, another cloud, wants route-based IPSec. Policy-based is what you settle for when the far-end device forces it. L2VPN is reserved for the specific case where addressing must be preserved across sites, which in practice means migration and disaster recovery. The table makes the decision a one-liner.

| You need to | Use | Why |
| --- | --- | --- |
| Connect a branch or partner site | Route-based IPSec | Scales, dynamic routing, survives change. |
| Interop with a fixed legacy device | Policy-based IPSec | The far end only supports selectors. |
| Migrate VMs without re-IP | L2VPN | Stretches the subnet; retire it after. |
| Stand up a DR site at the same IPs | L2VPN | VMs come up on production addresses. |
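
The decision in the table reduces to two questions, which a short sketch makes explicit. The function and parameter names are made up for illustration; the logic simply restates the table.

```python
def choose_vpn(preserve_addressing: bool, far_end_supports_route_based: bool) -> str:
    """Pick a VPN type from the two requirements that drive the decision."""
    if preserve_addressing:
        # Migration or DR without re-IP: the subnet itself must stretch.
        return "L2VPN"
    if far_end_supports_route_based:
        # The default for steady-state site-to-site connectivity.
        return "route-based IPSec"
    # Compatibility fallback when the far end only supports selectors.
    return "policy-based IPSec"


print(choose_vpn(preserve_addressing=False, far_end_supports_route_based=True))
print(choose_vpn(preserve_addressing=False, far_end_supports_route_based=False))
print(choose_vpn(preserve_addressing=True, far_end_supports_route_based=True))
```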

When a tunnel will not come up

IPSec failures are frustrating because the two ends rarely tell you the same story, and the side that logs the useful error is often the one you do not control. Work the checklist methodically rather than guessing. The phase-one failures are key-exchange mismatches in the IKE profile; the phase-two failures are tunnel-encryption mismatches in the IPSec profile; and the tunnels that establish but pass no traffic are almost always a routing or selector problem, or an MTU problem that only shows up under load. Walk these in order and you resolve the large majority of VPN tickets without a packet capture.

| Symptom | Likely cause | Check |
| --- | --- | --- |
| Never reaches phase one | IKE mismatch or wrong peer/pre-shared key | IKE version, DH group, encryption, PSK. |
| Phase one up, phase two fails | IPSec profile mismatch | Encryption, PFS, lifetime on both ends. |
| Tunnel up, no traffic | Routing or selectors wrong | VTI routes, or policy subnet pairs. |
| Small flows work, large hang | MTU not accounting for IPSec overhead | Lower effective MTU or clamp MSS. |

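The ordered checklist can be written down as a small triage function, which is mostly useful as a reminder to work the symptoms in sequence. The observation fields are hypothetical inputs you would fill in by hand from whatever status each end reports; this sketch does not read tunnel state from NSX.

```python
from dataclasses import dataclass


@dataclass
class TunnelObservation:
    """What you have observed so far. Field names are illustrative."""
    phase_one_up: bool
    phase_two_up: bool
    small_flows_pass: bool
    large_flows_pass: bool


def next_check(obs: TunnelObservation) -> str:
    """Return the first item to check, walking the symptoms in order."""
    if not obs.phase_one_up:
        return "IKE profile: IKE version, DH group, encryption, PSK, peer address"
    if not obs.phase_two_up:
        return "IPSec profile: encryption, PFS, lifetime on both ends"
    if not obs.small_flows_pass:
        return "Routing or selectors: VTI routes, or policy subnet pairs"
    if not obs.large_flows_pass:
        return "MTU: lower the effective MTU or clamp MSS"
    return "No fault found by this checklist"


print(next_check(TunnelObservation(True, True, True, False)))
```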
My take: build a one-page parameter sheet with both ends’ IKE, IPSec, and DPD settings filled in and signed off before anyone configures anything. That single document prevents most VPN incidents, because it forces the agreement that the protocol silently assumes you already have.

What I’d Do

Make route-based IPSec your standard and treat policy-based as the compatibility fallback for a far end that cannot do better. Agree every IKE, IPSec, and DPD parameter with the other side before you touch the config, because a clean parameter sheet prevents the great majority of tunnel-down tickets, and budget for the MTU overhead that encryption adds. Use L2VPN deliberately and temporarily, to carry a migration or stand up a DR site without re-IP, and plan its retirement as part of the project rather than letting stretched Layer 2 become permanent. And remember these are SR services, so the gateway hosting them needs an Edge cluster, and in a multi-tenant Project each Tier-1 is limited to one IPSec and one L2VPN with static routing only. Next up is Part 18: monitoring and operations with Traceflow, alarms, and Operations for Networks, where we shift from building to running. Are your tunnels route-based, or are you still editing subnet selectors by hand?


Route-based VPN usually beats policy-based

When you stand up an IPSec tunnel you choose between policy-based and route-based, and for most designs route-based is the better answer. Policy-based IPSec matches traffic against selectors, lists of interesting source and destination networks, which is fine for a handful of static subnets and becomes brittle as soon as the topology grows or changes. Route-based IPSec uses a virtual tunnel interface and ordinary routing, which means the tunnel participates in your routing design and scales with it instead of fighting it. You add a network behind the tunnel and routing carries it, rather than editing selector lists on both ends and hoping they stay in sync.

The practical payoff is that route-based tunnels integrate with dynamic routing, so a route-based IPSec connection can carry BGP and adapt as networks come and go, which is exactly what you want for anything beyond a trivial site-to-site link. Policy-based still has its place for simple, static connections to a third party that insists on it, but as a default for your own multi-site connectivity, route-based is the design that ages well. L2VPN, which stretches Layer 2 across sites, is a separate tool entirely and one to use sparingly, because extending a broadcast domain across a WAN carries all the fault-domain risks that stretched networking always does.

VPN lives on the Edge, and so does its MTU math

Every IPSec and L2VPN tunnel terminates on the Edge service router, which has two consequences worth designing for. The first is capacity: VPN is a stateful service running on the Edge, so the encryption and tunnel processing consume Edge resources, and a design with many tunnels or high VPN throughput is really an Edge sizing exercise. Plan the Edge for the VPN load the way you would plan it for any other service, and remember that the same Edge may be carrying north-south forwarding and other services at the same time.

The second consequence is the one that generates support tickets: MTU. IPSec adds its own encapsulation overhead, and when that tunnel rides over an NSX overlay that already added GENEVE overhead, the packet sizes stack up and fragmentation or silent drops follow if the path MTU is not right. The familiar small-works-large-fails signature from the overlay troubleshooting part of this series applies here too, just with an extra layer of headers. Account for the IPSec overhead in your MTU planning end to end, test with full-size packets across the tunnel, and you avoid the classic VPN-that-pings-but-will-not-move-data problem that the encapsulation overhead quietly creates.
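
Testing with full-size packets usually means sending probes with fragmentation disallowed and finding the largest size that gets through. A minimal sketch of that search follows. The `probe` callable is a stand-in you would supply yourself (for example, a wrapper around a don't-fragment ping from a test host); here it is simulated with an assumed limit of 1438 bytes so the example runs on its own.

```python
from typing import Callable

ICMP_OVERHEAD = 28  # IPv4 header (20 bytes) + ICMP echo header (8 bytes)


def largest_passing_packet(probe: Callable[[int], bool],
                           low: int = 576, high: int = 1500) -> int:
    """Binary-search the largest IP packet size for which probe(size) succeeds.

    Assumes success is monotonic: if a size passes, every smaller size passes.
    """
    if not probe(low):
        raise ValueError("even the smallest probe size failed; check connectivity first")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


def icmp_payload_for(packet_size: int) -> int:
    """ICMP payload length that produces an IPv4 packet of packet_size bytes."""
    return packet_size - ICMP_OVERHEAD


# Simulated path: anything larger than 1438 bytes is dropped (assumed value).
simulated_limit = 1438
found = largest_passing_packet(lambda size: size <= simulated_limit)
print(found, icmp_payload_for(found))
```

The payload helper matters because ping tools typically take the payload size rather than the total packet size, so the number you type is 28 bytes smaller than the packet you are testing.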

Build redundancy into site-to-site links

A single tunnel between two sites is a single point of failure dressed up as connectivity, and for anything that matters you design the link to survive a failure. That means more than one tunnel, terminating in a way that a single Edge failure or a single path outage does not drop the connection, with routing that fails the traffic over cleanly. This is another place where route-based VPN earns its keep, because failover between route-based tunnels is a routing decision that the network makes for you, where policy-based selectors would leave you reconfiguring under pressure.
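
The idea that failover between route-based tunnels is a routing decision can be sketched as follows. This is a toy model under stated assumptions: the tunnel names, the prefix, and the metric values are hypothetical, and in a real design the routing protocol or route metrics on the gateway make this choice, not a script.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TunnelRoute:
    """One route to a remote prefix through a route-based tunnel (illustrative)."""
    prefix: str
    tunnel: str
    metric: int       # lower is preferred
    tunnel_up: bool


def active_route(routes: List[TunnelRoute], prefix: str) -> Optional[TunnelRoute]:
    """Pick the lowest-metric route to prefix whose tunnel is currently up."""
    candidates = [r for r in routes if r.prefix == prefix and r.tunnel_up]
    return min(candidates, key=lambda r: r.metric) if candidates else None


routes = [
    TunnelRoute("192.168.0.0/24", "vti-1", metric=10, tunnel_up=True),
    TunnelRoute("192.168.0.0/24", "vti-2", metric=20, tunnel_up=True),
]

print(active_route(routes, "192.168.0.0/24").tunnel)  # primary while it is up

routes[0].tunnel_up = False  # simulate losing the primary tunnel
print(active_route(routes, "192.168.0.0/24").tunnel)  # traffic moves to the backup
```

Nothing about the tunnel definitions changes during the failover; only the route selection does, which is the property policy-based selectors do not give you.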

Plan the redundancy against the actual availability requirement of what crosses the link. A development connection to a partner may genuinely be fine on a single tunnel; a production replication or a critical integration is not. Match the tunnel redundancy and the Edge placement to that requirement, test the failover deliberately rather than discovering it during an incident, and remember that the link is only as resilient as its least redundant component, which is often the Edge it terminates on. Redundant tunnels over a single Edge are not redundant where it counts.

About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
