VXLAN routing, sometimes referred to as inter-VXLAN routing, provides IP routing between VXLAN VNIs in overlay networks. The routing of traffic is based on the inner header or the overlay tenant IP address.
VXLAN routing is supported on the following platforms:
- Broadcom Tomahawk using an internal loopback on one or more switch ports
- Broadcom Maverick and Trident II+
If you want to use VXLAN routing on a Trident II switch, you must use a hyperloop.
Switches with the Mellanox Spectrum A0 ASIC can only operate in asymmetric mode.
Features of VXLAN routing include:
- EVPN is the control plane
- VRF support for overlay networks
- Distributed asymmetric routing
On Broadcom Tomahawk platforms, symmetric routing is supported by keeping the port in internal loopback mode. Symmetric EVPN routing is supported on Trident II+ ASICs without hyperloop cables for external routing only.
On Mellanox platforms, symmetric VXLAN routing is supported only on switches with certain Spectrum ASICs. To configure symmetric VXLAN routing on a Mellanox switch, please contact the Cumulus Networks support team.
- Anycast routing and gateways
- Host routing between and within data centers
Using EVPN as the control plane offers an integrated routing and bridging solution as well as multi-tenancy support, where different customers can share an IP address in the same network fabric.
VXLAN routing currently does not support overlay ECMP.
VXLAN Routing Data Plane and the Broadcom Tomahawk and Trident II+ Platforms
On switches with Broadcom ASICs, VXLAN routing is supported only on the Tomahawk and Trident II+ platforms. Below are some differences in how VXLAN routing works on these switches.
For Trident II+ switches, you can specify a VXLAN routing (RIOT, routing in and out of tunnels) profile in the vxlan_routing_overlay.profile field in the /usr/lib/python2.7/dist-packages/cumulus/__chip_config/bcm/datapath.conf file if you don't want to use the default. This profile determines the maximum number of overlay next hops (adjacency entries). The profile is one of the following:
- default: 15% of the underlay next hops are set apart for overlay, up to a maximum of 8k next hops
- mode-1: 25% of the underlay next hops are set apart for overlay
- mode-2: 50% of the underlay next hops are set apart for overlay
- mode-3: 80% of the underlay next hops are set apart for overlay
- disable: disables VXLAN routing
The Trident II+ ASIC supports a maximum of 48k underlay next hops.
The maximum number of VXLAN SVI interfaces that can be allocated is 2k (2048) regardless of which profile you specify.
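As a rough illustration of how the profiles translate into table sizes, the following Python sketch computes the overlay next-hop budget for each profile. It assumes that each percentage is taken of the full 48k underlay table, that 48k means 48 * 1024 entries (consistent with 2k = 2048 above), and that fractional entries round down. The exact rounding used by the ASIC is not stated here, and the function name is hypothetical.

```python
# Sketch: overlay next-hop budget per RIOT profile on Trident II+.
# Assumptions (not confirmed by the platform documentation):
#   - percentages apply to the full 48k underlay next-hop table
#   - "k" means 1024, and fractional entries round down
UNDERLAY_NEXT_HOPS = 48 * 1024
DEFAULT_CAP = 8 * 1024  # the default profile is capped at 8k next hops

PROFILE_PERCENT = {
    "default": 15,
    "mode-1": 25,
    "mode-2": 50,
    "mode-3": 80,
    "disable": 0,  # VXLAN routing disabled, nothing reserved
}


def overlay_next_hops(profile):
    """Return the number of next hops set apart for the overlay."""
    reserved = UNDERLAY_NEXT_HOPS * PROFILE_PERCENT[profile] // 100
    if profile == "default":
        reserved = min(reserved, DEFAULT_CAP)
    return reserved


if __name__ == "__main__":
    for name in PROFILE_PERCENT:
        print(name, overlay_next_hops(name))
```

Under these assumptions the default profile stays below its 8k cap, so the cap only matters if the underlay table size differs from the figure quoted above.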
If you want to disable VXLAN routing on a Trident II+ switch, set the vxlan_routing_overlay.profile field to disable.
The Tomahawk ASIC does not support RIOT natively, so you must configure the switch ports for VXLAN routing to use the internal loopback. The internal loopback facilitates the recirculation of packets through the ingress pipeline to achieve VXLAN routing. One or more loopback switch ports can be bundled into a loopback trunk based on the amount of bandwidth needed.
To configure one or more switch ports for loopback mode, edit the /etc/cumulus/ports.conf file, changing the port speed to loopback. In the example below, swp8 and swp9 are configured for loopback mode:
After you save your changes to the ports.conf file, you must restart switchd for the changes to take effect.
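The edit to ports.conf can also be scripted. The Python sketch below rewrites the speed of selected ports to loopback in a block of text. It assumes that port entries take the form `<port number>=<speed>` (for example `8=100G`); the sample text and the function name are hypothetical, and the sketch neither writes to /etc/cumulus/ports.conf nor restarts switchd.

```python
import re


def set_loopback(conf_text, ports):
    """Return conf_text with the speed of each listed port set to 'loopback'.

    Assumes entries look like '<port number>=<speed>' (for example '8=100G').
    Comments and unrelated lines pass through unchanged.
    """
    wanted = {str(p) for p in ports}
    out = []
    for line in conf_text.splitlines():
        match = re.match(r"^(\d+)\s*=\s*\S+\s*$", line)
        if match and match.group(1) in wanted:
            line = match.group(1) + "=loopback"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical ports.conf excerpt, not copied from a real switch.
SAMPLE = "# QSFP28 ports\n7=100G\n8=100G\n9=100G\n10=100G\n"
UPDATED = set_loopback(SAMPLE, [8, 9])
```

Keeping the transformation as a pure function makes it easy to review the resulting text before it is written back to the switch.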
Configuring VXLAN Routing
The following configuration uses a single VTEP and does not include a VRF. It uses elements from the following topology:
When configuring VXLAN routing, Cumulus Networks recommends that you enable ARP suppression on all VXLAN interfaces. Otherwise, when a locally-attached host ARPs for the gateway, it will receive multiple responses, one from each anycast gateway.
Configuring the Underlays
Configure the loopback address on the following network devices.
Advertise the loopback addresses into the underlay.
Review and commit your changes:
Verify the loopback addresses are advertised and learned by all VTEPs (look for the line that starts with B>*).
Configuring the Server-facing Downlinks
Create routed VLANs for the servers. All virtual IP addresses (that is, VRR) are the same since this example configuration uses anycast gateways. See the diagram above for connectivity.
The real IP addresses assigned to each SVI must be unique per VTEP. The virtual address can be reused as the anycast gateway.
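The addressing rule above can be checked before a configuration is pushed. The following Python sketch takes a planned list of SVIs and reports any real address that is reused across VTEPs, as well as any VLAN whose virtual (anycast) address differs between VTEPs. The tuple layout, the function name and the addresses in the usage example are all hypothetical.

```python
import ipaddress
from collections import Counter


def check_svi_addresses(svis):
    """Check a plan given as (vtep, vlan, real_ip, virtual_ip) tuples.

    Returns a list of problem descriptions; an empty list means the plan
    is consistent with the rule stated in the text.
    """
    problems = []
    # Real addresses must not repeat across VTEPs.
    real_counts = Counter(
        ipaddress.ip_interface(real).ip for _, _, real, _ in svis
    )
    for addr, count in real_counts.items():
        if count > 1:
            problems.append("real address %s is used %d times" % (addr, count))
    # Every VTEP should present the same virtual (anycast) address per VLAN.
    virtual_by_vlan = {}
    for _, vlan, _, virtual in svis:
        addr = ipaddress.ip_interface(virtual).ip
        virtual_by_vlan.setdefault(vlan, set()).add(addr)
    for vlan, addrs in sorted(virtual_by_vlan.items()):
        if len(addrs) > 1:
            problems.append(
                "VLAN %d has %d different virtual addresses" % (vlan, len(addrs))
            )
    return problems


# Hypothetical plan: unique real addresses, shared anycast gateway.
GOOD_PLAN = [
    ("leaf01", 100, "10.1.100.2/24", "10.1.100.1/24"),
    ("leaf03", 100, "10.1.100.3/24", "10.1.100.1/24"),
]
```

Calling `check_svi_addresses(GOOD_PLAN)` returns an empty list, while giving both leaves the same real address produces one reported problem.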
Configuring BGP EVPN
Configure the VTEPs to advertise layer 2 MAC address information via EVPN.
- Verify EVPN is peering.
Configuring the VXLANs
- Configure the VNIs on each VTEP.
- Disable bridge learning and enable ARP suppression for VXLAN routing, then review and commit your changes.
- Verify the VXLAN entries are being learned in EVPN.
- Verify the VXLAN entries are programmed into the bridge table.
Following are the resulting interfaces and routing configurations for the three nodes you configured above: leaf01, leaf03 and spine01.
VXLAN Routing with Active-Active VTEPs
VXLAN routing with active-active VTEPs is configured the same way as VXLAN with active-active mode VTEPs. Follow the instructions located in the VXLAN and EVPN Active-Active chapter.
VXLAN with VRFs
VXLAN can be configured with VRF support. To do so, apply the server downlink SVI configuration on the top-of-rack switches inside a VRF. The BGP EVPN address family and the advertise-all-vni command automatically apply the correct RD and RT information to each VRF.
Below is an example for leaf01 where VLAN 100 and VLAN 150 are part of VRF RED and can VXLAN route between each other. VLAN 200 is part of VRF BLUE and cannot communicate with any hosts in VRF RED:
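The isolation described for leaf01 can be modeled in a few lines of Python. This is only a conceptual sketch of the reachability rule (VLANs route to each other only inside the same VRF), not switch configuration; the dictionary and function names are hypothetical.

```python
# VLAN-to-VRF membership taken from the leaf01 description above.
VLAN_TO_VRF = {100: "RED", 150: "RED", 200: "BLUE"}


def can_route(src_vlan, dst_vlan, vlan_to_vrf=VLAN_TO_VRF):
    """Return True when both VLANs are known and belong to the same VRF."""
    src_vrf = vlan_to_vrf.get(src_vlan)
    dst_vrf = vlan_to_vrf.get(dst_vlan)
    return src_vrf is not None and src_vrf == dst_vrf
```

For example, `can_route(100, 150)` is true because both VLANs are in VRF RED, while `can_route(100, 200)` is false because VLAN 200 is in VRF BLUE.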
Viewing VXLAN Routing Information
You can use the following commands to display VXLAN routing-related information:
- ip link show dev <DEVICE>
- ip route
- ip neighbor
- bridge fdb show
To get basic information about a VXLAN, use ip link show:
To view the routing table, use ip route:
To view the neighbor table, run ip neighbor:
To view the forwarding database, use bridge fdb show:
Troubleshooting VXLAN Routing
You can investigate control plane VXLAN routing issues with the following commands:
- net show bgp l2vpn evpn route
- net show bgp l2vpn evpn route vni <vni>
- net show bgp l2vpn evpn vni
- net show l2vpn evpn mac vni <vni>
- net show l2vpn evpn arp-cache vni <vni>