| Internet-Draft | EVPN VXLAN Bypass VTEP | May 2023 | 
| Eastlake, et al. | Expires 30 November 2023 | [Page] | 
A principal feature of EVPN is the ability to support multihoming from a customer equipment (CE) to multiple provider edge equipment (PE) with all-active links. This draft specifies a mechanism to simplify PEs used with VXLAN tunnels and enhance VXLAN Active-Active reliability.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 30 November 2023.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
A principal feature of EVPN is the ability to support multihoming from a customer equipment (CE) to multiple provider edge equipment (PE) with links used in the all-active redundancy mode. In that mode, a device is multihomed to a group of two or more PEs, and all PEs in the redundancy group can forward traffic to/from the multihomed device or network for a given VLAN [RFC7209]. This draft specifies a VXLAN gateway mechanism to simplify PE processing in the multihomed case and enhance VXLAN Active-Active reliability.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This document uses the following acronyms and terms:¶
One example of the current situation would be a DCI (data center interconnect) using VXLAN tunnels that is multihomed for reliability, as shown in Figure 1. Each PE, acting as a VXLAN Tunnel End Point (VTEP), uses a different IP address. Thus each PE must process EVPN updates based on the ESIs [RFC7432].¶
                       .........
                       .  DCI  .
     +----------+      .       .      +----------+
     | PE       +---------------------+ PE       |
     |VTEP IP-1 +---   . VXLAN .   ---+VTEP IP-3 |
     +----------+   \  .Tunnels.  /   +----------+
    /     |          -----   -----          |     \
+--+      |            .  \ /  .            |      +--+
|CE|      |            .   X   .            |      |CE|
+--+      |            .  / \  .            |      +--+
    \     |          -----   -----          |    /
     +----------+   /  . VXLAN .  \   +----------+
     | PE       +---   .Tunnels.   ---+ PE       |
     |VTEP IP-2 +---------------------+VTEP IP-4 |
     +----------+      .       .      +----------+
                       .........
The situation is greatly simplified if the set of VTEPs connected to a particular Ethernet segment all use the same anycast IP address. PEs no longer need to concern themselves with whether a remote CE is single-homed or multihomed. The situation is as shown in Figure 2. The IP address within each VTEP group is synchronized by messages within that group.¶
                       .........
                       .  DCI  .
     +----------+      .       .      +----------+
     | Anycast  |      .       .      | Anycast  |
     |VTEP IP-1 +---   .       .   ---+VTEP IP-2 |
     +----------+   \  .       .  /   +----------+
    /     ^          \ .       . /          ^     \
+--+      |           \.       ./           |      +--+
|CE|    Sy|nc          >-------<          Sy|nc    |CE|
+--+      |           /. VXLAN .\           |      +--+
    \     v          / . Tunnel. \          v    /
     +----------+   /  .       .  \   +----------+
     | Anycast  +---   .       .   ---+ Anycast  |
     |VTEP IP-1 |      .       .      |VTEP IP-2 |
     +----------+      .       .      +----------+
                       .........
In the scenario illustrated in Figure 3, where an enterprise site and a data center are interconnected, the VPN gateways (PE1 and PE2) and the enterprise site (CPE) are connected through a VXLAN tunnel to provide L2/L3 services between the enterprise site (CPE) and data center. The data center gateway (CE1) is dual-homed to PE1 and PE2 to access the VXLAN network, which enhances network access reliability. When one PE fails, services can be rapidly switched to the other PE, minimizing the impact on services.¶
As shown in Figure 3, PE1 and PE2 use a virtual address as a Network Virtualization Edge (NVE) interface address at the network side, namely, the Anycast VTEP address. In this way, the CPE is aware of only one remote NVE interface and establishes a VXLAN tunnel with the virtual address. The packets from the CPE can reach CE1 through either PE1 or PE2. However, single-homed CEs may exist, such as CE2 and CE3. As a result, after reaching a PE, the packets from the CPE may need to be forwarded by the other PE to a single-homed CE. Therefore, a bypass VXLAN tunnel needs to be established between PE1 and PE2. An EVPN peer relationship is established between PE1 and PE2. Different addresses, namely, bypass VTEP addresses, are configured for PE1 and PE2 so that they can establish a bypass VXLAN tunnel.¶
                           +-----+
          ---------------- | CPE |   Enterprise site
             ^             +-----+
             |               / \
             |              /   \
           VXLAN Tunnel    /     \
             |            /       \
             |           / Anycast \
             v      +-----+ VTEP +-----+
          --------- | PE1 |------| PE2 |
                    +-----+      +-----+
                      /\           /\
                     /  \         /  \
                    /    \ Trunk /    \
                   /      \     /      \
                  /       +\---/+       \
                 /        | \ / |        \
                /         +--+--+         \
               /             |             \
           +-----+        +-----+        +-----+
           | CE2 |        | CE1 |        | CE3 |
           +-----+        +-----+        +-----+
This section specifies the extensions to meet the requirements given in Section 3 and enhance VXLAN active-active reliability.¶
This document specifies two new BGP extended communities, the IPv4 and IPv6 Bypass VXLAN Extended Communities. These extended communities are IPv4-address-specific or IPv6-address-specific, depending on whether the VTEP address to be accommodated is IPv4 or IPv6. In the new extended communities, the 4-byte or 16-byte global administrator field encodes the IPv4 or IPv6 address that is the VTEP address and the 2-byte local administrator field is formatted as shown in Figures 4 and 5.¶
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type=0x01 | Sub-Type=TBA1 | IPv4 Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IPv4 Address (cont.) | Flags | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type=0x00/0x40| Sub-Type=TBA2 | Target IPv6 Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Target IPv6 Address (cont.) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Target IPv6 Address (cont.) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Target IPv6 Address (cont.) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Target IPv6 Address (cont.) | Flags | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Where¶
0x01 = type for transitive IPv4 specific use.¶
0x00 = type for transitive IPv6 specific use.¶
0x40 = type for non-transitive IPv6 specific use.¶
TBA1 = subtype for IPv4 specific use.¶
TBA2 = subtype for IPv6 specific use.¶
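As an illustration of the two layouts above, the following Python sketch packs a VTEP address into the 8-byte (IPv4) or 20-byte (IPv6) wire form. It is a minimal sketch under stated assumptions, not a normative encoding: the function name and the `PLACEHOLDER_SUBTYPE` value are hypothetical, since TBA1 and TBA2 have not been assigned, and the Flags octet is passed through unchanged.

```python
import ipaddress
import struct

# Placeholder only: TBA1 and TBA2 are not yet assigned by IANA.
PLACEHOLDER_SUBTYPE = 0xFF


def encode_bypass_vxlan_ec(vtep, subtype=PLACEHOLDER_SUBTYPE, flags=0,
                           transitive=True):
    """Encode a Bypass VXLAN Extended Community for an IPv4 or IPv6 VTEP."""
    addr = ipaddress.ip_address(vtep)
    if not 0 <= subtype <= 0xFF or not 0 <= flags <= 0xFF:
        raise ValueError("subtype and flags must each fit in one octet")
    if addr.version == 4:
        if not transitive:
            # This document lists only Type 0x01 for the IPv4 form.
            raise ValueError("only the transitive IPv4 type is listed")
        ec_type = 0x01
    else:
        ec_type = 0x00 if transitive else 0x40
    # Type, Sub-Type, address (global administrator field),
    # then Flags and Reserved (2-byte local administrator field).
    return (struct.pack("!BB", ec_type, subtype)
            + addr.packed
            + struct.pack("!BB", flags, 0))
```

For example, `encode_bypass_vxlan_ec("192.0.2.1")` yields 8 bytes beginning with Type 0x01, and `encode_bypass_vxlan_ec("2001:db8::1", transitive=False)` yields 20 bytes beginning with Type 0x40.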
Using the topology in Figure 3:¶
This section describes how Layer 2 unicast and BUM (Broadcast, Unknown unicast, and Multicast) packets are forwarded. A description of how Layer 3 packets are forwarded, both within the same subnet and across subnets, will be provided in a future version of this document.¶
The following two subsections discuss Layer 2 unicast forwarding in the topology shown in Figure 3.¶
After receiving Layer 2 unicast packets destined for the CPE from CE1, CE2, and CE3, PE1 and PE2 search their local MAC address tables to obtain the outbound interfaces, perform VXLAN encapsulation on the packets, and forward them to the CPE.¶
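The VXLAN encapsulation step can be sketched as follows. This is a minimal illustration that builds only the 8-byte VXLAN header defined in RFC 7348 and prepends it to the inner Ethernet frame; the outer Ethernet, IP, and UDP headers are omitted, and the function name is hypothetical.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid (RFC 7348)


def vxlan_encapsulate(vni, inner_frame):
    """Prepend a VXLAN header carrying `vni` to an inner Ethernet frame."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # First word: Flags (8 bits) followed by 24 reserved bits.
    # Second word: VNI (24 bits) followed by 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame
```

A PE would then carry the result as the payload of a UDP datagram addressed to the CPE's VTEP.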
After receiving a Layer 2 unicast packet sent by the CPE to CE1, PE1 performs VXLAN decapsulation on the packet, searches the local MAC address table for the destination MAC address, obtains the outbound interface, and forwards the packet to CE1.¶
After receiving a Layer 2 unicast packet sent by the CPE to CE2, PE1 performs VXLAN decapsulation on the packet, searches the local MAC address table for the destination MAC address, obtains the outbound interface, and forwards the packet to CE2.¶
After receiving a Layer 2 unicast packet sent by the CPE to CE3, PE1 performs VXLAN decapsulation on the packet, searches the local MAC address table for the destination MAC address, and forwards the packet to PE2 over the bypass VXLAN tunnel. After the packet reaches PE2, PE2 searches its local MAC address table for the destination MAC address, obtains the outbound interface, and forwards the packet to CE3.¶
The process for PE2 to forward packets from the CPE is the same as that for PE1 to forward packets from the CPE with the roles of CE2 and CE3 swapped.¶
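The three CPE-to-CE cases above reduce to a single MAC table lookup at PE1 whose result is either a local interface or the bypass VXLAN tunnel. The sketch below models that lookup for the topology in Figure 3. All MAC addresses, interface names, and the bypass VTEP address are hypothetical documentation values, and the table entries are assumed to have been learned already.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class LocalPort:
    name: str  # outbound interface toward a directly attached CE


@dataclass(frozen=True)
class BypassTunnel:
    peer_bypass_vtep: str  # bypass VTEP address of the other PE


NextHop = Union[LocalPort, BypassTunnel]

# Hypothetical MAC addresses for CE1, CE2, and CE3.
CE1_MAC = "00:00:5e:00:53:01"
CE2_MAC = "00:00:5e:00:53:02"
CE3_MAC = "00:00:5e:00:53:03"

# PE1's view: CE1 and CE2 are local, CE3 is reachable only through PE2.
PE1_MAC_TABLE: Dict[str, NextHop] = {
    CE1_MAC: LocalPort("trunk-to-CE1"),
    CE2_MAC: LocalPort("port-to-CE2"),
    CE3_MAC: BypassTunnel("192.0.2.2"),
}


def next_hop_for_cpe_frame(mac_table: Dict[str, NextHop],
                           dst_mac: str) -> Optional[NextHop]:
    """Return the next hop for a decapsulated frame received from the CPE.

    Returns None when the destination MAC is unknown, in which case the
    frame would be handled as BUM traffic instead.
    """
    return mac_table.get(dst_mac.lower())
```

PE2's table would mirror this one, with CE3 behind a local port and CE2 behind the bypass tunnel toward PE1.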
Using the topology in Figure 3, if the destination address of a BUM packet from the CPE is the Anycast VTEP address of PE1 and PE2, the BUM packet may be forwarded to either PE1 or PE2. If the BUM packet reaches PE2, PE2 sends a copy of the packet to CE3 and CE1. In addition, PE2 sends a copy of the packet to PE1 through the bypass VXLAN tunnel between PE1 and PE2. After the copy of the packet reaches PE1, PE1 sends it to CE2, not to the CPE or CE1. In this way, CE1 receives only one copy of the packet.¶
Using the topology in Figure 3, after a BUM packet from CE2 reaches PE1, PE1 sends a copy of the packet to CE1 and the CPE. In addition, PE1 sends a copy of the packet to PE2 through the bypass VXLAN tunnel between PE1 and PE2. After the copy of the packet reaches PE2, PE2 sends it to CE3, not to the CPE or CE1.¶
Using the topology in Figure 3, after a BUM packet from CE1 reaches PE1, PE1 sends a copy of the packet to CE2 and the CPE. In addition, PE1 sends a copy of the packet to PE2 through the bypass VXLAN tunnel between PE1 and PE2. After the copy of the packet reaches PE2, PE2 sends it to CE3, not to the CPE or CE1.¶
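The three BUM cases above follow one replication rule: the PE that first receives the packet copies it to its other local CEs, to the CPE unless the CPE was the source, and to its peer over the bypass tunnel; a PE that receives the copy over the bypass tunnel forwards it only to its own single-homed CEs. The following sketch encodes that rule for the topology in Figure 3. The function and table names are hypothetical, and the rule is inferred from the described cases rather than stated normatively.

```python
# Attachment model for Figure 3.
SINGLE_HOMED = {"PE1": ["CE2"], "PE2": ["CE3"]}
DUAL_HOMED = ["CE1"]
PEER = {"PE1": "PE2", "PE2": "PE1"}


def bum_copies(pe, source):
    """List where `pe` sends copies of a BUM packet received from `source`.

    `source` is "CPE", "bypass" (received over the bypass VXLAN tunnel),
    or the name of a CE attached to `pe`.
    """
    local_ces = SINGLE_HOMED[pe] + DUAL_HOMED
    if source == "bypass":
        # The peer already served the CPE and the dual-homed CE.
        return list(SINGLE_HOMED[pe])
    if source != "CPE" and source not in local_ces:
        raise ValueError(f"{source} is not attached to {pe}")
    targets = [ce for ce in local_ces if ce != source]
    if source != "CPE":
        targets.append("CPE")
    targets.append("bypass:" + PEER[pe])
    return targets
```

With this model, `bum_copies("PE2", "CPE")` followed by `bum_copies("PE1", "bypass")` delivers exactly one copy to each of CE1, CE2, and CE3.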
IANA is requested to assign two new Extended Community attribute SubTypes as follows:¶
| Sub-Type Value | Name | Reference | 
|---|---|---|
| TBA1 | Bypass VXLAN Extended Community | [this doc] | 
| Sub-Type Value | Name | Reference | 
|---|---|---|
| TBA2 | Bypass VXLAN Extended Community | [this doc] | 
TBD¶
The authors would like to thank the following for their comments and review of this document: TBD.¶
Thanks to the following who made significant contributions to this document:¶