<?xml version="1.0" encoding="US-ASCII"?>
<!-- $Id: draft-ietf-bess-evpn-geneve-06.xml 2015-07-05 sboutros $ -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 SYSTEM "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
]>
<rfc category="exp" docName="draft-ietf-bess-evpn-geneve-09"
     ipr="trust200902" updates="">
  <?rfc toc="yes" ?>

  <?rfc compact="yes"?>

  <?rfc subcompact="no"?>

  <?rfc symrefs="yes"?>

  <?rfc sortrefs="yes" ?>

  <front>
    <title abbrev="EVPN control plane for Geneve">EVPN control plane for Geneve</title>

    <author fullname="Sami Boutros" initials="S." surname="Boutros" role="editor">
      <organization></organization>

      <address>
        <postal>
          <street></street>

          <city></city>

          <code></code>

          <region></region>

          <country>USA</country>
        </postal>

        <email>boutros.sami@gmail.com</email>
      </address>
    </author>

    <author fullname="Ali Sajassi" initials="A." surname="Sajassi">
      <organization>Cisco Systems</organization>

      <address>
        <postal>
          <street></street>

          <city></city>

          <code></code>

          <region></region>

          <country>USA</country>
        </postal>

        <email>sajassi@cisco.com</email>
      </address>
    </author>

    <author fullname="John Drake" initials="J." surname="Drake">
      <organization>Juniper Networks</organization>

      <address>
        <postal>
          <street></street>

          <city></city>

          <code></code>

          <region></region>

          <country>USA</country>
        </postal>

        <email>jdrake@juniper.net</email>
      </address>
    </author>

    <author fullname="Jorge Rabadan" initials="J." surname="Rabadan">
      <organization>Nokia</organization>

      <address>
        <postal>
          <street></street>

          <city></city>

          <code></code>

          <region></region>

          <country>USA</country>
        </postal>

        <email>jorge.rabadan@nokia.com</email>
      </address>
    </author>

    <author fullname="Sam Aldrin" initials="S." surname="Aldrin">
      <organization>Google</organization>
    
      <address>
        <postal>
          <street></street>

          <city></city>

          <code></code>

          <region></region>

          <country>USA</country>
        </postal>

        <email>aldrin.ietf@gmail.com</email>
      </address>
    </author>


    <date year="2025"/>

    <area>Routing</area>

    <workgroup>BESS Workgroup</workgroup>

    <abstract>
      <t>This document describes how the Ethernet VPN (EVPN) control plane
      can be used with the Generic Network Virtualization Encapsulation
      (Geneve) in Network Virtualization over Layer 3 (NVO3)
      solutions.</t>

      <t>The EVPN control plane can also be used by Network Virtualization
      Edges (NVEs) to express the Geneve tunnel option TLV(s) they support
      for the transmission and/or reception of Geneve-encapsulated data
      packets.</t>
    </abstract>
  </front>

  <middle>
    <section title="Introduction">
   <t>Network Virtualization over Layer 3 (NVO3) solutions for network
   virtualization in data center (DC) environments are based on an IP
   underlay. An NVO3 solution provides Layer 2 and/or Layer 3 overlay
   services for virtual networks, enabling multi-tenancy and workload
   mobility.</t>

   <t>This document describes how the EVPN control plane defined in <xref target="RFC7432"/>
   can signal the Geneve encapsulation type in the BGP Tunnel Encapsulation
   Extended Community defined in <xref target="RFC9012"/>. In addition, this
   document defines how to communicate the Geneve tunnel option types using
   a BGP Tunnel Encapsulation Attribute sub-TLV. The Geneve tunnel options
   are encoded as TLVs after the Geneve base header in the Geneve
   packet, as described in <xref target="RFC8926"/>.</t>

   <t><xref target="I-D.ietf-nvo3-encap"/> recommends that a control plane
   determine how Network Virtualization Edges (NVEs) use the Geneve option
   TLVs when sending/receiving packets. In particular, the control plane
   negotiates the subset of option TLVs supported, their order and the
   total number of option TLVs allowed in the packets. This negotiation
   capability allows, for example, interoperability with hardware-based
   NVEs that can process fewer options than software-based NVEs.</t>

   <t>This EVPN control plane extension allows an NVE to express which
   Geneve option TLV types it is capable of receiving from, or sending
   to, its peers over the Geneve tunnel.</t>

   <t>In the datapath, a transmitting NVE MUST NOT encapsulate a packet
   destined to another NVE with any option TLV(s) the receiving NVE is
   not capable of processing.</t>

   <t>Furthermore, this document defines a new Ethernet option TLV to handle
   Broadcast, Unknown unicast, and Multicast (BUM) traffic, E-Tree root and
   leaf indication, and split-horizon filtering.</t>

    </section>

    <section title="Requirements Language">
      <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
      document are to be interpreted as described in <xref
      target="RFC2119"/>.</t>
    </section>

    <section title="Abbreviations and Terminology">

   <t>NVO3: Network Virtualization over Layer 3.</t>

   <t>Geneve: Generic Network Virtualization Encapsulation.</t>

   <t>NVE: Network Virtualization Edge.</t>

   <t>VNI: Virtual Network Identifier.</t>

   <t>MAC: Media Access Control.</t>

   <t>OAM: Operations, Administration, and Maintenance.</t>

   <t>PE: Provider Edge node.</t>

   <t>CE: Customer Edge device, e.g., a host, router, or switch.</t>

   <t>EVPN: Ethernet VPN.</t>

   <t>ES: Ethernet Segment.</t>

   <t>ESI: Ethernet Segment Identifier.</t>

   <t>EVI: An EVPN instance spanning the Provider Edge (PE) devices
      participating in that EVPN.</t>

   <t>MAC-VRF: A Virtual Routing and Forwarding table for Media Access
      Control (MAC) addresses on a PE.</t>

     </section>

    <section title="Geneve extension">
   <t>This document adds an extension to the <xref target="RFC8926"/>
   encapsulation that is relevant to the operation of EVPN.</t>
    
    <section title="Ethernet option TLV">
      <t>As described in <xref target="RFC8365"/>, when an ingress NVE uses ingress replication
   to flood unknown unicast traffic to the egress NVEs, it needs to
   indicate to each egress NVE that the encapsulated packet is a
   BUM packet. This is required to avoid transient packet duplication
   in all-active multihoming scenarios. Geneve needs a bit for
   this purpose.</t>

       <t><xref target="RFC8317"/> uses an MPLS label for leaf indication of BUM traffic
   originated from a leaf attachment circuit (AC) in an ingress NVE so that the egress NVEs
   can filter BUM traffic toward their leaf ACs. For Geneve, we need a bit for this purpose.</t>

       <t>Although the default mechanism for split-horizon filtering of BUM
   traffic on an Ethernet segment for IP-based encapsulations such as
   VXLAN, GPE, NVGRE, and Geneve is local-bias, as defined in Section
   8.3.1 of <xref target="RFC8365"/>, there can be an incentive to leverage the
   same split-horizon filtering mechanism of <xref target="RFC7432"/>, which uses a
   20-bit MPLS label, so that a) a single filtering mechanism is used
   for all encapsulation types and b) the same PE can participate in a
   mix of MPLS and IP encapsulations. For this purpose, a 20-bit label
   field MAY be defined for the Geneve encapsulation. Support for this
   label is OPTIONAL.</t>

      <t>If an NVE wants to use the local-bias procedure, it sends the new
   option TLV with the ESI-label set to 0:</t>

    <figure align="left">
      <preamble/>

    <artwork align="left"><![CDATA[

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     Option Class=Ethernet     |C|  EVPN-OPTION|R|R|R| Len=0x1 |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |B|L|H|  Rsvd   |                 Source-ID = 0                 |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Figure 1: Ethernet Option TLV without ESI label
]]></artwork>
        </figure>
      <t>If an NVE wants to use the ESI-label, it sends the new option TLV
   with a non-zero ESI-label:</t>

 <figure align="left">
          <preamble/>

          <artwork align="left"><![CDATA[

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |     Option Class=Ethernet     |C|  EVPN-OPTION|R|R|R| Len=0x1 |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     |B|L|H|  Rsvd   |                   Source-ID                   |
     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

            Figure 2: Ethernet Option TLV with ESI label
]]></artwork>
        </figure>

     <t>Where: </t>
        <t> - Option Class is set to Ethernet (a new Option Class requested from IANA). </t>
        <t> - Type is set to EVPN-OPTION with value 0, and the C bit must be set. </t>
        <t> - The B bit is set to 1 for BUM traffic. </t>
        <t> - The L bit is set to 1 for Leaf-Indication. </t>
        <t> - The H bit is set to 1 for Root-Indication. </t>
        <t> - Source-ID is a 24-bit value that encodes the ESI-label value
        signaled in the EVPN Ethernet Auto-Discovery (A-D) per-ES routes, as described
        in <xref target="RFC7432"/> for multihoming and <xref target="RFC8317"/> for leaf-to-leaf
        BUM filtering. The ESI-label value is encoded in the high-order
        20 bits of the Source-ID field.</t>
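
        <t>As a non-normative illustration, the field layout described above
        can be sketched in Python. This is a minimal sketch under stated
        assumptions, not a reference encoder: the function names are
        hypothetical; the Option Class is supplied by the caller because IANA
        has not yet assigned it; the B, L, and H bits are assumed to occupy the
        three high-order bits of the first body octet, with Source-ID in the
        remaining three octets; and the Length value is derived from the body
        size using the <xref target="RFC8926"/> definition (4-octet multiples,
        excluding the option header).</t>

 <figure align="left">
          <preamble/>

          <artwork align="left"><![CDATA[
```python
import struct

EVPN_OPTION_TYPE = 0  # Type value stated in this section


def encode_ethernet_option(option_class, bum=False, leaf=False,
                           root=False, esi_label=0):
    """Return the 8-octet Ethernet option TLV as bytes."""
    if not 0 <= option_class <= 0xFFFF:
        raise ValueError("Option Class is a 16-bit field")
    if not 0 <= esi_label < (1 << 20):
        raise ValueError("ESI-label is a 20-bit value")
    # C bit (0x80) set, followed by the 7-bit option type.
    type_octet = 0x80 | EVPN_OPTION_TYPE
    # Reserved R bits are zero; the low 5 bits carry the length.
    # Assumption: one 4-octet word of option data follows the header.
    len_octet = 1
    # Assumption: B, L, H are the three high-order bits of the body.
    flags = (bum << 7) | (leaf << 6) | (root << 5)
    # The ESI-label fills the high-order 20 bits of the 24-bit
    # Source-ID; zero selects the local-bias form.
    source_id = esi_label << 4
    body = (flags << 24) | source_id
    return struct.pack("!HBBI", option_class, type_octet,
                       len_octet, body)


def decode_ethernet_option(tlv):
    """Return (bum, leaf, root, esi_label) from an 8-octet TLV."""
    _cls, _type, _len, body = struct.unpack("!HBBI", tlv)
    flags = body >> 24
    esi_label = (body & 0xFFFFFF) >> 4
    return (bool(flags & 0x80), bool(flags & 0x40),
            bool(flags & 0x20), esi_label)
```
]]></artwork>
        </figure>

        <t>Leaving esi_label at its default of 0 yields the local-bias form
        of the option; an egress NVE could apply the decoder to recover the
        B, L, and H bits and the ESI-label before making its filtering
        decision.</t>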

    <t>The egress NVEs that make use of ESIs in the data path because they
   have a local multi-homed ES or support <xref target="RFC8317"/> SHOULD advertise
   their Ethernet A-D per-ES routes along with the Geneve tunnel sub-TLV
   in addition to the ESI-label Extended Community. The ingress NVE
   can then use the Ethernet option-TLV when sending Geneve packets
   based on the <xref target="RFC7432"/> and <xref target="RFC8317"/> procedures.
   The egress NVE will use the Source-ID field in the received packets to make filtering
   decisions.</t>

    <t>Note that <xref target="RFC8365"/> modifies the <xref target="RFC7432"/> split-horizon
   procedures for NVO3 tunnels using the "local-bias" procedure. "Local-bias"
   relies on tunnel IP source address checks (instead of ESI-labels) to
   determine whether a packet can be forwarded to a local ES.</t>

   <t>While "local-bias" MUST be supported along with Geneve encapsulation,
   the use of the Ethernet option-TLV is RECOMMENDED to follow the same
   procedures used by EVPN MPLS.</t>

   <t>An ingress NVE using ingress replication to flood BUM traffic MUST
   send B=1 in all the Geneve packets that encapsulate BUM frames. An
   egress NVE SHOULD determine whether a received packet encapsulates a
   BUM frame based on the B bit. The use of the B bit is only relevant
   to Geneve packets with Protocol Type 0x6558 (Bridged Ethernet).</t>

    </section>
   </section>

    <section title="BGP Extensions">

   <t>As per <xref target="RFC8365"/>, the BGP Encapsulation extended community
   defined in <xref target="RFC9012"/> is included with all EVPN routes advertised
   by an egress NVE.</t>

   <t>This document uses the Geneve Encapsulation tunnel type (value 19)
      from the IANA BGP Tunnel Encapsulation Types registry.</t>
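
   <t>For illustration only, the Encapsulation extended community carrying
   this tunnel type can be built as sketched below. The octet layout (type
   0x03, sub-type 0x0c, four reserved octets, and a two-octet tunnel type)
   follows <xref target="RFC9012"/>; the function name is hypothetical and
   the sketch is not tied to any particular BGP implementation.</t>

 <figure align="left">
          <preamble/>

          <artwork align="left"><![CDATA[
```python
import struct

GENEVE_TUNNEL_TYPE = 19  # Geneve Encapsulation tunnel type


def encapsulation_ext_community(tunnel_type=GENEVE_TUNNEL_TYPE):
    """Return the 8-octet Encapsulation extended community."""
    if not 0 <= tunnel_type <= 0xFFFF:
        raise ValueError("tunnel type is a 16-bit field")
    # 0x03: Transitive Opaque type; 0x0c: Encapsulation sub-type;
    # 4 reserved octets set to zero; 2-octet tunnel type.
    return struct.pack("!BBIH", 0x03, 0x0C, 0, tunnel_type)
```
]]></artwork>
        </figure>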

    <section title="Geneve Tunnel Option Types sub-TLV">

   <t>The Geneve Tunnel Option Types sub-TLV is a new BGP Tunnel
   Encapsulation Attribute sub-TLV.</t>

 <figure align="left">
          <preamble/>

          <artwork align="left"><![CDATA[
                      +-----------------------------------+
                      |      Sub-TLV Type (1 Octet)       |
                      +-----------------------------------+
                      |     Sub-TLV Length (2 Octets)     |
                      +-----------------------------------+
                      |     Sub-TLV Value (Variable)      |
                      |                                   |
                      +-----------------------------------+


        Figure 3: Geneve tunnel option types sub-TLV
]]></artwork>
    </figure>

     <t>The Sub-TLV Type field contains a value in the range 192-252,
     to be allocated by IANA.</t>

   <t>The sub-TLV value MUST match exactly the first 4 octets of each
   supported option TLV. For instance, to signal support for two option
   TLVs, the value is encoded as follows:</t>

 <figure align="left">
          <preamble/>

          <artwork align="left"><![CDATA[
      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |          Option Class         |      Type     |R|R|R| Length  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |          Option Class         |      Type     |R|R|R| Length  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 4: Geneve Option TLVs

]]></artwork>
    </figure>

   <t>An NVE informs its peers which Geneve option TLVs it can receive
   by including the first 4 octets of each option TLV in the Geneve Tunnel
   Option Types sub-TLV.  The peers MUST send Geneve packets to this
   NVE with only the option TLVs that it has specified here, in the
   same order.</t>
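
   <t>The encoding above can be sketched, non-normatively, as follows. The
   function names are hypothetical, and the sub-TLV type passed by the
   caller is a placeholder because the actual value has not yet been
   allocated by IANA. The sketch assumes the 1-octet type and 2-octet
   length shown in the sub-TLV figure, followed by one 4-octet option
   header per supported option TLV, in the advertised order.</t>

 <figure align="left">
          <preamble/>

          <artwork align="left"><![CDATA[
```python
import struct


def option_header(option_class, opt_type, length_words,
                  critical=False):
    """First 4 octets of a Geneve option TLV."""
    if not 0 <= option_class <= 0xFFFF:
        raise ValueError("Option Class is a 16-bit field")
    if not 0 <= opt_type <= 0x7F:
        raise ValueError("option type is a 7-bit field")
    if not 0 <= length_words <= 0x1F:
        raise ValueError("Length is a 5-bit field")
    # The C bit is the high-order bit of the Type octet.
    type_octet = (0x80 if critical else 0) | opt_type
    # Reserved R bits stay zero above the 5-bit length.
    return struct.pack("!HBB", option_class, type_octet,
                       length_words)


def encode_option_types_subtlv(subtlv_type, headers):
    """Type (1 octet), length (2 octets), then the headers."""
    if not 192 <= subtlv_type <= 252:
        raise ValueError("sub-TLV type outside 192-252")
    if any(len(h) != 4 for h in headers):
        raise ValueError("each entry must be 4 octets")
    value = b"".join(headers)
    return struct.pack("!BH", subtlv_type, len(value)) + value
```
]]></artwork>
        </figure>

   <t>The order of the headers list is preserved in the encoded value, so
   a caller would list the option TLVs in the order in which it expects
   to receive them.</t>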

   <t>The above sub-TLV(s) MAY be included with Ethernet A-D per-ES routes and
   MUST NOT be included with other routes.</t>

   </section>
   </section>

   <section title="Operation">

   <t>The following figure shows an example of an NVO3 deployment with
   EVPN.</t>

 <figure align="left">
          <preamble/>

          <artwork align="left"><![CDATA[

                                 +--------------+
                                 |              |
                 +---------+     |     WAN      |    +---------+
         +----+  |         |   +----+        +----+  |         |  +----+
         |NVE1|--|         |   |ASBR|        |ASBR|  |         |--|NVE3|
         +----+  |IP Fabric|---| 1  |        |  2 |--|IP Fabric|  +----+
         +----+  |         |   +----+        +----+  |         |  +----+
         |NVE2|--|         |     |              |    |         |--|NVE4|
         +----+  +---------+     +--------------+    +---------+  +----+

         |<------ DC 1 ----->                        <---- DC2  ------>|

                 Figure 5: Data Center Interconnect with ASBR
]]></artwork>
    </figure>

   <t>iBGP sessions are established among NVE1, NVE2, and ASBR1, possibly via
   a BGP route-reflector. Similarly, iBGP sessions are established
   among NVE3, NVE4, and ASBR2.</t>

   <t>An eBGP session is established between ASBR1 and ASBR2.</t>

   <t>All NVEs and ASBRs are enabled for the EVPN SAFI and exchange EVPN
   routes.  For inter-AS option B, the ASBRs re-advertise these routes
   with the NEXT_HOP attribute set to their IP addresses as per <xref target="RFC4271"/>.</t>

   <t>NVE1 includes the BGP Encapsulation extended community defined in
   <xref target="RFC9012"/> in all EVPN routes it advertises. NVE1 sets the
   BGP Tunnel Encapsulation Attribute Tunnel Type to Geneve tunnel
   encapsulation, and sets the Tunnel Encapsulation Attribute sub-TLV for
   the Geneve tunnel option types to list all the Geneve option types it
   can transmit and receive.</t>

   <t>All other NVEs learn which Geneve option types are supported by NVE1
   through the EVPN control plane. In the datapath, NVE2, NVE3, and NVE4 MUST
   only encapsulate overlay packets with the Geneve option TLV(s) that
   NVE1 is capable of receiving, and when more than one option TLV is used,
   they MUST be in the order specified by NVE1.</t>
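
   <t>The sender-side behavior described above can be sketched as follows.
   This is a non-normative illustration with a hypothetical function name;
   it assumes that a candidate option TLV matches an advertised entry when
   its first 4 octets are identical to that entry, and that at most one TLV
   is sent per advertised entry.</t>

 <figure align="left">
          <preamble/>

          <artwork align="left"><![CDATA[
```python
def select_options(candidate_tlvs, peer_headers):
    """Filter and order option TLVs for one receiving NVE.

    candidate_tlvs: complete option TLVs (bytes) the sender
        would like to attach, in any order.
    peer_headers: 4-octet option headers advertised by the
        receiving NVE, in the order it expects them.
    """
    by_header = {}
    for tlv in candidate_tlvs:
        # Keep the first candidate seen for each header.
        by_header.setdefault(tlv[:4], tlv)
    # Walk the peer's list so the output follows its order and
    # drops anything the peer did not advertise.
    return [by_header[h] for h in peer_headers if h in by_header]
```
]]></artwork>
        </figure>

   <t>In the example topology, NVE2 would call such a function with the
   headers learned from NVE1's advertisement before encapsulating a packet
   destined to NVE1.</t>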

   <t>A PE advertises the BGP Encapsulation extended community defined in
   <xref target="RFC5512"/> if it supports any of the encapsulations defined
   in <xref target="RFC8365"/>. A PE advertises the BGP Tunnel Encapsulation
   Attribute defined in <xref target="RFC9012"/> if it supports Geneve
   encapsulation, with the tunnel type set to Geneve Encapsulation.</t>

   </section>

   <section title="Security Considerations">

      <t>The mechanisms in this document use the EVPN control plane as defined in
   <xref target="RFC7432"/>. Security considerations described in <xref target="RFC7432"/> are equally
   applicable.</t>

   <t>This document uses IP-based tunnel technologies to support data plane
   transport. Security considerations described in <xref target="RFC7432"/> and in
   <xref target="RFC8365"/> are equally applicable.</t>

    </section>
   <section title="IANA Considerations">
   <t>IANA is requested to assign a new option class for the Ethernet
   option TLV from the First Come First Served range of the "Geneve Option
   Class" registry.</t>

 <figure align="left">
          <preamble/>

          <artwork align="left"><![CDATA[
   Option Class Description     Reference
   ------------ --------------- -------------
   XXXX         Ethernet option This document
  ]]></artwork>
        </figure>

   <t>IANA is requested to assign a new BGP Tunnel Encapsulation Attribute
   sub-TLV from the First Come First Served range of the BGP Tunnel
   Encapsulation Attribute Sub-TLVs registry.</t>

 <figure align="left">
          <preamble/>

          <artwork align="left"><![CDATA[
   BGP Tunnel Attribute Sub-TLV Description                Reference
   ---------------------------- -------------------------- -------------
   XXXX                         Geneve tunnel option types This document
  ]]></artwork>
        </figure>
  </section>

  <section title="Acknowledgements">

  <t>The authors wish to thank T. Sridhar for his input, feedback, and
   helpful suggestions.</t>

  </section>
  </middle>
  <back>
    <references title="Normative References">

      <?rfc include="reference.RFC.2119"?>
      <?rfc include="reference.RFC.7432"?>
      <?rfc include="reference.RFC.8317"?>
      <?rfc include="reference.RFC.4271"?>
      <?rfc include="reference.RFC.5512"?>
      <?rfc include="reference.RFC.8926"?>
      <?rfc include="reference.RFC.9012"?>
      <?rfc include="reference.RFC.8365"?>
      <?rfc include="reference.I-D.ietf-nvo3-encap"?>
 
    </references>

  </back>
</rfc>

