RIFT Working Group                                    A. Przygienda, Ed.
Internet-Draft                                          Juniper Networks
Intended status: Standards Track                             7 July 2024
Expires: 8 January 2025


                         AD-RIFT: Adaptive RIFT
                    draft-przygienda-rift-adrift-00

Abstract

   Adaptive RIFT (AD-RIFT for short) extends RIFT to carry additional
   link and node information, primarily related to traffic engineering.
   This enables the southbound computation to place traffic optimally
   depending on available resources and usage.  Additionally, a
   selective disaggregation, similar to negative disaggregation, is
   introduced that allows balancing northbound forwarding toward
   specific prefixes between nodes at the same level depending on the
   resources available.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 8 January 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

Przygienda               Expires 8 January 2025                 [Page 1]

Internet-Draft                    RIFT                         July 2024

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Revised
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Extra Node Information TIE _AdditionalNodeInfoTIEElement_
   3.  Traffic Engineering Prefix Preference
       _TEPrefixPreferenceTIEElement_
   4.  Schema Extensions
   5.  Normative References
   Author's Address

1.  Introduction

   RIFT in its original form [I-D.ietf-rift-rift] does not propagate any
   metrics besides bandwidth and general cost.  It also abstains from
   propagating dynamic metrics across the fabric.  Environments that
   rely on, for lack of a better term, adaptive routing need additional
   metrics; adaptive here means reacting to link usage or to more
   complex metrics such as delay.  One such use case is the placement of
   long-lived, fairly fat flows in the fabric.  With some additional
   metrics and mechanisms, RIFT is ideally suited to perform this task
   in a distributed way because it does not depend on shortest-path
   forwarding.

2.  Extra Node Information TIE _AdditionalNodeInfoTIEElement_

   To start with, a fairly obvious mechanism is introduced: a new TIE,
   identical in scope to a node TIE.  Those TIEs SHOULD be flooded at
   low priority.  They carry different link metrics per traffic class in
   _AdditionalNodeInfoTIEElement_, which help in computing forwarding in
   the southbound direction.  Since the additional information contains
   many variables describing link properties for advanced computation,
   the lack of such TIEs from a node may lead to its links not being
   considered for forwarding.

3.  Traffic Engineering Prefix Preference _TEPrefixPreferenceTIEElement_

   For scaling reasons, nodes in the south direction do not see the full
   topology and under normal conditions operate using only a default
   route.  Another mechanism therefore has to be used to steer traffic
   according to available resources for prefixes that are exceptional in
   terms of available resources.  A similar mechanism already exists in
   RIFT in the form of negative disaggregation, which can be considered
   the most extreme form of traffic steering, namely complete refusal to
   accept traffic to a prefix.

   Accordingly, a new TIE type _TEPrefixPreferenceTIEElement_ is
   introduced that carries a negative preference with regard to a
   traffic class for specific prefixes and follows the prefix TIE
   flooding scope.  This mechanism requires that during north
   computation, just as with negative disaggregation, a node modifies
   the next-hop gateway weights based on the next-hops of the less
   specific aggregate.

   In the southern direction the analogy to negative disaggregation
   holds as well.  A ToF node performs computation from the viewpoints
   of all nodes in its plane and, on realizing that it has fewer
   resources to a prefix than other ToFs, issues a
   _TEPrefixPreferenceTIEElement_.

   The analogy to negative disaggregation is imperfect, however, when it
   comes to transitivity.  Generally, not all nodes northbound will
   issue a _TEPrefixPreferenceTIEElement_, since the node with the best
   resource availability will abstain from generating it.  Hence, the
   prefix is always reachable, and northbound all nodes at the same
   level see the same preference.  However, a node could again generate
   a _TEPrefixPreferenceTIEElement_ for the prefix southbound if it sees
   a large imbalance northbound of it, though an extended BAD
   computation can take care of this as well.
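
   The ToF decision and the resulting next-hop weighting described above
   can be sketched as follows.  This is only a minimal illustration
   under stated assumptions: the function names, the use of 127 (the
   largest i8 value) as "maximum preference", and the linear scaling of
   preference to available resources are hypothetical and are not
   defined by this document.

```python
from typing import Dict, Iterable

# Assumption: the largest i8 value stands for "maximum preference".
MAX_PREFERENCE = 127


def preferences_to_advertise(available: Dict[str, int]) -> Dict[str, int]:
    """Preference each ToF would advertise for one prefix and traffic class.

    `available` maps a ToF name to the resources (e.g. Mbit/s) it has
    toward the prefix. The ToF with the best resources abstains, so it
    is absent from the result. The linear scaling is an illustrative
    assumption only.
    """
    best = max(available.values())
    adverts: Dict[str, int] = {}
    for tof, resources in available.items():
        if resources < best:
            # Fewer resources than the best ToF: advertise a reduced preference.
            adverts[tof] = (MAX_PREFERENCE * resources) // best
    return adverts


def next_hop_weights(next_hops: Iterable[str],
                     adverts: Dict[str, int]) -> Dict[str, int]:
    """Weight the next-hops of the less specific aggregate for the prefix.

    A next-hop without an advertisement keeps the maximum preference.
    """
    return {nh: adverts.get(nh, MAX_PREFERENCE) for nh in next_hops}
```

   With available resources of 400, 100 and 200 for three hypothetical
   ToFs "tof1", "tof2" and "tof3", only "tof2" and "tof3" advertise a
   preference, while "tof1" stays silent and therefore keeps the maximum
   weight, so the prefix remains reachable.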
   In terms of ordering, negative disaggregation computation has to be
   performed first; only after the corresponding gateways have been
   constructed or purged can the TE prefix preference modify the gateway
   preferences further.  Observe that negative disaggregation can
   operate on a completely different prefix resolution than TE
   preference, i.e., negative disaggregation may choose to suppress
   forwarding to a prefix, but TE preference may disaggregate a more
   specific prefix and still choose to forward towards it even though
   negative disaggregation has already suppressed a less specific prefix
   toward some of the gateways.

   Just as with negative disaggregation, the lack of a prefix preference
   advertisement from a node MUST be interpreted as the node accepting
   traffic to the prefix with maximum preference.

4.  Schema Extensions

   Because the changes to the TIEType add a new TIE type with a scope
   different from a prefix TIE scope, a new major version of the schema
   MUST be introduced, i.e., AD-RIFT is not compatible with normal RIFT.
   An attempt could nevertheless be made to introduce a minor version
   with corresponding _NodeCapabilities_ extensions, given that the TIEs
   are completely optional.

   /** Type of TIE. */
   enum TIETypeType {
       Illegal = 0,
       TIETypeMinValue = 1,
       /** first legal value */
       NodeTIEType = 2,
       PrefixTIEType = 3,
       PositiveDisaggregationPrefixTIEType = 4,
       NegativeDisaggregationPrefixTIEType = 5,
       PGPrefixTIEType = 6,
       KeyValueTIEType = 7,
       ExternalPrefixTIEType = 8,
       PositiveExternalDisaggregationPrefixTIEType = 9,
       TEPrefixPreferenceTIEType = 10,
       AdditionalNodeInfoTIEType = 11,
       TIETypeMaxValue = 12,
   }

   typedef i8 TrafficClassType

   /** The preference to carry traffic to this prefix compared to
       other nodes in the same level.
       Higher numbers show higher preference to carry traffic */
   typedef i8 RelativePreferenceType

   struct TEPrefixPreferenceTIEElement {
       1: required map> prefixes;
   }

   typedef i32 DelayinNanoSecType
   typedef i32 DelayVariationPerMilliSecinNanoSecType
   typedef i32 PacketLossPerBillionPacketsType
   typedef i32 AdministrativeGroupType

   struct AdditionalLinkInfoTypePerTrafficClassType {
       1: optional PacketLossPerBillionPacketsType loss;
       2: optional DelayVariationPerMilliSecinNanoSecType
                   delay_variation;
       3: optional common.BandwithInMegaBitsType used_bandwidth;
       4: optional common.BandwithInMegaBitsType
                   maximum_available_bandwidth;
   }

   struct AdditionalLinkInfoType {
       1: optional DelayinNanoSecType min_delay;
       2: optional DelayinNanoSecType max_delay;
       3: optional common.BandwithInMegaBitsType bandwidth;
       4: optional AdministrativeGroupType admin_group;
       10: optional map per_class_info;
   }

   struct AdditionalNodeInfoTIEElement {
       1: optional map links;
   }

   /** Single element in a TIE. */
   union TIEElement {
       /** Used in case of enum common.TIETypeType.NodeTIEType. */
       1: optional NodeTIEElement node;
       /** Used in case of enum common.TIETypeType.PrefixTIEType. */
       2: optional PrefixTIEElement prefixes;
       /** Positive prefixes (always southbound). */
       3: optional PrefixTIEElement positive_disaggregation_prefixes;
       /** Transitive, negative prefixes (always southbound) */
       5: optional PrefixTIEElement negative_disaggregation_prefixes;
       /** Externally reimported prefixes. */
       6: optional PrefixTIEElement external_prefixes;
       /** Positive external disaggregated prefixes
           (always southbound). */
       7: optional PrefixTIEElement
                   positive_external_disaggregation_prefixes;
       /** Key-Value store elements.
        */
       9: optional KeyValueTIEElement keyvalues;
       /** **/
       10: optional TEPrefixPreferenceTIEElement
                    traffic_engineering_preference_prefix;
       11: optional AdditionalNodeInfoTIEElement additional_node;
       /** **/
   }

                Figure 1: RIFT information distribution

5.  Normative References

   [I-D.ietf-rift-rift]
              Przygienda, T., Head, J., Sharma, A., Thubert, P.,
              Rijsman, B., and D. Afanasiev, "RIFT: Routing in Fat
              Trees", Work in Progress, Internet-Draft,
              draft-ietf-rift-rift-24, 23 May 2024.

Author's Address

   Tony Przygienda (editor)
   Juniper Networks
   1137 Innovation Way
   Sunnyvale, CA 94089
   United States of America
   Email: prz@juniper.net