Fast Network Notifications                                       H. Song
Internet-Draft                                    Futurewei Technologies
Intended status: Informational                                   J. Dong
Expires: 25 December 2026                                         Huawei
                                                            23 June 2026

A Framework for Fast Network Notifications
draft-song-fann-framework-00

Abstract

Many network applications, ranging from Artificial Intelligence (AI) / Machine Learning (ML) training and inference to large-scale cloud services, require networks with various combinations of high bandwidth, low delay, low jitter, and minimal packet loss. Meeting these requirements depends on the network's ability to adapt rapidly to faults, signal degradation, and congestion. The companion problem statement describes why existing mechanisms are too slow, too coarse, or too resource-intensive to react within the timescales at which modern forwarding hardware can detect and disseminate intended conditions.

This document defines a framework for Fast Network Notifications (FANN). It describes a reference architecture, the functional roles involved in generating and consuming notifications, an information model, delivery and scoping models, procedures for discovery, registration, and subscription, and the integration of fast network notifications with existing Layer 2 to 4 mechanisms. This framework is intended to guide the development of one or more fast network notification protocol specifications.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 25 December 2026.

Copyright Notice

Copyright (c) 2026 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1. Introduction
  1.1. Scope
  1.2. Requirements Language
2. Terminology
3. Fast Network Notification Framework
  3.1. Design Principles and Goals
  3.2. Deployment Scenarios
  3.3. Reference Architecture
    3.3.1. Functional Roles
    3.3.2. Notification Lifecycle
    3.3.3. Detection Assumptions and Constraints
  3.4. Information Model
  3.5. Delivery and Scoping
    3.5.1. Delivery Modes
    3.5.2. Transport Considerations
    3.5.3. Notification Domain and Isolation
  3.6. Discovery, Registration, and Subscription
  3.7. Loop Prevention, Filtering, and Damping
4. Realization and Operational Considerations
  4.1. Integration with Existing Technologies
  4.2. Candidate Realization Approaches
  4.3. Illustrative Applications
  4.4. Scaling Considerations
  4.5. Operational Considerations
5. Security Considerations
6. IANA Considerations
7. References
  7.1. Normative References
  7.2. Informative References
Acknowledgements
Authors' Addresses

1. Introduction

Modern high-performance networks, in particular data center (DC) and data center interconnect (DCI) fabrics serving AI/ML and cloud workloads, demand rapid adaptation to changing network conditions. A single fiber link failure, signal degradation, or transient congestion event can stall a distributed training job, waste compute and energy, and degrade service experience [I-D.ietf-rtgwg-net-notif-ps].

Contemporary forwarding hardware can detect link failures, signal degradation reported as link errors, queue buildup, microbursts, and output-queue congestion at microsecond to sub-millisecond timescales. However, the time required to disseminate this information to the remote nodes that can act on it typically far exceeds the detection time. This gap between detection and reaction is the central problem that fast network notifications address.

The Fast Network Notifications Problem Statement [I-D.ietf-rtgwg-net-notif-ps] documents the need for a fast notification mechanism and the limitations of existing approaches.
The companion requirements [I-D.geng-fantel-fantel-requirements] and gap analysis [I-D.geng-fantel-fantel-gap-analysis] documents elaborate the requirements and the deficiencies of current technologies. Building on these documents, this document defines a framework that describes the overall architecture, the functional roles, the information carried, how notifications are delivered and scoped, and how the mechanism integrates with existing protocols and technologies across layers.

This informational document does not define a wire protocol, encoding, or YANG model. Those are expected to be specified in separate protocol and management documents that build on this framework.

1.1. Scope

This framework applies to limited-domain networks under a single administrative control, consistent with the deployment assumptions of the FANN charter. It prioritizes the requirements of DC and DCI networks where rapid responsiveness is critical, while remaining applicable to other deployments such as wide-area backbone networks.

The framework initially targets notifications for link failures, signal degradation reported as link errors, and port queue congestion, while remaining extensible to additional conditions in the future.

The specific actions a recipient takes in response to a notification (for example fast reroute, adaptive load balancing, or rate adjustment) are out of scope of this framework; they are the responsibility of the consuming subsystem and the protocols that realize those actions.

In this document, "fast" does not denote a single rigid numerical threshold. It characterizes a class of mechanisms designed to minimize notification delivery time so that the latency is on the order of microseconds to milliseconds, depending on the operational objective and the diameter of the notification domain, and is substantially shorter than the Round-Trip Time (RTT) of the affected traffic.

This framework is solution-agnostic. It defines the functional roles, information model, and delivery and scoping models that a fast network notification solution is expected to instantiate, but it does not specify, mandate, or endorse any particular protocol, encoding, or solution document. It is intentionally general so that a range of realization approaches can conform to it, potentially in combination, without conflicting with one another or with this framework.

Consistent with the FANN charter, fast generation and consumption in the forwarding plane (ideally in hardware) is the primary design point and the means of meeting the latency targets described above; consumption by the control plane or management plane is a secondary objective, permitted only where it preserves routing stability and does not compromise forwarding-plane responsiveness.

Specific solutions are developed in separate documents; such documents are expected to map their behavior onto the roles and models defined here, and any capability they require that is not yet covered is expected to be accommodated as an extension of this framework rather than a departure from it.

1.2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Terminology

This document uses the following terms.

Fast Network Notification (FANN): An event-driven message that conveys a locally detected adverse network condition, or the recovery from such a condition, to one or more remote nodes within a defined notification domain, with the objective of low-latency delivery suitable for action in the forwarding plane.

Notification Originator: A functional role that detects an adverse condition (or its recovery) and generates a fast network notification. Also referred to as the originating or reporting node.

Notification Consumer: A functional role that receives a fast network notification and may take action based on it. Also referred to as the recipient node.

Notification Relay: A functional role that forwards or redistributes a notification toward additional consumers, optionally applying filtering, damping, or aggregation. A node may be both a relay and a consumer.

Notification Domain: A bounded region of the network, under a single administrative control, within which fast network notifications are generated, distributed, and consumed, and outside of which they are not propagated without explicit policy.

Notification Controller: An optional control-plane or management-plane entity that assists with discovery, registration, subscription, policy, and global optimization. It is not required to be on the fast notification delivery path.

Event: A change in network condition detected by a node, such as a link failure, signal degradation, or output-queue congestion, including the recovery from a previously reported condition.

The terms BFD [RFC5880], ECN [RFC3168], FRR [RFC4090] [RFC5714], and IOAM [RFC9197] are used as defined in their respective references.

3. Fast Network Notification Framework

This chapter defines the core of the framework: its design principles, the deployment scenarios it serves, the functional reference architecture, the information carried in notifications, and the models for delivering, scoping, discovering, and controlling them.

3.1. Design Principles and Goals

The framework is guided by the following principles, derived from the problem statement [I-D.ietf-rtgwg-net-notif-ps] and requirements [I-D.geng-fantel-fantel-requirements].

Event-driven, not periodic: Notifications are generated in response to detected events. This distinguishes fast network notifications from preconfigured periodic mechanisms such as BFD [RFC5880], which detect rather than disseminate.

Forwarding-plane optimized: The primary design point is fast generation and consumption in the forwarding plane, ideally in hardware, to meet responsiveness targets. Consumption by the control plane and management plane is a secondary objective and MUST NOT compromise routing stability.

Lightweight and bounded: Notification messages are compact and the system is designed to bound the load it places on the network, especially during the very events it reports. The notification system MUST NOT exacerbate a failure or congestion event.

Action-agnostic: A notification message conveys information; it does not mandate a specific reaction. A notification MAY explicitly indicate a recommended action, or the action MAY be determined implicitly by the consumer from the information carried.

Extensible: The information model and event taxonomy are extensible to additional conditions, metrics, and scopes without redefining the architecture. Extensibility is equally a principle for the protocol design: the notification message encoding SHOULD allow new information elements, event types, and optional fields to be added in a forward-compatible way, such that a receiver MUST skip or ignore any element it does not understand rather than discarding the notification. This lets the set of carried information evolve over time without breaking interoperability with existing implementations or requiring a new protocol version.

Complementary: Fast network notifications complement, and do not replace, existing OAM, control-plane, and telemetry mechanisms. They bridge the time gap between event onset and slower control-plane or telemetry-driven responses.

Scoped and isolated: Notifications are confined to a notification domain. Domain identification and isolation are first-class concerns.

Decoupled from routing convergence: Fast-changing network state SHOULD be conveyed by mechanisms that are separate from the routing protocol's database and its own flooding and best-path computation, so that high-frequency or transient events do not introduce churn, instability, or excessive recomputation in the routing control plane. This separation lets notifications be generated and refreshed at a fast pace independently of routing convergence, while any consumption by routing remains a secondary objective bounded by the routing-stability constraint above.

3.2. Deployment Scenarios

Fast network notifications apply across a range of network scenarios, but the time budget, processing constraints, and the mechanisms that are practical differ substantially between them. This framework does not assume a one-size-fits-all solution: the scenarios below have materially different characteristics, and a given deployment is expected to select the delivery mode, scope, and realization mechanism appropriate to its scenario rather than apply a single mechanism everywhere.

Intra-data-center fabric: Within a single DC fabric (for example a Clos topology), originators and consumers are typically a small number of hops apart, propagation delay is very low, and forwarding hardware can both detect and consume notifications. This scenario has the tightest time budget (sub-millisecond) and is the most amenable to forwarding-plane, in-band, or scoped-flooding delivery with actions such as adaptive load balancing or local repair. The dominant challenge is volume and rate of change at scale rather than propagation distance.

Single-hop DCI / point-to-point WAN: Between two sites or routers connected by one (logical) hop, the recipient set is small and often known in advance, favoring unicast or a directed notification to the upstream or ingress node. The time budget is dominated by the link propagation delay, which is fixed; the design goal is to add minimal processing delay on top of it so the notification still beats the affected traffic's end-to-end reaction loop.

Multi-hop managed DCI interconnect: Data centers may also be interconnected across multiple IP hops by a managed network, for example for AI collaborative computing and other DCI services. Unlike arbitrary WAN paths, this case is typically highly engineered: traffic follows deterministic, traffic-engineered (TE) paths, and network slicing can be used to isolate tenants. Because the path and the relevant upstream or ingress nodes are known in advance, notifications can be delivered by unicast or hop-by-hop along the path and scoped per slice or per tenant, and the resulting action can be applied at tenant or path granularity rather than only per physical link or node.

Multi-hop / arbitrary WAN paths: When notifications must reach nodes several hops away or across a wider domain, propagation delay, the number of potential recipients, and the risk of notification storms all grow. Time budgets are typically milliseconds rather than sub-millisecond, and subscription, relaying with filtering/aggregation, and bounded scoping become essential. Some timeliness targets achievable intra-DC may simply not be feasible here; the framework expects such cases to use more conservative techniques, and the feasibility of meeting a specific target is itself scenario-dependent.

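As a non-normative illustration of how the time budget differs between the scenarios above, the following Python sketch estimates one-way notification delay as propagation delay plus per-hop processing and compares it with the RTT of the affected traffic. The propagation constant (roughly 5 microseconds per kilometre of fiber), the Scenario type, the function names, and all distances, hop counts, and per-hop delays are assumptions chosen for illustration; none of them is defined by this framework.

```python
from dataclasses import dataclass

# Assumption: light in optical fiber covers roughly 200,000 km/s,
# i.e. about 5 microseconds per kilometre. Not a value from this document.
FIBER_US_PER_KM = 5.0


@dataclass(frozen=True)
class Scenario:
    """Hypothetical description of a notification path."""
    name: str
    distance_km: float  # originator-to-consumer fiber distance
    hops: int           # forwarding hops traversed by the notification
    per_hop_us: float   # assumed processing/queuing delay added per hop


def one_way_delay_us(s: Scenario) -> float:
    """Propagation delay plus per-hop processing for one notification."""
    return s.distance_km * FIBER_US_PER_KM + s.hops * s.per_hop_us


def beats_transport_loop(s: Scenario, traffic_rtt_us: float) -> bool:
    """True if the notification arrives within one RTT of the affected traffic."""
    return one_way_delay_us(s) < traffic_rtt_us


# Illustrative values only.
intra_dc = Scenario("intra-DC fabric", distance_km=0.5, hops=3, per_hop_us=1.0)
dci = Scenario("single-hop DCI", distance_km=100.0, hops=1, per_hop_us=1.0)
```

With these assumed values the intra-DC case comes to 5.5 microseconds, while in the 100 km single-hop case 500 of the 501 microseconds are fixed propagation delay, which is why the design goal there is to add as little processing delay as possible.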
The functional roles (Section 3.3), information model (Section 3.4), and delivery modes (Section 3.5) defined in this document are common across scenarios, but their realization and the achievable latency are not. Where a requirement (for example a sub-millisecond target) is stated, it SHOULD be understood as scoped to the scenario in which it is feasible.

3.3. Reference Architecture

3.3.1. Functional Roles

The framework defines four functional roles. A single physical or virtual network element MAY implement more than one role.

    +-------------------------------------------------+
    |               Notification Domain               |
    |                                                 |
    |    [Detect]    [Distribute]         [Act]       |
    |                                                 |
    |  +----------+  +-----------+    +-----------+   |
    |  |Originator|->|   Relay   |--->| Consumer  |   |
    |  | (detect/ |  | (forward/ |    | (receive/ |   |
    |  | generate)|  |  filter/  |    |  action)  |   |
    |  +----+-----+  |   damp)   |    +-----+-----+   |
    |       |        +-----+-----+          |         |
    |       |              |                |         |
    |       v              v                v         |
    |  ........................................       |
    |  :       Notification Controller        :       |
    |  : (discovery / registration / policy / :       |
    |  :         global optimization)         :       |
    |  ........................................       |
    +-------------------------------------------------+

          Figure 1: FANN Functional Roles Within One Domain

Notification Originator: Detects an event using local detection mechanisms (for example link fault detection, error counters, queue occupancy thresholds, or BFD [RFC5880] as a detection input) and generates a notification. The originator determines, by policy or signaling, the set of consumers and the delivery mode, and applies origination-side controls such as damping and rate limiting.

Notification Relay: Receives a notification and forwards it toward additional consumers. A relay MAY filter, aggregate, deduplicate, or damp notifications. Relays enable hop-by-hop and scoped-flooding delivery and allow load to be bounded inside the domain.

Notification Consumer: Receives a notification and may act on it in the data plane (for example rate adjustment, ECMP rebalancing, flow steering, or traffic pause), and/or pass the information to the control plane or management plane. A consumer MAY also be a relay.

Notification Controller: An optional entity that supports discovery, registration, subscription, and policy distribution, and that may consume notifications for global traffic-engineering or load-balancing optimization. The controller is not required to be on the fast delivery path and SHOULD NOT be a single point of failure for forwarding-plane reactions.

3.3.2. Notification Lifecycle

A fast network notification proceeds through the following stages.

1. Detection. A node observes an event at the forwarding plane using a local detection mechanism. Detection mechanisms are out of scope of this framework, but their output is the trigger for notification generation.

2. Generation. The originator constructs a notification populated from the information model (Section 3.4), subject to origination policy, damping, and rate limiting.

3. Delivery. The notification is delivered to the intended consumers using one of the delivery modes (Section 3.5), possibly via one or more relays.

4. Consumption. A consumer parses the notification and decides whether and how to act, based on the information carried and on any local state it holds.

5. Action. The consumer performs an action (out of scope of this framework) and may relay the notification further.

6. Recovery and withdrawal. When the condition clears, the originator may generate a recovery notification so consumers can revert or update their state. Recovery notifications are subject to damping so that flapping conditions do not generate excessive traffic.

3.3.3. Detection Assumptions and Constraints

The mechanisms by which a node detects an event are out of scope of this framework, but the framework assumes their existence and depends on their characteristics. Two assumptions are important.

First, the end-to-end responsiveness of a fast notification system is bounded by detection time as well as delivery time: a notification cannot be faster than the moment the originating node becomes aware of the condition. Detection latency, accuracy, and false-positive behavior therefore directly shape what the notification system can achieve, and an event that is detected slowly or unreliably limits the value of fast delivery.

Second, detection itself has a cost that interacts with scaling. For example, achieving fast liveness detection by running BFD [RFC5880] at very short transmit intervals consumes forwarding and control resources and does not by itself notify any node beyond the BFD endpoints. Driving detection intervals down to obtain faster notification can impose significant load, and this trade-off between detection speed and detection cost SHOULD be considered together with the notification load discussed in Section 4.4.

Where hardware can detect a condition directly (for example loss of signal, FEC errors, or queue-occupancy thresholds), such hardware-based detection is generally preferable to mechanisms that rely on periodic message exchange such as BFD. The relevant distinction is between hardware-based and protocol-session-based detection in terms of speed and overhead, rather than between polling and non-polling as such: a hardware mechanism may itself poll internally, but its detection speed and per-event cost are typically far lower than those of a protocol session driven to an aggressive interval.

3.4. Information Model

A fast network notification carries one or more information elements.
For a given scenario some elements are mandatory and others optional; the framework does not require all elements in every notification. The detailed encoding is left to protocol specifications. The information elements are:

Event Type: The class of event, for example failure, signal degradation, congestion, or performance degradation, and whether the notification reports onset or recovery. The event taxonomy is extensible.

Location of Event: An identifier of where the event occurred, for example a link, node, interface, or queue identifier. Location identifiers SHOULD be interpretable by consumers within the notification domain.

Fine-grained Network Status: Quantifiable metrics such as link utilization, available bandwidth, link capacity, queue length, level of congestion, link or node delay, jitter, and packet loss. Conveying such quantitative metrics, rather than a binary up/down indication, enables graduated and proportional responses such as weighted load-sharing adjustments.

Path Identification: Identification of the path affected by the event, allowing consumers to scope their reaction to specific paths.

Flow/Service Identification: Identification of an affected flow (for example a 5-tuple) or service, allowing differentiated, per-flow or per-service responses.

Timing and Validity: Optional event timestamp and a validity or hold time after which the reported condition should be considered stale absent refresh or recovery.

Action Hint: An optional explicit indication of a recommended action. When absent, the consumer determines the action implicitly from the other elements.

Origin and Sequence: Originator identity, which may be represented by the source IP address of the notification, and an optional sequence or epoch indicator.
The sequence or epoch supports ordering, deduplication, and loop detection at relays and consumers; it is RECOMMENDED where notifications may be relayed, flooded, or reordered, and MAY be omitted in simple cases such as single-hop unicast delivery where such protection is unnecessary.

A consistent information model across implementations is necessary for interoperability; defining the normative model and encodings is a task for the protocol specification.

3.5. Delivery and Scoping

3.5.1. Delivery Modes

Depending on the position and number of consumers, the framework supports the following delivery modes. A scenario MAY use more than one.

Unicast: Direct delivery to a single consumer. Suitable when the originator knows the specific node that must react (for example a designated ingress or upstream node).

Multicast / Point-to-Multipoint: Delivery to a selected group of consumers, for example along a service or forwarding path. Suitable when a defined set of nodes must react together.

Hop-by-hop: Delivery along a series of nodes on a specified path, with each node acting as a relay and possibly a consumer. Suitable for propagating awareness upstream along an affected path.

Scoped Flooding: Dissemination to all nodes within a bounded region of the domain. Suitable for critical events with many interested consumers, with special attention to control overhead and duplicate suppression.

3.5.2. Transport Considerations

Delivery MAY reuse existing messaging and transport mechanisms, or a new lightweight mechanism MAY be defined where existing ones cannot meet the latency or forwarding-plane processing targets. Regardless of the underlying transport, the delivery mechanism is responsible for timely delivery to the intended consumers and for bounding the load it introduces.

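As a non-normative illustration of the Origin and Sequence element (Section 3.4), the following Python sketch shows one way a relay or consumer could suppress duplicate, stale, or looped copies of a notification. The type and field names, the choice to track only the highest sequence per originator, and the assumption that each originator's sequence increases monotonically are all hypothetical; the normative model and encoding are left to the protocol specification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventType(Enum):
    """A few event classes from the information model (extensible)."""
    FAILURE = "failure"
    SIGNAL_DEGRADATION = "signal-degradation"
    CONGESTION = "congestion"


@dataclass
class Notification:
    """Hypothetical subset of the information elements."""
    event_type: EventType
    onset: bool                     # True = onset, False = recovery
    location: str                   # e.g. a link or interface identifier
    originator: str                 # originator identity, e.g. source IP
    sequence: Optional[int] = None  # optional sequence/epoch indicator


class DuplicateFilter:
    """Remembers the highest sequence accepted from each originator."""

    def __init__(self) -> None:
        self._latest = {}  # originator identity -> highest sequence seen

    def accept(self, n: Notification) -> bool:
        # Without a sequence no protection is possible; this is the
        # simple case (e.g. single-hop unicast), so accept as-is.
        if n.sequence is None:
            return True
        last = self._latest.get(n.originator)
        if last is not None and n.sequence <= last:
            return False  # duplicate, stale, or looped copy
        self._latest[n.originator] = n.sequence
        return True


# Usage: the second copy of the same notification is suppressed.
flt = DuplicateFilter()
first = Notification(EventType.FAILURE, True, "link-7", "192.0.2.1", sequence=1)
results = [flt.accept(first), flt.accept(first)]
```

Keeping only the highest sequence per originator also discards notifications that arrive out of order, which is one possible policy; a design that must tolerate reordering would need a window or per-event state instead.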
Because notifications are most valuable precisely when the network is under stress, the transport MUST support prioritization so that notifications are not delayed or dropped behind the very congestion they report. A notification that is queued behind the congested traffic loses most of its value. Prioritization can be realized using existing forwarding-plane mechanisms, including:

* DiffServ marking, for example a dedicated DSCP [RFC2474] code point mapped to a high-priority or low-latency per-hop behavior (for example Expedited Forwarding [RFC3246]) along the notification path, so that classification and queuing of notifications can be done in hardware at every hop;

* a strict-priority or low-latency queue, or a dedicated control-class queue, separated from user-data queues so notifications bypass congested data queues;

* at Layer 2, priority marking such as IEEE 802.1p / PCP where the delivery path traverses bridged segments.

The chosen marking and per-hop behavior MUST be consistent across the notification domain so that priority is honored end-to-end within the domain. Operators MUST be able to configure the marking, and the markings used for notifications SHOULD be reserved so that ordinary traffic cannot claim the same priority and so that notification traffic itself cannot be abused to obtain preferential treatment (Section 5). Because notifications occupy a high-priority class, their volume MUST be bounded by the rate limiting, damping, and filtering of Section 3.7 to avoid starving other control traffic.

Reliability requirements vary by scenario: some events warrant best-effort, low-latency delivery, while others (for example recovery state) may warrant acknowledgement or periodic refresh.

3.5.3. Notification Domain and Isolation

Fast network notifications are confined to a notification domain.
The framework requires mechanisms to:

* identify a notification domain and its membership;

* ensure notifications are not propagated outside the domain without explicit policy;

* prevent notifications from one domain being injected into or trusted by another.

Domain scoping bounds the blast radius of both legitimate notification storms and malicious injection, and it aligns the trust boundary with the single administrative control assumed by the charter.

3.6. Discovery, Registration, and Subscription

To deliver notifications only to interested and authorized consumers, the framework supports the following procedures. A deployment MAY use configuration, dynamic signaling, or a combination.

Discovery: Originators and consumers determine the existence, identity, and capabilities (event types, encodings, delivery modes) of relevant peers and relays within the domain.

Registration: A node registers as a potential originator or consumer within the domain, establishing the trust and addressing state needed for delivery.

Subscription: A consumer expresses interest in specific event types, locations, paths, flows, or metric thresholds. A subscription-based approach ensures each consumer receives only relevant information, reducing unnecessary overhead. Subscriptions MAY be brokered by a controller or established directly between nodes.

These procedures MAY be realized by reusing existing protocols where appropriate, or by new mechanisms defined in the protocol specification work.

3.7. Loop Prevention, Filtering, and Damping

Because relays may forward notifications and consumers may relay further, the solution MUST provide for:

Loop prevention: Use of origin identity, sequence/epoch indicators, scope limits (for example a hop or region bound), and duplicate suppression so that a notification does not circulate indefinitely.

Filtering and aggregation: Relays MAY filter notifications that are not relevant to downstream consumers and MAY aggregate multiple related events to reduce volume.

Damping: The solution MUST define where responsibility lies for handling rapidly changing conditions, such as a flapping link. Damping MAY be applied at the originator or at a transit relay, or it MAY be left to the consumer, in which case undamped notifications need to reach the consumer; the chosen location and its controls MUST be specified explicitly. A common policy is to report a degradation immediately but to delay reporting the corresponding recovery for a configurable interval to confirm stability.

4. Realization and Operational Considerations

This chapter describes how the framework relates to existing technologies, the candidate mechanisms that could realize it, the applications it enables, and the scaling and operational considerations that apply when deploying it. It is informational and does not mandate any particular realization.

4.1. Integration with Existing Technologies

A central goal of the framework is integration with existing mechanisms across layers, as required by the charter. Fast network notifications are complementary to these mechanisms.

Layer 2: Link-layer fault and error detection (for example physical-layer alarms, FEC error counters, and interface error statistics) are detection inputs to the originator. Layer 2 protection may act as a consumer's response.

Layer 3 / Routing: Fast network notifications complement IGP/BGP-based dissemination and FRR [RFC4090] [RFC5714]. Whereas a Point of Local Repair acts on a local topology view and may cause congestion on a backup path, a notification can give upstream nodes a wider view before they react. Consumption by routing protocols is a secondary objective and MUST preserve routing stability; notifications MUST NOT be allowed to induce control-plane churn or instability.
Topology and inventory models such as [RFC8345] may provide context for interpreting location and path identifiers.

Layer 4 / Transport: ECN [RFC3168] signals congestion to the transport sender only coarsely and over a full RTT. Fast network notifications can deliver richer congestion information to network nodes far sooner, but those nodes then act inside the network while end-to-end transport congestion control (TCP, QUIC, or RDMA/RoCE) acts at the endpoints, so the two loops run concurrently on the same traffic. A solution MUST ensure its actions remain a net-positive complement to transport: it SHOULD preserve per-flow ordering and avoid abrupt RTT or capacity changes where feasible, and SHOULD NOT suppress or rewrite end-to-end congestion signals such as ECN marks. Where the interaction cannot be shown to be benign, a conservative reaction is preferred; detailed coordination between network-side and transport-side reactions is for further study with the relevant transport working groups.

Detection and OAM: BFD [RFC5880] provides fast bidirectional fault detection between endpoints but does not notify other nodes; it can serve as a detection input to the originator. Obtaining faster detection by shortening BFD transmit intervals increases resource consumption, as discussed in Section 3.3.3. IOAM [RFC9197], the Alternate-Marking Method (AM) [RFC9341], and IPFIX [RFC7011] provide detailed data-plane measurements but are not designed for lightweight, rapid alerts to specific nodes for immediate action. Fast network notifications fill this gap and feed, rather than replace, telemetry pipelines. Performance metrics may be defined consistently with [RFC7799].

The interaction with each technology, including any required protocol extensions, is expected to be developed in the relevant IETF working groups.

4.2. Candidate Realization Approaches

This section surveys, non-normatively, classes of mechanism that could realize fast network notifications. It does not endorse a specific approach; the choice depends on the deployment scenario (Section 3.2), and a solution MAY combine more than one.

Advertisement of link/path status to neighbors, decoupled from the IGP: A node advertises the up/down status and quality of its links or paths to neighboring nodes, separately from the IGP and its link-state database, so that fast or frequent updates do not perturb routing convergence (see the decoupling principle in Section 3.1). [I-D.zzhang-rtgwg-router-info] illustrates this approach; note that what is advertised is link/path reachability (up/down) and quality, which is distinct from the node and link state that an IGP floods. It suits cases where the consumers are routing or forwarding elements that benefit from a wider view without incurring IGP churn.

IGP/BGP protocol extensions: Existing control-plane protocols could be extended to carry notification information. This reuses deployed machinery but must be weighed carefully against the routing-stability and overhead concerns in Section 4.1, and is generally better suited to slower-changing or control-plane-consumed information than to the fastest forwarding-plane reactions.

In-band / data-plane signaling: Notifications are carried in the forwarding plane, for example in packet headers or lightweight dedicated packets, so that detection, delivery, and consumption can occur in hardware. This offers the lowest latency and best matches the intra-DC scenario, at the cost of requiring forwarding-hardware support and careful scoping.
The need for such forwarding-plane notification in AI data center fabrics is motivated by [I-D.clad-rtgwg-ipfrr-aiml], which analyzes the limitations of existing IP-FRR in these fabrics and the requirements for its enhancement. As examples of the notification mechanisms, [I-D.camarillo-rtgwg-lsn] defines a fast notification protocol that operates above the Ethernet layer, and [I-D.csaszar-rtgwg-ipfrr-fn] proposed fast-notification-based IP-FRR optimization over a decade ago, with the companion [I-D.lu-fn-transport] defining a data-plane transport and message container for the notifications themselves.

Tunnel- or overlay-based delivery in the WAN: For multi-site or WAN deployments, notifications may be delivered over established tunnels or overlays toward ingress or upstream nodes; work such as [I-D.hzh-fantel-wan-tunnel] explores fast notification in this context for tunnel-based transport.

Telemetry-assisted collection toward the traffic source: Rather than pushing an alert outward from the detecting node, path latency and congestion state are accumulated in-band along the path and returned to the traffic source, so that the source obtains fresh path state and can steer traffic or adjust its congestion response. FALCON [I-D.song-rtgwg-falcon] realizes this by combining in-network telemetry with source routing and collecting the data on the reverse path toward the source, reducing the feedback lag to less than half the baseline RTT; it applies to both DCN and WAN and reuses existing IETF mechanisms.

Path- and slice-scoped flow control and backpressure: Congestion or available-bandwidth information is notified to upstream nodes along the forwarding path and acted upon as flow control, scoped per path segment or per slice so that control applies at tenant or task granularity. [I-D.liu-rtgwg-srv6-cc] uses SRv6 segments and slicing to throttle specific flows for lossless transmission, and [I-D.han-rtgwg-fine-grained-backpressure] extends Layer 2 PFC into the WAN with hop-by-hop backpressure messages and slice-based isolation. This approach suits the managed multi-hop DCI scenario of Section 3.2.

Each approach trades off latency, hardware dependence, protocol reuse, and impact on routing stability differently, and fits some scenarios in Section 3.2 better than others. Coordination when multiple recipients act on the same notification is out of scope and for further study.

4.3. Illustrative Applications

This section sketches, non-normatively, applications that fast network notifications enable. The actions themselves are out of scope (Section 1.1); they illustrate what the information in Section 3.4 makes possible.

Upstream (remote) protection: On a link or node failure notification, a node several hops upstream activates a pre-computed backup path instead of relying only on local repair, avoiding the hairpinning that purely local alternates (LFA or TI-LFA) can introduce. Efficient Remote Protection [I-D.clad-rtgwg-efficient-remote-protection] describes such a mechanism and applies both to failures and to degradations such as reduced capacity or congestion.

Fast protection in AI/ML fabrics: AI/ML fabrics need convergence within tens of microseconds, requiring notification and reaction within the forwarding plane without CPU intervention. [I-D.clad-rtgwg-ipfrr-aiml] analyzes the limitations of existing IP-FRR in such fabrics and the requirements for fast, forwarding-plane protection.

Graduated load-sharing and flow control: When a notification carries fine-grained status (utilization, available bandwidth, or capacity degradation) rather than a binary up/down, an upstream node can rebalance load-sharing weights in proportion to the reported severity, or apply per-flow, per-tenant, or per-slice flow control and backpressure toward the congestion point instead of the coarse, port-level pausing of link-layer PFC. Both responses are graduated and capacity-aware, shifting or throttling traffic gradually rather than all at once; the realizing mechanisms are surveyed in Section 4.2.

4.4. Scaling Considerations

The solution must remain effective as the network grows. Scaling pressure arises from network size (the number of nodes and links that may report events), the volume and rate of change of reported information, and the number of consumers. The design assumption is that if anything can go wrong it will, so the system must cope with a high proportion of nodes and links reporting simultaneously.

The framework addresses scale through subscription (delivering only relevant information), scoping and domain isolation (bounding propagation), relay-based filtering and aggregation, damping of rapidly changing conditions, and transport prioritization and rate limiting. Protocol specifications SHOULD quantify the load their mechanisms place on the forwarding and control planes under worst-case event conditions.

4.5. Operational Considerations

Fast network notifications introduce additional traffic. During the failures and congestion events they report, the notification system MUST NOT exacerbate the situation and SHOULD actively assist in mitigating it.
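As a non-normative illustration of how an originator or relay might avoid adding notification load while a link is flapping, the damping policy mentioned in Section 3.7 (report a degradation immediately, delay the recovery report until the link has proven stable) could be sketched in Python as follows. This is only a sketch under stated assumptions and is not part of any FANN protocol: the class name RecoveryDamper, the hold_down parameter, and the "DOWN"/"UP" return values are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecoveryDamper:
    """Asymmetric damping for one monitored link (hypothetical helper)."""

    hold_down: float                  # seconds the link must stay up before recovery is reported
    reported_up: bool = True          # last state reported to consumers
    up_since: Optional[float] = None  # start time of a pending (unreported) recovery

    def observe(self, link_up: bool, now: float) -> Optional[str]:
        """Feed one raw observation; return "DOWN", "UP", or None if suppressed."""
        if not link_up:
            self.up_since = None      # a new failure cancels any pending recovery
            if self.reported_up:
                self.reported_up = False
                return "DOWN"         # degradation is reported immediately
            return None               # already reported down: suppress the repeat
        if self.reported_up:
            return None               # link is up and consumers already know
        if self.up_since is None:
            self.up_since = now       # first up observation: start the stability timer
        if now - self.up_since >= self.hold_down:
            self.reported_up = True
            self.up_since = None
            return "UP"               # recovery reported only after the hold-down interval
        return None                   # still waiting to confirm stability
```

In this sketch a flapping link produces a single "DOWN" report followed by silence until the link has stayed up for hold_down consecutive seconds, so the number of notifications is bounded by the number of stable state changes rather than by the flap rate. Time is passed in explicitly so the behavior is deterministic; a real implementation would more likely be driven by hardware timers.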
Operators SHOULD be able to configure which event types trigger notifications, the delivery modes and scopes used, damping and rate-limiting parameters, and prioritization, so that notification behavior aligns with network operation policies.

Management and configuration of the solution are expected to be supported by YANG modules, to be defined as a separate deliverable consistent with the charter. Manageability includes observability of the notification system itself (counts, drops, damping events) so operators can verify it is helping rather than harming.

5. Security Considerations

If not properly authenticated and rate-limited, fast network notifications could be a denial-of-service vector: an attacker that injects or floods spurious notifications could trigger unnecessary re-convergence, path changes, or repeated state updates, and could induce state flapping to keep an originator busy. Notifications may also reveal sensitive operational information, whether by inspection or by an adversary registering as a consumer.

Accordingly, solutions built on this framework MUST provide integrity protection and origin authentication of notifications, MUST apply rate controls on both sending and receiving, and MUST address trust boundaries around domains and subscriptions, authorization of notification sources, and protection of sensitive operational data. Because stronger security can add latency, the trade-off between notification latency and security strength is considered per scenario. Domain identification and isolation (Section 3.5) are central to confining notifications to the trusted administrative boundary.

The charter's restriction to a single administrative control reduces, but does not eliminate, the threat surface.
Because the operator controls every originator, relay, and consumer and the trust boundary coincides with the notification domain (Section 3.5), the boundary can drop notifications arriving from outside it (constraining external injection and spoofing), exposure of operational data to third parties is bounded, and trust for discovery, registration, and subscription can reuse existing intra-domain infrastructure. This lets the design favor lightweight, low-latency mechanisms internally while concentrating stronger enforcement at the domain boundary.

This assumption does not remove the need for in-domain protection: insider threats, compromised or malfunctioning nodes, and the self-inflicted denial of service of a flapping link all originate inside the boundary. The requirements above therefore still apply within the domain, and the single-administrative-control premise should be treated as defense in depth rather than a substitute for them.

6. IANA Considerations

This document requires no IANA actions.

7. References

7.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

7.2. Informative References

[I-D.camarillo-rtgwg-lsn] Camarillo, P., Filsfils, C., Chachmon, N., Iny, O., Su, Y., and R. Jiang, "Lightspeed Notification Protocol", Work in Progress, Internet-Draft, draft-camarillo-rtgwg-lsn-00, 2 March 2026.

[I-D.clad-rtgwg-efficient-remote-protection] Clad, F., Filsfils, C., Su, Y., and D. Cai, "Efficient Remote Protection", Work in Progress, Internet-Draft, draft-clad-rtgwg-efficient-remote-protection-00, 2 March 2026.

[I-D.clad-rtgwg-ipfrr-aiml] Clad, F., Filsfils, C., Jiang, R., and D. Cai, "IP Fast Reroute for AI/ML Fabrics", Work in Progress, Internet-Draft, draft-clad-rtgwg-ipfrr-aiml-00, 2 March 2026.

[I-D.csaszar-rtgwg-ipfrr-fn] Csaszar, A., Enyedi, G. S., Tantsura, J., Kini, S., Sucec, J., and S. Das, "IP Fast Re-Route with Fast Notification", Work in Progress, Internet-Draft, draft-csaszar-rtgwg-ipfrr-fn-01, 25 February 2013.

[I-D.geng-fantel-fantel-gap-analysis] Geng, X., Dong, J., Cheng, W., Li, D., Zhu, Y., and H. Zhengxin, "Gap Analysis of Fast Notification for Traffic Engineering and Load Balancing", Work in Progress, Internet-Draft, draft-geng-fantel-fantel-gap-analysis-02, 26 February 2026.

[I-D.geng-fantel-fantel-requirements] Geng, X., Dong, J., Zhu, Y., Li, D., Cheng, W., and C. Liu, "Requirements of Fast Notification for Traffic Engineering and Load Balancing", Work in Progress, Internet-Draft, draft-geng-fantel-fantel-requirements-03, 26 February 2026.

[I-D.han-rtgwg-fine-grained-backpressure] Zhengxin, H., Ruan, Z., Pang, R., Yue, Y., Yao, J., and Q. Xiong, "Fine-Grained Flow Control Backpressure Mechanism for Wide Area Networks", Work in Progress, Internet-Draft, draft-han-rtgwg-fine-grained-backpressure-02, 7 June 2026.

[I-D.hzh-fantel-wan-tunnel] Hu, Z., Zhu, Y., Hu, J., and T. Pi, "Fast Notification for tunnel-based lossless RDMA transmission in WAN", Work in Progress, Internet-Draft, draft-hzh-fantel-wan-tunnel-02, 1 March 2026.

[I-D.ietf-rtgwg-net-notif-ps] Dong, J., McBride, M., Clad, F., Zhang, Z. J., Zhu, Y., Xu, X., Zhuang, R., Pang, R., Lu, H., Liu, Y., Contreras, L. M., Mehmet, D., and R. Rahman, "Fast Network Notifications Problem Statement", Work in Progress, Internet-Draft, draft-ietf-rtgwg-net-notif-ps-02, 7 May 2026.

[I-D.liu-rtgwg-srv6-cc] Liu, Y., Yao, J., Lin, C., and X. Min, "Congestion Control Based on SRv6 Path", Work in Progress, Internet-Draft, draft-liu-rtgwg-srv6-cc-01, 27 February 2026.

[I-D.lu-fn-transport] Lu, W., Kini, S., Csaszar, A., Enyedi, G. S., and J. Tantsura, "Transport of Fast Notification Messages", Work in Progress, Internet-Draft, draft-lu-fn-transport-05, 19 August 2013.

[I-D.song-rtgwg-falcon] Song, H., Wan, Y., and K. Zhu, "Fast Latency and Congestion Notification", Work in Progress, Internet-Draft, draft-song-rtgwg-falcon-00, 6 April 2026.

[I-D.zzhang-rtgwg-router-info] Zhang, Z. J., Wang, K., Lin, C., Vaidya, N., Tantsura, J., and Y. Liu, "Advertising Router Information", Work in Progress, Internet-Draft, draft-zzhang-rtgwg-router-info-06, 23 February 2026.

[RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, DOI 10.17487/RFC2474, December 1998.

[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI 10.17487/RFC3168, September 2001.

[RFC3246] Davie, B., Charny, A., Bennet, J.C.R., Benson, K., Le Boudec, J.Y., Courtney, W., Davari, S., Firoiu, V., and D. Stiliadis, "An Expedited Forwarding PHB (Per-Hop Behavior)", RFC 3246, DOI 10.17487/RFC3246, March 2002.

[RFC4090] Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090, DOI 10.17487/RFC4090, May 2005.

[RFC5714] Shand, M. and S. Bryant, "IP Fast Reroute Framework", RFC 5714, DOI 10.17487/RFC5714, January 2010.

[RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection (BFD)", RFC 5880, DOI 10.17487/RFC5880, June 2010.

[RFC7011] Claise, B., Ed., Trammell, B., Ed., and P. Aitken, "Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of Flow Information", STD 77, RFC 7011, DOI 10.17487/RFC7011, September 2013.

[RFC7799] Morton, A., "Active and Passive Metrics and Methods (with Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799, May 2016.

[RFC8345] Clemm, A., Medved, J., Varga, R., Bahadur, N., Ananthakrishnan, H., and X. Liu, "A YANG Data Model for Network Topologies", RFC 8345, DOI 10.17487/RFC8345, March 2018.

[RFC9197] Brockners, F., Ed., Bhandari, S., Ed., and T. Mizrahi, Ed., "Data Fields for In Situ Operations, Administration, and Maintenance (IOAM)", RFC 9197, DOI 10.17487/RFC9197, May 2022.

[RFC9341] Fioccola, G., Ed., Cociglio, M., Mirsky, G., Mizrahi, T., and T. Zhou, "Alternate-Marking Method", RFC 9341, DOI 10.17487/RFC9341, December 2022.

Acknowledgements

Authors' Addresses

Haoyu Song
Futurewei Technologies
Email: hsong@futurewei.com

Jie Dong
Huawei
Email: jie.dong@huawei.com