Internet-Draft Precise Priority-based Flow Control Noti June 2026
Xiong & Zhu Expires 14 December 2026
Workgroup:
tsvwg
Internet-Draft:
draft-xz-rtgwg-ppfc-notification-00
Published:
Intended Status:
Standards Track
Expires:
14 December 2026
Authors:
Q. Xiong
ZTE Corporation
X. Zhu
ZTE Corporation

Precise Priority-based Flow Control Notification

Abstract

This document specifies the notification mechanism for Precise Priority-based Flow Control (PPFC), defining the message formats and network actions for communicating congestion information among network nodes. The PPFC notification enables rapid congestion signaling, so that traffic can be flow-controlled per flow at its source.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 14 December 2026.

1. Introduction

Driven by the rapid development of big data and artificial intelligence (AI) technologies, the demand for high-performance, low-latency data transmission has become critical across various industries. Data transfers typically rely on transport protocols such as the Transmission Control Protocol (TCP), Quick UDP Internet Connections (QUIC), or Remote Direct Memory Access (RDMA) over Converged Ethernet version 2 (RoCEv2). These protocols primarily employ end-to-end congestion control algorithms, which adjust sending rates based on network status feedback by manipulating congestion windows. Traditional end-to-end congestion control mechanisms depend on signals such as packet loss, Explicit Congestion Notification (ECN) marks, and changes in Round-Trip Time (RTT).

A fundamental limitation of these end-to-end approaches is that the congestion feedback path is coupled with the data return path. When congestion occurs, the notification must traverse the reverse path to the sender, incurring at least one RTT of delay. In long-distance or highly dynamic networks, this delay prevents timely rate adjustment, leading to prolonged congestion, increased latency, and packet loss.
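To make the cost of this feedback delay concrete, the following sketch estimates how much data a sender keeps injecting at line rate before a congestion signal reaches it. The function name, the link rate, and the delay values are illustrative assumptions and do not come from this document.

```python
def bytes_sent_before_feedback(rate_gbps: int, feedback_delay_us: int) -> int:
    """Bytes transmitted at line rate while the congestion signal is in transit.

    1 Gbit/s sustained for 1 microsecond is 1000 bits, so the product of
    the two arguments is scaled by 1000 and then converted to bytes.
    """
    bits = rate_gbps * 1000 * feedback_delay_us
    return bits // 8


# Assumed example values: a 100 Gbit/s sender with a feedback delay of
# 10 us (short path) and 1000 us (long-distance path).
for delay_us in (10, 1000):
    print(delay_us, "us:", bytes_sent_before_feedback(100, delay_us), "bytes")
```

Under these assumed values, a 10 us delay corresponds to 125,000 bytes already sent, and a 1000 us delay to 12,500,000 bytes, which is why a longer feedback path translates directly into deeper queues or loss at the congested node.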

1.1. Motivation and Overview

Precise Priority-based Flow Control (PPFC) addresses this limitation by letting the network pause traffic sources on a per-flow basis through explicit notifications, without waiting for end-to-end feedback. The proposed PPFC mechanism operates as follows. When a network node (e.g., a switch or router) detects congestion, it can generate a PPFC Notification message. This message is sent to the ingress gateway(s) from which the congested flows originated. The notification contains key information such as:

  • The congested destination identifier (e.g., IP address).

  • A recommended flow control action (e.g., PAUSE) and its duration.

  • Optional context about the congestion severity.

This document specifies the notification mechanism for PPFC, defining the message formats and network actions for communicating congestion information. The PPFC notification enables rapid congestion signaling, so that traffic can be flow-controlled per flow at its source.

2. Conventions Used in This Document

2.1. Abbreviations

RTT: Round-Trip Time

TCP: Transmission Control Protocol

RDMA: Remote Direct Memory Access

QUIC: Quick UDP Internet Connections

ECN: Explicit Congestion Notification

PFC: Priority-based Flow Control

PPFC: Precise Priority-based Flow Control

RoCEv2: RDMA over Converged Ethernet version 2

QP: Queue Pair

2.2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3. PPFC Notification

When per-flow congestion occurs at Node Y, as shown in Figure 1, Node Y triggers a PPFC notification. The notification is sent to the ingress or the gateway in an ICMP or UDP message, or to the client in a RoCEv2 message. This document defines the message formats and network actions for communicating congestion information to the ingress or the gateway.

 PPFC Notification in RoCEv2           PPFC Notification
         **************************************************
         *                    *                           *
         *                    *                           *
         *                    *                           *
         V                    V                           *
    +--------+ Traffic     +-------+     +-------+     +---+---+     +-------+
    |Client A|<----------->|Gateway|<--->|Node X |<--->|Node Y |<--->|Server |
    +--------+      +----->+-------+     +-------+     +-------+     +-------+
             Traffic|                                 Congestion
    +--------+      |                                   Occurs
    |Client B+<-----+
    +--------+

                         Figure 1: PPFC Notification
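The network action taken at the ingress or gateway on receipt of a notification can be sketched as a per-flow pause table. This is a minimal illustration, not a normative procedure: the class and method names are hypothetical, the flow identifier is assumed to be an integer key, and the PT code points are those listed in Section 4.1. This document does not detail the behavior for "alarm" and "hold", so the sketch leaves the flow state unchanged for them.

```python
import enum


class ActionType(enum.IntEnum):
    """PT code points as listed in Section 4.1."""
    STOP = 0b00
    RESUME = 0b01
    ALARM = 0b10
    HOLD = 0b11


class IngressFlowTable:
    """Hypothetical per-flow pause state kept at an ingress or gateway."""

    def __init__(self) -> None:
        # Maps a flow identifier to the time (in microseconds) at which
        # its pause expires.
        self._paused_until_us: dict[int, int] = {}

    def on_notification(self, flow_id: int, action: ActionType,
                        pause_duration_us: int, now_us: int) -> None:
        """Apply one received PPFC notification to the flow state."""
        if action == ActionType.STOP:
            self._paused_until_us[flow_id] = now_us + pause_duration_us
        elif action == ActionType.RESUME:
            self._paused_until_us.pop(flow_id, None)
        # ALARM and HOLD: handling is an open assumption; no state change.

    def may_send(self, flow_id: int, now_us: int) -> bool:
        """Return True if the flow is not paused at time now_us."""
        deadline = self._paused_until_us.get(flow_id)
        if deadline is None:
            return True
        if now_us >= deadline:
            # The recommended pause interval has elapsed.
            del self._paused_until_us[flow_id]
            return True
        return False
```

Only the flow named in the notification is paused; other flows sharing the same ingress keep sending, which is the per-flow granularity that PPFC aims for.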

4. PPFC Notification Message Formats

4.1. ICMP Message Format

The PPFC Notification message sent to the ingress or the gateway may be an ICMP message, formatted as shown in Figure 2.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type = TBD1   |  Code = TBD2  |           Checksum            |
   +------------+--+---------------+-------------------------------+
   |     Flags  |PT|    Priority   |       Congestion Port         |
   +------------+--+---------------+-------------------------------+
   |       Reserved                |       Pause Duration          |
   +-------------------------------+-------------------------------+
   ~                    Source Address                             ~
   +---------------------------------------------------------------+
   ~                    Destination Address                        ~
   +---------------------------------------------------------------+
   ~                    Router Identifier                          ~
   +---------------------------------------------------------------+
   |                    Flow Identifier                            |
   +-------------------------------+-------------------------------+

           Figure 2: PPFC Notification Message Format with ICMP

Where:

  • Type and Code: TBD1 (8 bits) and TBD2 (8 bits). These fields indicate the PPFC notification type and code.

  • PT (2 bits): indicates the action type that the ingress applies to the traffic on receiving the PPFC notification. It can be set to 00 "stop", 01 "resume", 10 "alarm", or 11 "hold".

  • Priority (8 bits): corresponds to the IP DSCP or the flow priority.

  • Router Identifier (variable): indicates the IPv4 or IPv6 address of the congested node.

  • Congestion Port (16 bits): identifier of the congested port.

  • Pause Duration (16 bits): recommended pausing interval, in microseconds.

  • Source Address (variable): indicates the IPv4 or IPv6 address of the sender.

  • Destination Address (variable): indicates the IPv4 or IPv6 address of the receiver.

  • Flow Identifier (variable): identifies the flow. It carries the IP 5-tuple when TCP or QUIC data is transmitted, and it can also be mapped at the ingress to the source and destination QP when RDMA data is transmitted.
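The layout in Figure 2 can be illustrated with a short encoder. The sketch below makes several assumptions that this document leaves open: the Type and Code values are placeholders for TBD1 and TBD2, all three addresses are IPv4, and the Flow Identifier is treated as an opaque 32-bit value. The function and constant names are hypothetical. Flags occupies the six most significant bits of the first octet and PT the two least significant bits, following the bit order of the figure.

```python
import ipaddress
import struct

PPFC_ICMP_TYPE = 253  # placeholder for TBD1 (assumption, not assigned)
PPFC_ICMP_CODE = 0    # placeholder for TBD2 (assumption, not assigned)


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP (see RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pack_ppfc_icmp(flags: int, pt: int, priority: int, congestion_port: int,
                   pause_us: int, src: str, dst: str, router_id: str,
                   flow_id: int) -> bytes:
    """Build a PPFC ICMP message (IPv4 addresses, 32-bit flow id assumed)."""
    if not 0 <= flags < 64 or not 0 <= pt < 4:
        raise ValueError("Flags is 6 bits wide and PT is 2 bits wide")
    if not 0 <= pause_us <= 0xFFFF:
        raise ValueError("Pause Duration must fit in 16 bits")
    # Flags/PT octet, Priority, Congestion Port, Reserved, Pause Duration.
    body = struct.pack("!BBHHH", (flags << 2) | pt, priority,
                       congestion_port, 0, pause_us)
    for address in (src, dst, router_id):
        body += ipaddress.IPv4Address(address).packed
    body += struct.pack("!I", flow_id)
    # The checksum is computed with the Checksum field set to zero.
    header = struct.pack("!BBH", PPFC_ICMP_TYPE, PPFC_ICMP_CODE, 0)
    checksum = internet_checksum(header + body)
    return struct.pack("!BBH", PPFC_ICMP_TYPE, PPFC_ICMP_CODE, checksum) + body
```

Because Pause Duration is a 16-bit count of microseconds, the longest interval a single notification can recommend is 65,535 microseconds; the encoder rejects larger values rather than truncating them.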

4.2. UDP Message Format

The PPFC Notification message sent to the ingress or the gateway may be a UDP message, formatted as shown in Figure 3.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        UDP Source Port        |  UDP Destination Port = TBD3  |
   +-------------------------------+-------------------------------+
   |          UDP Length           |         UDP Checksum          |
   +------------+--+---------------+-------------------------------+
   |     Flags  |PT|    Priority   |       Congestion QID          |
   +------------+--+---------------+-------------------------------+
   |       Reserved                |       Pause Duration          |
   +-------------------------------+-------------------------------+
   ~                    Source Address                             ~
   +---------------------------------------------------------------+
   ~                    Destination Address                        ~
   +---------------------------------------------------------------+
   |                    Flow Identifier                            |
   +-------------------------------+-------------------------------+

        Figure 3: PPFC Notification Message Format with UDP

Where:

  • UDP Header: The UDP header as specified in [RFC768] includes the UDP source port, UDP destination port, UDP length, and UDP checksum.

  • UDP Destination Port: TBD3, a new well-known UDP destination port to be allocated for the PPFC notification message.

  • The other fields are identical to the fields defined in Section 4.1.
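As an illustration of the UDP encapsulation, the sketch below prepends the UDP header of [RFC768] to an already encoded PPFC body. The port number is a placeholder for TBD3 and the function name is hypothetical. For brevity the checksum is left at zero, which [RFC768] defines as "no checksum computed" when UDP is carried over IPv4; a deployment over IPv6, or one that wants integrity protection, would compute the checksum instead.

```python
import struct

PPFC_UDP_PORT = 49152  # placeholder for TBD3 (assumption, not assigned)


def wrap_in_udp(body: bytes, source_port: int) -> bytes:
    """Prepend a UDP header to a PPFC notification body."""
    # The UDP Length field covers the 8-octet header plus the data.
    length = 8 + len(body)
    if length > 0xFFFF:
        raise ValueError("datagram does not fit in the 16-bit Length field")
    # A checksum of zero means "not computed" (IPv4 only; assumption here).
    header = struct.pack("!HHHH", source_port, PPFC_UDP_PORT, length, 0)
    return header + body
```

In practice an implementation that sends through the operating system's UDP socket interface would hand over only the body and let the stack generate this header; the explicit packing here is only meant to mirror the fields named in this section.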

5. Security Considerations

To be discussed in future versions of this document.

6. IANA Considerations

This document requests that IANA allocate a new ICMP message type (TBD1) and a new UDP port (TBD3) for the PPFC notification.

7. Normative References

[RFC768]
Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI 10.17487/RFC0768, August 1980, <https://www.rfc-editor.org/rfc/rfc768>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

Authors' Addresses

Quan Xiong
ZTE Corporation
Xiangyang Zhu
ZTE Corporation