RTGWG                                                           Q. Xiong
Internet-Draft                                                    C. Gao
Intended status: Informational                           ZTE Corporation
Expires: 4 January 2025                                           Z. Han
                                                            China Unicom
                                                                 G. Zhao
                                                            China Mobile
                                                                   W. Qu
                                                           China Telecom
                                                             3 July 2024

Requirements for High-performance Wide Area Networks
draft-xiong-rtgwg-requirements-hp-wan-00

Abstract

Many applications such as big data and intelligent computing demand massive data transmission between data centers, which needs to ensure data integrity and provide stable and efficient transmission services in wide area networks and metropolitan area networks. This document outlines the requirements for High-performance Wide Area Networks (HP-WAN).

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 4 January 2025.

Copyright Notice

Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.

Xiong, et al.            Expires 4 January 2025                 [Page 1]

Internet-Draft  Requirements for High-performance Wide A       July 2024

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1. Introduction
   1.1. Requirements Language
2. Terminology
3. Primary Goals
   3.1. Extremely Low or Zero Packet Loss Ratio
   3.2. Low Long-distance Delay and Jitter
   3.3. Ultra-high Bandwidth Utilization
4. Requirements
   4.1. Support High-precision Flow Control
   4.2. Support Congestion Control based on End-network Coordination
   4.3. Support Multi-path Load Balance
   4.4. Support the Differentiated Traffic Scheduling
   4.5. Support Flow-based Network Monitoring
5. Security Considerations
6. IANA Considerations
7. Acknowledgements
8. References
   8.1. Normative References
Authors' Addresses

1. Introduction

Big data and intelligent computing are undergoing rapid development. Many applications require massive data transmission between data centers, which needs to ensure data integrity and provide stable and efficient transmission services in Wide Area Networks (WAN) and Metropolitan Area Networks (MAN). The use cases have been discussed in [I-D.xiong-rtgwg-requirements-hp-wan].
The industry needs to solve problems such as long distances, slow feedback, multiple paths, load balancing and low throughput. Compared with an ordinary WAN, a High-performance Wide Area Network (HP-WAN) places higher performance requirements, such as ultra-high bandwidth utilization and ultra-low packet loss ratio, to ensure effective high-throughput transmission. The topology in HP-WAN is complicated, with long distances and multiple hops, paths and domains, and the services are massive and concurrent, with multiple types and different traffic models such as elephant flows with short intervals, high speed and large data scale.

The network requirements demand high performance, such as high-throughput data transmission between data centers. Throughput is viewed as the main performance indicator, and it is affected by long-distance delay, jitter and packet loss ratio. For example, massive data transmission between data centers mainly depends on transport layer protocols such as the Transmission Control Protocol (TCP), Remote Direct Memory Access (RDMA) and QUIC. The throughput decreases dramatically when the packet loss ratio exceeds a threshold value. An extremely low packet loss ratio, or even zero packet loss, greatly reduces the bandwidth consumed by retransmissions.

Existing technologies in data centers, e.g. Priority-based Flow Control (PFC) [IEEE 802.1qbb] and Explicit Congestion Notification (ECN) [RFC3168], have problems in large-scale networks due to various service types, massive data, large bursts, and high Round-Trip Time (RTT) and jitter. It will be challenging to achieve high-throughput transmission in HP-WAN.

This document outlines the requirements for High-performance Wide Area Networks (HP-WAN).

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Terminology

The terminology is defined as follows.

High-performance Wide Area Networks (HP-WAN): a WAN or MAN that places higher performance requirements, such as ultra-high bandwidth utilization and ultra-low packet loss ratio, to ensure effective high-throughput transmission.

Abbreviations used in this document:

PFC: Priority-based Flow Control
ECN: Explicit Congestion Notification
ECMP: Equal-Cost Multipath
RTT: Round-Trip Time
TCP: Transmission Control Protocol
RDMA: Remote Direct Memory Access
QUIC: a UDP-based transport protocol (a name, not an acronym)
WAN: Wide Area Network
MAN: Metropolitan Area Network

3. Primary Goals

The primary goal of HP-WAN is to ensure effective high-throughput transmission of massive data, with performance indicators such as ultra-high bandwidth utilization, zero packet loss ratio and low latency. For instance, the throughput of TCP can be computed as follows:

   Throughput = min{BW, WindowSize/RTT, (MSS/RTT)*(1/P)}

where BW is the maximum bandwidth, WindowSize is the size of the window, MSS is the maximum segment size, RTT is the round-trip time, and P is the square root of the packet loss ratio.

3.1. Extremely Low or Zero Packet Loss Ratio

According to the throughput formula, packet loss is negatively correlated with throughput: the lower the packet loss ratio, the higher the throughput. According to the experimental data, for TCP, the throughput decreases by up to 89.9% when the packet loss ratio is 2%.
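As a rough illustration of the throughput formula in Section 3, the following Python sketch evaluates its three bounds for one set of inputs. The function name, the units (bytes, seconds, bit/s) and the sample values are assumptions made for this example and are not part of the formula itself. The sketch does not attempt to reproduce the experimental figures quoted above.

```python
import math


def tcp_throughput_bps(bw_bps, window_bytes, mss_bytes, rtt_s, loss_ratio):
    """Evaluate min{BW, WindowSize/RTT, (MSS/RTT)*(1/P)} in bit/s.

    Units are an assumption of this sketch: sizes in bytes, RTT in seconds.
    """
    window_limit = window_bytes * 8 / rtt_s
    if loss_ratio <= 0:
        # With zero loss the loss-based term is unbounded, so it drops out.
        return min(bw_bps, window_limit)
    p = math.sqrt(loss_ratio)  # P in the text: square root of the loss ratio
    loss_limit = (mss_bytes * 8 / rtt_s) * (1 / p)
    return min(bw_bps, window_limit, loss_limit)


# Hypothetical inputs: 10 Gbit/s link, 64 MiB window, 1460-byte MSS, 30 ms RTT.
for loss in (0.0, 0.0001, 0.001, 0.02):
    rate = tcp_throughput_bps(10e9, 64 * 2**20, 1460, 0.030, loss)
    print(f"loss={loss:.4%}  throughput={rate / 1e6:.1f} Mbit/s")
```

For any non-zero loss ratio in this example the loss-based term is the smallest of the three, so the result falls as the loss ratio rises.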
For RDMA, the throughput decreases dramatically when the packet loss ratio is greater than 0.1%, and a 2% packet loss ratio effectively reduces the throughput to zero. It is therefore important to ensure an extremely low or zero packet loss ratio to achieve high-throughput data transmission in HP-WAN.

3.2. Low Long-distance Delay and Jitter

According to the throughput formula, the RTT is negatively correlated with throughput: the lower the RTT, the higher the throughput. However, the RTT is affected by the long-distance latency. For example, when the distance between the data centers is 500 kilometers, the RTT is about 5 ms; when the distance is 3000 kilometers, the RTT is about 30 ms. According to the experimental data, when the jitter exceeds 300~500 us, the throughput decreases dramatically. It is therefore required to guarantee low long-distance delay and jitter to achieve high-throughput data transmission in HP-WAN.

3.3. Ultra-high Bandwidth Utilization

It is important to reserve sufficient bandwidth to achieve high-throughput transmission for a single flow. But for massive concurrent flows in a network with fixed resources, bandwidth utilization is the key aspect. Ultra-high bandwidth utilization refers to the efficient use of available network capacity to maximize data transfer rates and minimize latency. This is particularly important in scenarios where high volumes of data need to be transferred quickly. It is therefore required to improve bandwidth utilization to achieve high-throughput data transmission for multiple concurrent services in HP-WAN.

4. Requirements

The challenges of high-throughput transmission in HP-WAN come from massive concurrent services and from long-distance delay, jitter and packet loss. The existing network technologies have various problems and cannot meet the demands.
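To make the long-distance delay challenge concrete, the sketch below estimates the propagation RTT for a given distance and the resulting bandwidth-delay product, that is, the amount of data in flight that a sender must keep outstanding to fill the link. The propagation speed of about 200,000 km/s in fiber and the 100 Gbit/s link rate are assumptions of this example. Queuing and processing delays are ignored.

```python
FIBER_KM_PER_S = 200_000.0  # assumed propagation speed in fiber (about 2/3 of c)


def propagation_rtt_s(distance_km):
    """Round-trip propagation delay in seconds, ignoring queuing and processing."""
    return 2.0 * distance_km / FIBER_KM_PER_S


def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to keep the link full."""
    return bandwidth_bps * rtt_s / 8.0


# 100 Gbit/s is a hypothetical link rate chosen for illustration.
for km in (500, 3000):
    rtt = propagation_rtt_s(km)
    bdp_mb = bdp_bytes(100e9, rtt) / 1e6
    print(f"{km} km: RTT ~{rtt * 1e3:.0f} ms, BDP at 100 Gbit/s ~{bdp_mb:.1f} MB")
```

Under these assumptions the 500 km and 3000 km cases give about 5 ms and 30 ms, which matches the figures in Section 3.2.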
This document outlines the requirements for high-throughput data transmission in HP-WAN.

4.1. Support High-precision Flow Control

Flow control refers to a method for controlling the rate of data transmission so that data is transmitted efficiently and reliably, a fast sender does not overwhelm a slow receiver, and packet loss is prevented under congestion. PFC (Priority-based Flow Control) [IEEE 802.1qbb] is a hop-by-hop, priority-based flow control method that provides a backpressure mechanism by which the receiver signals the sender to slow down its transmission rate. Because of the long links and transmission delay in a WAN, it is required to configure reasonable thresholds and increase buffering to sustain effective throughput without packet loss. PFC creates 8 virtual channels on a link and assigns a priority to each channel, allowing any one of the virtual channels to be paused and restarted individually. The existing flow control mechanism is based on port and priority, of which there is only a limited number, while there are multiple services of various types with different traffic requirements. It is required to provide fine-grained and high-precision flow control to reduce the impact between different traffic flows.

4.2. Support Congestion Control based on End-network Coordination

Congestion control refers to a method for controlling the total amount of data entering the network to maintain the traffic at an acceptable level. The difference between congestion control and flow control is that flow control protects the receiver, while congestion control protects the network. As per [RFC3168], ECN defines an end-to-end congestion notification mechanism based on the IP and transport layers.
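The ECN signalling defined in [RFC3168] can be sketched as follows. The two-bit ECN field occupies the low-order bits of the IPv4 TOS or IPv6 Traffic Class octet. The function names and the simplified "mark or drop" decision are assumptions made for this illustration. A real device applies an active queue management policy before deciding to mark.

```python
# ECN field codepoints from RFC 3168 (two low-order bits of the TOS octet).
NOT_ECT = 0b00  # transport is not ECN-capable
ECT_1 = 0b01    # ECN-capable transport, codepoint ECT(1)
ECT_0 = 0b10    # ECN-capable transport, codepoint ECT(0)
CE = 0b11       # congestion experienced


def ecn_of(tos):
    """Extract the ECN field from a TOS / Traffic Class octet."""
    return tos & 0b11


def mark_on_congestion(tos):
    """Return (new_tos, dropped) for a packet that meets a congested queue.

    Simplified: ECN-capable packets are marked CE, others are dropped.
    """
    if ecn_of(tos) == NOT_ECT:
        return tos, True  # loss is the only available congestion signal
    return (tos & 0xFC) | CE, False  # keep the DSCP bits, set CE


tos = 0xB8 | ECT_0  # DSCP bits 0xB8 chosen arbitrarily for the example
new_tos, dropped = mark_on_congestion(tos)
print(f"before={tos:#04x} after={new_tos:#04x} dropped={dropped}")
```

The mark carries only the fact that congestion occurred, not where or how severe it is, which is the limitation discussed in this section.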
When congestion occurs, a network device marks packets to carry congestion information to the receiver, and the receiver echoes it back to the sender so that the source adjusts its transmission rate. Long-distance transmission over thousands of kilometers results in extremely long link delays, which in turn delay the network state feedback. In addition, ECN is inefficient because a 1-bit signal cannot convey detailed congestion information, and it mainly relies on passive congestion control, adjusting the rate only after congestion signals are received. It is required to improve congestion control by enhancing the IP network capability to achieve end and network coordination in WAN. For example, a device could initiate the notification directly to the source and provide precise notification information. A device in the network may further perform proactive congestion control.

4.3. Support Multi-path Load Balance

Load balance refers to a method for allocating load (traffic) to multiple links for forwarding. When transmitting intelligent computing services, the traffic consists mainly of elephant flows, and network resources are insufficient in WAN. Uneven network load leads to decreased network throughput and low link utilization. To improve bandwidth utilization, it is required to implement multi-path load balance to achieve low latency, zero packet loss and high throughput in WAN.

There are three optional methods: flow-based ECMP, flowlet-based load balance and packet-based load balance. As per [RFC7424], Link Aggregation Group (LAG) and Equal-Cost Multipath (ECMP) are used for bandwidth scaling. ECMP hashes the 5-tuple to achieve per-flow load balancing and link backup, and it suits scenarios with a large number of flows.
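A minimal sketch of per-flow ECMP selection is given below. SHA-256 is used only as a stand-in for the device-specific hash function, and the addresses, ports and 40 Gbit/s flow rates are hypothetical values chosen for the example.

```python
import hashlib


def ecmp_link(five_tuple, n_links):
    """Pick a link index from a hash of (src, dst, proto, sport, dport)."""
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()  # stand-in for a device hash
    return int.from_bytes(digest[:4], "big") % n_links


def link_loads(flows, n_links):
    """Sum the offered rate (Gbit/s) that lands on each link."""
    loads = [0.0] * n_links
    for five_tuple, rate_gbps in flows:
        loads[ecmp_link(five_tuple, n_links)] += rate_gbps
    return loads


# Five hypothetical 40 Gbit/s elephant flows over four equal-cost links.
flows = [
    (("192.0.2.1", "198.51.100.1", 6, 50000 + i, 443), 40.0)
    for i in range(5)
]
loads = link_loads(flows, n_links=4)
print(loads)
```

Every packet of a flow follows the same link, so ordering is preserved. With five flows and four links, at least one link must carry two elephant flows whatever hash is used. With many small flows the per-link sums even out, but with a few large flows they do not.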
With massive elephant flows, hash collisions and poor load balancing become a challenge. For example, flow-based ECMP may map several elephant flows onto the same link, resulting in congestion and packet loss. Packet-based load balance results in out-of-order packets. Flowlet-based load balance can distribute sub-flows to different paths, but the time gap between sub-flows needs to be accurately configured based on the delay information of the multiple paths. Deterministic technology can be implemented to guarantee the latency and jitter.

4.4. Support the Differentiated Traffic Scheduling

Traffic scheduling refers to a method for managing and allocating the flow of data packets within a network to optimize performance and utilize network resources efficiently. Considering the multiple services of various types with different traffic requirements, traffic is required to be scheduled onto multiple paths and resources to meet differentiated QoS requirements. Existing technologies that guarantee deterministic latency, such as resource reservation, network slicing and queuing-based solutions, can be used to provide zero packet loss, long-distance latency and jitter guarantees, and high reliability in WAN. For example, some flows may require low latency while others tolerate loose latency, so different queuing and scheduling functionalities are required.

4.5. Support Flow-based Network Monitoring

When a fault occurs in data transmission, the network should help discover the root causes of hard-to-debug network problems and identify the node that is dropping packets. Bandwidth monitoring is also important for network planning and service-level assurance, with which network operators can predict bandwidth availability and guarantee high-throughput transmission.
Performance monitoring is a critical aspect of managing and optimizing networks. It covers end-to-end measurement of packet loss, latency and jitter as well as hop-by-hop node ID, node delay, queue and buffer information. As per [RFC9232], network telemetry is a technology for gaining network insight and facilitating efficient and automated network management. It is required to provide flow-based network monitoring based on telemetry, which makes it easier to troubleshoot issues and monitor bandwidth, traffic and performance.

5. Security Considerations

This document lists the requirements for HP-WAN and does not raise any security concerns or issues in addition to those common to networking, which may have security considerations from both the use-specific perspective and the technology-specific perspective.

6. IANA Considerations

This document makes no requests for IANA action.

7. Acknowledgements

The authors would like to acknowledge Yao Liu, Zheng Zhang and Bin Tan for their thorough review and very helpful comments.

8. References

8.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.

[RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition of Explicit Congestion Notification (ECN) to IP", RFC 3168, DOI 10.17487/RFC3168, September 2001.

[RFC7424] Krishnan, R., Yong, L., Ghanwani, A., So, N., and B. Khasnabish, "Mechanisms for Optimizing Link Aggregation Group (LAG) and Equal-Cost Multipath (ECMP) Component Link Utilization in Networks", RFC 7424, DOI 10.17487/RFC7424, January 2015.

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017.

[RFC8664] Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W., and J. Hardwick, "Path Computation Element Communication Protocol (PCEP) Extensions for Segment Routing", RFC 8664, DOI 10.17487/RFC8664, December 2019.

[RFC9232] Song, H., Qin, F., Martinez-Julia, P., Ciavaglia, L., and A. Wang, "Network Telemetry Framework", RFC 9232, DOI 10.17487/RFC9232, May 2022.

Authors' Addresses

Quan Xiong
ZTE Corporation
China
Email: xiong.quan@zte.com.cn

Chenqiang Gao
ZTE Corporation
China
Email: gao.chenqiang@zte.com.cn

Zhengxin Han
China Unicom
China
Email: hanzx21@chinaunicom.cn

Guangyu Zhao
China Mobile
China
Email: zhaoguangyu@chinamobile.com

Wenkuan Qu
China Telecom
China
Email: quwk@chinatelecom.cn