Internet-Draft Switching Efficiency April 2026
Ye, et al. Expires 21 October 2026
Workgroup:
IP Performance Measurement
Internet-Draft:
draft-ye-ippm-switching-efficiency-01
Published:
April 2026
Intended Status:
Informational
Expires:
21 October 2026
Authors:
N. Ye
Shanghai Jiao Tong University
W. Sun
Shanghai Jiao Tong University
D. Wang
China Mobile Research Institute
J. Sun
China Mobile Research Institute

Switching Efficiency: A Metric Framework for AI Data Center Networks

Abstract

This document specifies the Switching Efficiency Framework, a measurement methodology designed to evaluate network efficiency in AI Data Centers (AIDCs). Conventional network metrics, such as bandwidth utilization or network throughput, fail to directly link network activity to computational progress, as they cannot distinguish computationally effective data that directly advances neural network computing from the redundant traffic induced by both multi-hop forwarding and the algorithmic overhead of collective operations.

To address this, the framework's core metric, Switching Efficiency, quantifies the computationally effective data throughput delivered per unit of provisioned switching capacity. To facilitate precise diagnostic analysis, the framework further decomposes this core metric into three fine-grained factors: Data Efficiency, Routing Efficiency, and Port Utilization.

This framework provides network operators with standardized quantitative metrics to pinpoint communication bottlenecks and evaluate topology-traffic alignment.

About This Document

This note is to be removed before publishing as an RFC.

Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ye-ippm-switching-efficiency/.

Discussion of this document takes place on the ippm Working Group mailing list (mailto:ippm@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/ippm/. Subscribe at https://www.ietf.org/mailman/listinfo/ippm/.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 21 October 2026.

Table of Contents

1.  Introduction
2.  Conventions and Definitions
3.  Terminology
4.  The Switching Efficiency Framework
  4.1.  Core Variables
  4.2.  Core Metric: Switching Efficiency ($\eta$)
  4.3.  Fine-Grained Efficiency Factors
    4.3.1.  Data Efficiency ($\gamma$)
    4.3.2.  Routing Efficiency ($\delta$)
    4.3.3.  Port Utilization ($\theta$)
5.  Measurement Methodology
6.  Security Considerations
7.  IANA Considerations
8.  References
  8.1.  Normative References
  8.2.  Informative References
Acknowledgments
Authors' Addresses

1. Introduction

In hyperscale AI Data Centers (AIDCs), network communication is frequently the primary performance bottleneck for training Large Language Models (LLMs). While diverse network topologies and communication algorithms (e.g., In-Network Computing) are being deployed, operators lack a standardized, quantitative methodology to evaluate how effectively raw physical switching resources are converted into actual training progress.

Conventional performance metrics, such as bandwidth utilization or network throughput, are inadequate for this environment because they measure absolute network "busyness" rather than useful work. Specifically, they treat all transferred bytes equally, failing to isolate "computationally effective data"—the net data that directly advances neural network computing. For example, during an All-Reduce operation, significant volumes of data are transferred across the fabric only to be discarded after mathematical reduction (algorithmic overhead). Similarly, when the physical topology fails to match the spatial distribution of the workload—such as forcing logically localized, high-volume traffic to cross the broader scale-out fabric—data must traverse an excessive number of forwarding hops (multi-hop overhead). Because traditional metrics conflate these redundancies with effective data delivery, operators cannot accurately quantify how well a specific network architecture aligns with its intended AI traffic patterns.

To bridge this gap, this document defines the Switching Efficiency Framework [SwitchingEfficiencyPaper], which relates the throughput of effective data to the aggregate switching capacity of the network through its core metric, Switching Efficiency ($\eta$). This top-level metric is further decomposed into three diagnostic factors to evaluate specific architectural design choices:

  • Data Efficiency ($\gamma$) tests the communication algorithm, verifying whether it delivers computationally effective data or generates redundant bytes.

  • Routing Efficiency ($\delta$) tests the topology-traffic alignment, revealing whether the physical network provides direct paths or forces traffic into excessive multi-hop detours.

  • Port Utilization ($\theta$) tests hardware resource allocation, assessing whether the provisioned switching capacity is actively utilized rather than wasted.

By formalizing these metrics, this document equips network operators and telemetry systems with a standardized, mathematically precise toolset to diagnose AIDC network performance, pinpoint communication bottlenecks, and optimize infrastructure design.

2. Conventions and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Terminology

4. The Switching Efficiency Framework

This section defines the Switching Efficiency Framework. The detailed mathematical derivations supporting this framework are provided in [SwitchingEfficiencyPaper]. For operational measurement, the following metrics are formulated as cumulative volumes over a defined observation window $T$.

4.1. Core Variables

The framework relies on four primary operational metrics collected over the measurement window $T$:

  • $V_{CED}$ (Total CED Volume): The aggregate volume of Computationally Effective Data yielded by all communication primitives completed during $T$.

  • $V_{RECV}$ (Total Received Volume): The aggregate volume of data successfully received by the network interfaces (e.g., NICs) of all compute nodes during $T$.

  • $V_{FWD}$ (Total Forwarded Volume): The aggregate volume of data forwarded by all packet switching ports across the network domain during $T$.

  • $C_{TOTAL}$ (Aggregate Switching Capacity): The sum of the theoretical maximum unidirectional egress data forwarding rates of all packet switching ports, denoted as $\sum R_p$, where $R_p$ represents the theoretical maximum data rate of an individual port $p$.
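As a non-normative illustration, $C_{TOTAL}$ can be derived from a simple port inventory. The sketch below (all names and figures hypothetical) converts port rates from Gb/s to bytes per second so that the capacity is commensurate with the volume counters $V_{CED}$, $V_{RECV}$, and $V_{FWD}$:

```python
def aggregate_capacity(port_rates_gbps):
    """C_TOTAL: sum of the theoretical maximum unidirectional egress
    rates of all switching ports, converted from Gb/s to bytes/s."""
    return sum(rate * 1e9 / 8 for rate in port_rates_gbps)

# Hypothetical fabric: 4 switches x 64 ports x 400 Gb/s each.
c_total = aggregate_capacity([400.0] * (4 * 64))
print(c_total)  # 12800000000000.0 bytes/s (1.28e13)
```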

4.2. Core Metric: Switching Efficiency ($\eta$)

Switching Efficiency ($\eta$) is the top-level metric quantifying how effectively a network translates its raw physical capacity into computational progress. It is defined as the ratio of the CED throughput over observation window $T$ to the aggregate switching capacity of the network.

       V_CED / T
  η = -----------
        C_TOTAL

A high $\eta$ indicates that a large proportion of the network's provisioned hardware capacity is successfully contributing to the delivery of computationally effective data. It serves as a holistic macro-indicator of end-to-end network effectiveness.

4.3. Fine-Grained Efficiency Factors

To enable diagnostic analysis and isolate specific performance bottlenecks, $\eta$ is mathematically decomposed into three independent efficiency factors ($\eta = \gamma \cdot \delta \cdot \theta$):
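The decomposition is an algebraic identity, obtained by inserting $V_{RECV}$ and $V_{FWD}$ as intermediate terms in the definition of $\eta$:

```latex
\eta \;=\; \frac{V_{CED}/T}{C_{TOTAL}}
     \;=\; \underbrace{\frac{V_{CED}}{V_{RECV}}}_{\gamma}
     \cdot \underbrace{\frac{V_{RECV}}{V_{FWD}}}_{\delta}
     \cdot \underbrace{\frac{V_{FWD}}{C_{TOTAL}\cdot T}}_{\theta}
```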

4.3.1. Data Efficiency ($\gamma$)

Data Efficiency evaluates the effectiveness of implementing the communication primitives. It specifies the ratio of Computationally Effective Data ($V_{CED}$) to the total received volume ($V_{RECV}$).

         V_CED
  γ = -----------
         V_RECV
  • Diagnostic Focus: Identifies data reception redundancy. A value of $\gamma < 1$ indicates that compute endpoints receive unreduced data (e.g., during All-Reduce operations without In-Network Computing (INC)). Executing mathematical reductions within the network data plane via INC resolves this redundancy, driving $\gamma$ to its theoretical maximum of 1.
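As a non-normative illustration (the ring cost model, not the framework, is the assumption here): in a ring All-Reduce of an $S$-byte buffer across $N$ workers without INC, each endpoint receives roughly $2S(N-1)/N$ bytes in order to obtain an $S$-byte reduced result, so $\gamma = N / (2(N-1))$, approaching 1/2 for large $N$:

```python
def gamma_ring_allreduce(n_workers: int) -> float:
    """Data Efficiency of a ring All-Reduce without INC.

    Assumes V_CED = S per endpoint (the reduced result) and
    V_RECV = 2 * S * (N - 1) / N per endpoint (reduce-scatter
    followed by all-gather); the buffer size S cancels out.
    """
    return n_workers / (2 * (n_workers - 1))

print(gamma_ring_allreduce(2))    # 1.0
print(gamma_ring_allreduce(256))  # ~0.502
```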

4.3.2. Routing Efficiency ($\delta$)

Routing Efficiency quantifies the topological alignment between the physical network architecture and the AI workload traffic patterns.

        V_RECV
  δ = ---------
         V_FWD
  • Diagnostic Focus: Identifies multi-hop forwarding overhead and potential packet retransmissions. Mathematically, assuming a perfectly lossless network environment, $\delta$ represents the inverse of the volume-weighted average hop count. A value of $\delta < 1$ indicates that traffic either traverses multiple switching ports or experiences network congestion leading to drops and subsequent retransmission overhead.
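A minimal sketch of this interpretation, assuming a lossless network (no retransmissions inflating $V_{FWD}$) and a hypothetical traffic mix:

```python
def routing_efficiency(flows):
    """delta = V_RECV / V_FWD, i.e. the inverse of the
    volume-weighted average switch-hop count.

    `flows` is a list of (received_bytes, hop_count) pairs, where
    hop_count is the number of switching ports that forwarded the
    flow; V_FWD counts each byte once per forwarding port.
    """
    v_recv = sum(v for v, _ in flows)
    v_fwd = sum(v * h for v, h in flows)
    return v_recv / v_fwd

# 80% of traffic stays under one ToR switch (1 hop); 20% crosses a
# three-tier fabric (5 traversals: leaf-spine-core-spine-leaf).
print(routing_efficiency([(80, 1), (20, 5)]))  # ~0.556
```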

4.3.3. Port Utilization ($\theta$)

Port Utilization measures the spatial and temporal engagement of the provisioned switching capacity.

           V_FWD
  θ = ---------------
       C_TOTAL * T
  • Diagnostic Focus: Identifies underutilized switching capacity. A low $\theta$ indicates that the provisioned hardware ($C_{TOTAL}$) operates below its theoretical maximum data rate over the observation window $T$, due to either spatial traffic imbalance or temporal idleness.
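Putting the three factors together (illustrative input values only; real inputs come from telemetry), the product of $\gamma$, $\delta$, and $\theta$ recovers $\eta$ exactly:

```python
def switching_efficiency(v_ced, v_recv, v_fwd, c_total, t):
    """Compute eta and its decomposition (gamma, delta, theta) over
    an observation window of t seconds; volumes in bytes, c_total in
    bytes/s."""
    gamma = v_ced / v_recv          # Data Efficiency
    delta = v_recv / v_fwd          # Routing Efficiency
    theta = v_fwd / (c_total * t)   # Port Utilization
    eta = (v_ced / t) / c_total     # Switching Efficiency
    return eta, gamma, delta, theta

eta, gamma, delta, theta = switching_efficiency(
    v_ced=5e12, v_recv=1e13, v_fwd=4e13, c_total=1.28e13, t=10)
print(eta, gamma * delta * theta)  # both 0.0390625
```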

5. Measurement Methodology

This section specifies the operational procedures for collecting the variables required to compute the efficiency metrics. Accurate measurement requires tight time synchronization (e.g., via the Precision Time Protocol (PTP) [IEEE1588]) across all network and compute endpoints, as well as an observation window ($T$) sufficiently large to dilute telemetry polling variance.

The four core variables span the application plane ($V_{CED}$, yielded by the completed communication primitives), the endpoint plane ($V_{RECV}$, observed at compute-node NICs), and the network plane ($V_{FWD}$ and $C_{TOTAL}$, derived from switch port counters and the port inventory).
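For instance, $V_{FWD}$ can be accumulated per port from periodic egress-octet counter snapshots (e.g., ifHCOutOctets-style fixed-width counters). The sketch below uses hypothetical readings and unwraps at most one counter wrap per polling interval:

```python
def forwarded_volume(snapshots, counter_bits=64):
    """Accumulate a port's forwarded bytes over the window from
    chronologically ordered counter readings, tolerating wraps."""
    wrap = 1 << counter_bits
    total = 0
    for prev, curr in zip(snapshots, snapshots[1:]):
        total += (curr - prev) % wrap  # valid if <= 1 wrap per interval
    return total

# Toy 8-bit counter: reads 100, then 200, then wraps around to 50.
print(forwarded_volume([100, 200, 50], counter_bits=8))  # 206
```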

6. Security Considerations

The operational deployment of this measurement framework raises security and privacy considerations: the collected telemetry can reveal workload, traffic, and topology characteristics of the AIDC, so its transport and storage warrant protection, e.g., via TLS [RFC8446], IPsec [RFC4301], or the security mechanisms of the SNMP management architecture [RFC3411].

7. IANA Considerations

This document has no IANA actions.

8. References

8.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC3411]
Harrington, D., Presuhn, R., and B. Wijnen, "An Architecture for Describing Simple Network Management Protocol (SNMP) Management Frameworks", STD 62, RFC 3411, DOI 10.17487/RFC3411, December 2002, <https://www.rfc-editor.org/rfc/rfc3411>.
[RFC4301]
Kent, S. and K. Seo, "Security Architecture for the Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, December 2005, <https://www.rfc-editor.org/rfc/rfc4301>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.
[RFC8446]
Rescorla, E., "The Transport Layer Security (TLS) Protocol Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, <https://www.rfc-editor.org/rfc/rfc8446>.

8.2. Informative References

[IEEE1588]
IEEE, "IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems", IEEE Std 1588-2019.
[SwitchingEfficiencyPaper]
Ye, N., Zhu, J., Chen, B., Wang, D., Sun, J., Sun, W., and W. Hu, "Switching Efficiency: A Novel Framework for Dissecting AI Data Center Network Efficiency", arXiv 2604.14690, DOI 10.48550/arXiv.2604.14690, <https://doi.org/10.48550/arXiv.2604.14690>.

Acknowledgments

We are grateful for the valuable discussions and input from the community, and we thank NSFC for its support.

Authors' Addresses

Niangen Ye
Shanghai Jiao Tong University
China
Weiqiang Sun
Shanghai Jiao Tong University
China
Dong Wang
China Mobile Research Institute
Department of Fundamental Network Technology
Beijing
China
Jiang Sun
China Mobile Research Institute
Department of Fundamental Network Technology
Beijing
China