<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?>
<!-- generated by https://github.com/cabo/kramdown-rfc version 1.7.35 (Ruby 3.4.9) -->
<rfc xmlns:xi="http://www.w3.org/2001/XInclude" ipr="trust200902" docName="draft-calabria-bmwg-ai-fabric-inference-bench-01" category="info" consensus="true" submissionType="IETF" updates="2544" tocInclude="true" sortRefs="true" symRefs="true" version="3">
  <!-- xml2rfc v2v3 conversion 3.33.0 -->
  <front>
    <title abbrev="AI Inference Fabric Benchmarking">Benchmarking Methodology for AI Inference Serving Network Fabrics</title>
    <seriesInfo name="Internet-Draft" value="draft-calabria-bmwg-ai-fabric-inference-bench-01"/>
    <author initials="F." surname="Calabria" fullname="Fernando Calabria">
      <organization>Cisco</organization>
      <address>
        <email>fcalabri@cisco.com</email>
      </address>
    </author>
    <author initials="C." surname="Pignataro" fullname="Carlos Pignataro">
      <organization>Blue Fern Consulting</organization>
      <address>
        <email>carlos@bluefern.consulting</email>
      </address>
    </author>
    <author initials="Q." surname="Wu" fullname="Qin Wu">
      <organization>Huawei</organization>
      <address>
        <email>bill.wu@huawei.com</email>
      </address>
    </author>
    <author initials="G." surname="Fioccola" fullname="Giuseppe Fioccola">
      <organization>Huawei</organization>
      <address>
        <email>giuseppe.fioccola@huawei.com</email>
      </address>
    </author>
    <date year="2026" month="February" day="24"/>
    <area>Operations and Management</area>
    <workgroup>Benchmarking Methodology Working Group</workgroup>
    <keyword>benchmarking</keyword>
    <keyword>AI</keyword>
    <keyword>inference</keyword>
    <keyword>LLM</keyword>
    <keyword>network fabric</keyword>
    <keyword>RDMA</keyword>
    <keyword>KV cache</keyword>
    <keyword>MoE</keyword>
    <keyword>disaggregated serving</keyword>
    <abstract>
      <?line 105?>

<t>This document defines benchmarking terminology, methodologies, and Key
Performance Indicators (KPIs) for evaluating Ethernet-based AI inference
serving network fabrics. As Large Language Model (LLM) inference deployments
scale to disaggregated prefill/decode architectures spanning hundreds or
thousands of accelerators (GPUs/XPUs), the interconnect fabric becomes the
critical bottleneck determining Time to First Token (TTFT), Inter-Token
Latency (ITL), and aggregate throughput in tokens per second (TPS). This
document establishes vendor-independent, reproducible test procedures for
benchmarking fabric-level performance under realistic AI inference workloads.</t>
      <t>Coverage includes RDMA-based KV cache transfer between disaggregated prefill
and decode workers, Mixture-of-Experts (MoE) expert parallelism AllToAll
communication, request routing and load balancing for inference serving,
congestion management under bursty inference traffic patterns, and scale/soak
testing. The methodology enables direct, equivalent comparison across
implementations, NIC transport stacks (RoCEv2, UET), and fabric architectures.</t>
      <t>This document is a companion to <xref target="TRAINING-BENCH"/>, which addresses training
workloads.</t>
    </abstract>
    <note removeInRFC="true">
      <name>About This Document</name>
      <t>
        The latest revision of this draft can be found at <eref target="https://fcalabri.github.io/bmwg-ai-fabric-inference-bench/draft-calabria-bmwg-ai-fabric-inference-bench.html"/>.
        Status information for this document may be found at <eref target="https://datatracker.ietf.org/doc/draft-calabria-bmwg-ai-fabric-inference-bench/"/>.
      </t>
      <t>
        Discussion of this document takes place on the
        BMWG Working Group mailing list (<eref target="mailto:bmwg@ietf.org"/>),
        which is archived at <eref target="https://mailarchive.ietf.org/arch/browse/bmwg/"/>.
        Subscribe at <eref target="https://www.ietf.org/mailman/listinfo/bmwg/"/>.
      </t>
      <t>Source for this draft and an issue tracker can be found at
        <eref target="https://github.com/fcalabri/bmwg-ai-fabric-inference-bench"/>.</t>
    </note>
  </front>
  <middle>
    <?line 127?>

<section anchor="introduction">
      <name>Introduction</name>
      <t>Large Language Model (LLM) inference serving has emerged as a dominant consumer
of datacenter network capacity, with fundamentally different fabric requirements
compared to training workloads. While training workloads are characterized by
bulk synchronous collective operations (AllReduce, AllGather) with predictable
periodicity, inference workloads exhibit bursty, latency-sensitive
request/response patterns with strict Service Level Objectives (SLOs) on
per-token latency and time-to-first-token.</t>
      <t>The advent of disaggregated serving architectures, where the computationally
intensive prefill phase (prompt processing) is physically separated from the
memory-bound decode phase (token generation), introduces a new class of
fabric-critical data movement: KV cache transfer. A single large prompt
processed by a typical large-scale model generates multiple gigabytes of KV
cache state that must be transferred from prefill workers to decode workers
within a fraction of the target TTFT SLO.</t>
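<t>As a non-normative arithmetic illustration, the per-request fabric bandwidth
implied by a KV cache size and a transfer-time budget can be computed directly.
The numbers below reuse the approximately 1.34 GB KV cache example and the
300 ms fabric-budget reference value that appear elsewhere in this document;
they are illustrative, not requirements:</t>
<sourcecode type="python"><![CDATA[
def required_gbps(kv_cache_bytes: float, budget_ms: float) -> float:
    """Fabric bandwidth (Gbit/s) needed to move one request's KV cache
    within the given transfer-time budget."""
    return kv_cache_bytes * 8 / (budget_ms / 1000) / 1e9

# Illustrative: a ~1.34 GB KV cache (70B-class model, FP16, 4K context)
# moved within a 300 ms budget requires roughly 35.7 Gbit/s per request.
print(round(required_gbps(1.34e9, 300), 1))
]]></sourcecode>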
      <t>As clusters scale with thousands of concurrent requests, this creates sustained
multi-terabyte-per-second aggregate transfer demands on the fabric.
Simultaneously, Mixture-of-Experts (MoE) architectures introduce expert
parallelism (EP), which distributes expert sub-networks across GPUs and requires
AllToAll communication for token-to-expert routing. Wide EP configurations
(e.g., 96-way EP across 12 nodes of 8 GPUs each) generate fine-grained,
latency-sensitive inter-node traffic that contends with KV cache transfers on
shared fabric links.</t>
      <t>This document defines vendor-independent benchmarking methodologies for
evaluating how well a network fabric supports these inference-specific traffic
patterns. All tests are designed for controlled laboratory environments using
either hardware traffic generators or software workload emulators capable of
reproducing inference serving traffic profiles.</t>
      <section anchor="requirements-language">
        <name>Requirements Language</name>
        <t>The key words "<bcp14>MUST</bcp14>", "<bcp14>MUST NOT</bcp14>", "<bcp14>REQUIRED</bcp14>", "<bcp14>SHALL</bcp14>", "<bcp14>SHALL
NOT</bcp14>", "<bcp14>SHOULD</bcp14>", "<bcp14>SHOULD NOT</bcp14>", "<bcp14>RECOMMENDED</bcp14>", "<bcp14>NOT RECOMMENDED</bcp14>",
"<bcp14>MAY</bcp14>", and "<bcp14>OPTIONAL</bcp14>" in this document are to be interpreted as
described in BCP 14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only when, they
appear in all capitals, as shown here.</t>
        <?line -18?>

</section>
      <section anchor="scope-and-applicability">
        <name>Scope and Applicability</name>
        <t>The scope covers Layer 2/3 fabric performance (switch forwarding, link utilization,
 congestion management), RDMA transport performance (one-sided PUT/GET operations
 for KV cache transfer, two-sided SEND/RECV for expert parallelism dispatch), and
the interaction between fabric behavior and application-level inference metrics
 (TTFT, ITL, TPS).</t>
        <t>The DUT boundary for all measurements in this document is defined as the NIC-to-NIC
Ethernet fabric segment — specifically, the path from the point of packet transmission
 by the source NIC Ethernet port to the point of packet reception at the destination NIC
 Ethernet port.</t>
        <t>Intra-node transfer segments (NVLink GPU-to-GPU, PCIe/CXL GPU-to-NIC) are
explicitly OUT OF SCOPE as primary benchmarked entities. Where intra-node
transfer contributes measurably to an end-to-end latency measurement (e.g., the
TTFT decomposition in Section 6.1), implementers <bcp14>MUST</bcp14> report
intra-node transfer time as a separately labelled component so that the fabric
contribution can be isolated. See Section 3.2 for the DUT boundary diagram.</t>
        <t>The document does NOT address benchmarking of individual accelerator
(GPU/XPU) compute performance, model accuracy or quality metrics, or
benchmarking of the inference serving software stack in isolation from the
fabric.</t>
        <t>All methodologies assume controlled laboratory conditions per BMWG convention.</t>
      </section>
      <section anchor="relationship-to-existing-bmwg-work">
        <name>Relationship to Existing BMWG Work</name>
        <t>This document builds upon the foundational BMWG benchmarking framework
established by <xref target="RFC1242"/>, <xref target="RFC2544"/>, <xref target="RFC2889"/>, and <xref target="RFC6349"/>.</t>
        <t>The test structure follows RFC 2544 conventions for trial duration (minimum 60
seconds), statistical repetition (minimum 20 trials for latency, 50 for burst),
and reporting format (graphical and tabular).</t>
        <t>The methodologies extend RFC 2544 Section 26 benchmarks (throughput, latency,
frame loss rate, back-to-back frames, system recovery, reset) to
inference-specific scenarios including KV cache transfer, expert parallelism
dispatch, and disaggregated serving request routing.</t>
      </section>
      <section anchor="relationship-to-companion-documents">
        <name>Relationship to Companion Documents</name>
        <t>This document is a companion to <xref target="TRAINING-BENCH"/>, which defines benchmarking
methodologies for AI training network fabrics. Both documents share common
terminology (Section 2), test topology conventions (Section 3), and reporting
formats (Section 14). Both documents use the terminology defined in
<xref target="TERMINOLOGY"/>, which provides the common terminology base for AI fabric
benchmarking.</t>
        <t>Where training workloads are dominated by bulk synchronous collective
communication (AllReduce, AllGather) with high bandwidth utilization and
periodic synchronization barriers, inference workloads are dominated by bursty,
latency-sensitive point-to-point transfers (KV cache) and fine-grained AllToAll
dispatch (MoE expert parallelism). Implementers deploying converged fabrics that
serve both training and inference workloads <bcp14>SHOULD</bcp14> run both test suites.</t>
      </section>
    </section>
    <section anchor="terminology-and-definitions">
      <name>Terminology and Definitions</name>
      <t>The following terms are used throughout this document. Terms defined in the
companion training document are referenced but not redefined unless the inference
context introduces substantive differences.</t>
      <dl>
        <dt>TTFT:</dt>
        <dd>
          <t>Time to First Token. The elapsed time from receipt of an inference request by
the serving system to emission of the first output token. Includes prompt
processing (prefill), KV cache generation, optional KV cache transfer (in
disaggregated architectures), and initial decode step. Target: &lt; 500 ms for
interactive serving.</t>
        </dd>
        <dt>ITL:</dt>
        <dd>
          <t>Inter-Token Latency. The elapsed time between successive output tokens during
the autoregressive decode phase. Measured at P50, P95, P99, and P99.9
percentiles. Target: &lt; 50 ms P99 for interactive serving.</t>
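          <t>As a non-normative sketch, ITL percentiles can be reduced from a
request's token emission timestamps as follows; the nearest-rank method shown
is one common convention and is not mandated by this document:</t>
          <sourcecode type="python"><![CDATA[
import math

def itl_percentiles(token_ts_ms, pcts=(50, 95, 99, 99.9)):
    """Inter-token latencies (ms) from one request's token emission
    timestamps, reduced to the requested percentiles (nearest-rank)."""
    gaps = sorted(b - a for a, b in zip(token_ts_ms, token_ts_ms[1:]))
    n = len(gaps)
    return {p: gaps[min(n - 1, math.ceil(p / 100 * n) - 1)] for p in pcts}

# Example: four tokens emitted at 0, 10, 20, 35 ms -> gaps of 10, 10, 15 ms
print(itl_percentiles([0, 10, 20, 35]))
]]></sourcecode>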
        </dd>
        <dt>TPS:</dt>
        <dd>
          <t>Tokens Per Second. Aggregate throughput of the serving system measured as the
total number of output tokens generated per second across all concurrent
requests. Reported separately for input (prefill) TPS and output (decode) TPS.</t>
        </dd>
        <dt>KV Cache:</dt>
        <dd>
          <t>Key-Value Cache. The intermediate attention state (key and value projection
matrices) computed during the prefill phase and reused during each decode step.
Size scales with model dimension, number of layers, number of attention heads,
sequence length, and numerical precision. For a 70B parameter model at FP16
with 4K context: approximately 1.34 GB per request.</t>
        </dd>
        <dt>Prefill Phase:</dt>
        <dd>
          <t>The compute-bound phase of inference in which the entire input prompt is
processed in parallel to generate the KV cache and the first output token.
Characterized by high arithmetic intensity (200-400 ops/byte), high GPU
utilization (90-95%), and large activation tensors.</t>
        </dd>
        <dt>Decode Phase:</dt>
        <dd>
          <t>The memory-bound phase of inference in which output tokens are generated
autoregressively, one token per forward pass. Characterized by low arithmetic
intensity (60-80 ops/byte), lower GPU utilization (20-40%), and
memory-bandwidth-limited KV cache reads.</t>
        </dd>
        <dt>Disaggregated Serving:</dt>
        <dd>
          <t>An inference serving architecture in which prefill and decode computations are
executed on physically separate groups of accelerators (workers), connected by
a network fabric. The KV cache generated by prefill workers is transferred
over the fabric to decode workers.</t>
        </dd>
        <dt>xPyD Ratio:</dt>
        <dd>
          <t>The allocation ratio of prefill (x) to decode (y) resources in a disaggregated
serving cluster. For example, 3P9D indicates 3 prefill nodes and 9 decode
nodes. The optimal ratio depends on model size, prompt length distribution,
output length distribution, and SLO targets.</t>
        </dd>
        <dt>EP:</dt>
        <dd>
          <t>Expert Parallelism. A parallelism strategy for Mixture-of-Experts (MoE) models
in which expert sub-networks are distributed across multiple GPUs. Token
routing to the appropriate experts requires AllToAll communication.</t>
        </dd>
        <dt>Wide EP:</dt>
        <dd>
          <t>Expert Parallelism spanning many GPUs (e.g., 96-way EP across 12 nodes),
requiring inter-node AllToAll communication for every MoE layer forward pass.</t>
        </dd>
        <dt>DP Attention:</dt>
        <dd>
          <t>Data Parallelism applied to the attention computation, where the KV cache is
partitioned across data-parallel ranks. Each rank holds 1/DP_SIZE of the KV
cache, and AllToAll communication is used to exchange attention outputs.</t>
        </dd>
        <dt>MoE:</dt>
        <dd>
          <t>Mixture of Experts. A model architecture that activates only a subset of
expert sub-networks for each token, enabling larger model capacity with
sub-linear compute scaling.</t>
        </dd>
        <dt>Normal Dispatch:</dt>
        <dd>
          <t>A communication mode for AllToAll MoE dispatch optimized for the prefill phase.
Maximizes throughput for long input sequences but generates dynamic (symbolic)
shapes incompatible with CUDA Graph.</t>
        </dd>
        <dt>Low-Latency Dispatch:</dt>
        <dd>
          <t>A communication mode for AllToAll MoE dispatch optimized for the decode phase.
Uses fixed input shapes compatible with CUDA Graph, reducing kernel launch
overhead at the cost of slightly lower peak throughput.</t>
        </dd>
        <dt>RDMA:</dt>
        <dd>
          <t>Remote Direct Memory Access. A transport mechanism enabling direct
memory-to-memory data transfer between hosts without CPU involvement.
Implementations include InfiniBand Verbs and RoCEv2 (RDMA over Converged
Ethernet v2).</t>
        </dd>
        <dt>RoCEv2:</dt>
        <dd>
          <t>RDMA over Converged Ethernet version 2. An RDMA transport that encapsulates
InfiniBand transport over UDP/IP, enabling RDMA semantics on standard Ethernet
fabrics.</t>
        </dd>
        <dt>UET:</dt>
        <dd>
          <t>Ultra Ethernet Transport. A transport protocol defined by the Ultra Ethernet
Consortium (UEC) Specification 1.0, offering ordered/unordered reliable
delivery, multipath packet spraying, and integrated congestion control for
AI/HPC workloads.</t>
        </dd>
        <dt>KVCXL:</dt>
        <dd>
          <t>KV Cache Transfer Library. A library providing standardized point-to-point
data transfer primitives (register, transfer, notify) for inference engines,
abstracting underlying transports (intra-node interconnect, RDMA, PCIe, and
storage interfaces). Multiple open-source and vendor implementations exist.</t>
        </dd>
        <dt>GIN:</dt>
        <dd>
          <t>GPU-Initiated Networking. A communication paradigm where GPU threads directly
initiate network operations (RDMA sends, one-sided puts) without CPU
involvement, reducing latency by eliminating CPU-GPU synchronization.</t>
        </dd>
        <dt>PagedAttention:</dt>
        <dd>
          <t>A memory management technique for KV caches that stores attention keys and
values in fixed-size pages (typically 16-64 KB), enabling non-contiguous
allocation and reducing memory fragmentation.</t>
        </dd>
        <dt>Continuous Batching:</dt>
        <dd>
          <t>A scheduling technique that dynamically adds new requests to an active
inference batch as decode slots become available, improving GPU utilization
compared to static batching.</t>
        </dd>
        <dt>Prefix Caching:</dt>
        <dd>
          <t>Reuse of previously computed KV cache segments for prompts that share a common
prefix (e.g., system prompt), avoiding redundant prefill computation.</t>
        </dd>
        <dt>DUT:</dt>
        <dd>
          <t>Device Under Test. In this document, the DUT is one or more network fabric
elements (switches, NICs, or the complete fabric) whose performance impact on
inference serving is being characterized.</t>
        </dd>
        <dt>SUT:</dt>
        <dd>
          <t>System Under Test. The complete inference serving system including
accelerators, NICs, fabric, and serving software, when end-to-end metrics are
being measured.</t>
        </dd>
        <dt>RT:</dt>
        <dd>
          <t>Router Tester / Traffic Generator. Test equipment capable of generating and
receiving network traffic at specified rates with timestamping accuracy
sufficient for the measurements defined herein.</t>
        </dd>
        <dt>S_KV:</dt>
        <dd>
          <t>KV Cache Transfer Size. The total size in bytes of the KV cache state
generated by a single inference request, across all transformer layers and all
context tokens, computed as:</t>
          <artwork><![CDATA[
  S_KV = 2 x L x H_kv x D x C x P_bytes

  Where:

  L        = number of transformer layers

  H_kv     = number of KV attention heads per layer
               (H_kv <= H_total for GQA/MQA; see note below)

  D        = per-head key/value dimension (head_dim)
               Typically: head_dim = model_dim / H_total

  C        = context length in tokens (prompt tokens + generated tokens)

  P_bytes  = precision in bytes per element
               (FP16 / BF16 = 2,  FP8 / INT8 = 1)

  Factor 2 = accounts for both the K (key) and V (value) tensors,
             each of shape [H_kv, D] per layer per token

  Attention variant mapping for  H_kv:

  MHA (Multi-Head Attention):    H_kv = H_total

  GQA (Grouped-Query Attention): H_kv = H_total / GQA_ratio
                                 (e.g., H_total=64, GQA_ratio=8 -> H_kv=8)

  MQA (Multi-Query Attention):   H_kv = 1
]]></artwork>
          <t>This formula yields the total KV cache bytes for one complete
inference request.  The per-layer, per-token contribution is:</t>
          <t>s_kv_unit = 2 x H_kv x D x P_bytes (bytes per layer per token)</t>
          <t>and S_KV = s_kv_unit x L x C.</t>
          <t>Assumption: all layers share identical H_kv and D values.  Hybrid
architectures (e.g., sliding-window + full-attention layers) <bcp14>MUST</bcp14>
substitute per-layer values and sum across layers.</t>
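          <t>As a non-normative check of the formula above, the following script
reproduces the approximately 1.34 GB figure quoted under "KV Cache" in this
section. The parameters are illustrative values for a 70B-class GQA model
(80 layers, H_total=64 with GQA ratio 8, head_dim 128, FP16, 4K context) and
are assumptions, not normative:</t>
          <sourcecode type="python"><![CDATA[
def s_kv_bytes(L, H_kv, D, C, P_bytes):
    """Total KV cache bytes for one request:
    S_KV = 2 x L x H_kv x D x C x P_bytes."""
    return 2 * L * H_kv * D * C * P_bytes

# Illustrative 70B-class GQA configuration (assumed, not normative).
total = s_kv_bytes(L=80, H_kv=8, D=128, C=4096, P_bytes=2)
print(total, round(total / 2**30, 2))  # 1342177280 bytes (1.25 GiB, ~1.34 GB)
]]></sourcecode>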
        </dd>
      </dl>
    </section>
    <section anchor="test-topology-and-architecture">
      <name>Test Topology and Architecture</name>
      <section anchor="reference-fabric-topologies">
        <name>Reference Fabric Topologies</name>
        <t>The reference topologies from the companion training document (2-Tier Clos,
3-Tier Clos, Rail-Optimized) remain applicable. Inference serving introduces
additional topology considerations related to disaggregated prefill/decode
placement and MoE expert distribution.</t>
        <section anchor="topology-a-2-tier-clos-leaf-spine">
          <name>Topology A: 2-Tier Clos (Leaf-Spine)</name>
          <t>Applicable to inference clusters up to approximately 2,048 accelerators. Prefill
and decode worker groups <bcp14>SHOULD</bcp14> be placed on separate leaf switches (or separate
leaf switch groups) to isolate KV cache transfer traffic from decode-to-client
response traffic. Expert parallelism (EP) traffic within a single MoE dispatch
group <bcp14>SHOULD</bcp14> be confined to a single leaf switch or a minimal number of leaf
switches to minimize spine-hop latency.</t>
        </section>
        <section anchor="topology-b-3-tier-clos-leaf-spine-superspine">
          <name>Topology B: 3-Tier Clos (Leaf-Spine-Superspine)</name>
          <t>Required for inference clusters exceeding 2,048 accelerators or for multi-model
serving deployments where different model instances occupy different fabric pods.
KV cache transfer traffic between prefill and decode workers in different pods
traverses the superspine tier, making superspine bandwidth and latency critical.</t>
        </section>
        <section anchor="topology-c-disaggregated-prefilldecode-placement">
          <name>Topology C: Disaggregated Prefill/Decode Placement</name>
          <t>A topology variant specific to inference serving in which prefill workers and
decode workers are placed in distinct physical locations within the fabric,
connected by a dedicated KV cache transfer network segment. This topology enables
independent scaling of prefill and decode resources and allows heterogeneous
hardware (e.g., high-compute GPUs for prefill, high-memory-bandwidth GPUs for
decode).</t>
          <figure anchor="fig-pd-topology">
            <name>Disaggregated Prefill/Decode Inference Topology</name>
            <artwork><![CDATA[
          +----------------------+
          |   Request Router     |
          |   (KV-Aware LB)      |
          +--------+-------------+
                   |
      +------------+--------------+
      |                           |
+-----v-------+         +---------v-----+
| Prefill Pool|         |  Decode Pool  |
| (xP workers)|         |  (yD workers) |
| High Compute|         | High Mem BW   |
| TP=8, DP=N/8|         | TP=8, DP=M/8  |
+------+------+         +-------+-------+
       |                        |
       |  KV Cache RDMA Transfer|
       | (One-sided PUT/Signal) |
       +------------------------+
]]></artwork>
          </figure>
        </section>
      </section>
      <section anchor="disaggregated-prefilldecode-topology">
        <name>Disaggregated Prefill/Decode Topology</name>
        <t>The disaggregated topology separates the inference pipeline into physically
distinct pools connected by the fabric. The test topology <bcp14>MUST</bcp14> include the
following components:</t>
        <ul spacing="normal">
          <li>
            <t><strong>Prefill Worker Pool:</strong> N Prefill nodes, each containing G accelerators with
high-compute capability. These workers execute the prefill phase and generate
KV cache state. Tensor Parallelism (TP) is applied within each node; Data
Parallelism (DP) is applied across nodes. Each prefill worker communicates
with one or more decode workers via RDMA-based KV cache transfer.</t>
          </li>
          <li>
            <t><strong>Decode Worker Pool:</strong> M Decode nodes, each containing G accelerators with high
memory bandwidth. These workers receive KV cache state from prefill workers and
execute the autoregressive decode phase. DP Attention may partition the KV
cache across DP ranks within the decode pool, requiring AllToAll communication
during decode.</t>
          </li>
          <li>
            <t><strong>KV Cache Transfer Network:</strong> The Ethernet fabric segment connecting prefill and decode worker pools. This segment carries one-sided RDMA PUT operations (or PUT-with-signal) transferring KV cache blocks from prefill GPU memory to decode GPU memory via RDMA over Converged Ethernet (RoCEv2) or Ultra Ethernet Transport (UET).  </t>
          </li>
        </ul>
        <t>The end-to-end transfer from GPU memory to remote GPU memory traverses three segments:
(1) GPU-to-NIC: PCIe/CXL (intra-node, out of scope as DUT);
(2) NIC-to-NIC: Ethernet fabric (the DUT, in scope);
(3) NIC-to-GPU: PCIe/CXL at destination (intra-node, out of scope as DUT).
Benchmarking procedures in Sections 5 and 6 measure fabric-segment latency and
throughput exclusively.  When end-to-end measurements are reported (e.g., TTFT
decomposition), the intra-node segments <bcp14>MUST</bcp14> be labelled separately.</t>
        <artwork><![CDATA[
GPU Memory --> [PCIe/CXL] --> NIC --> [ETHERNET FABRIC] --> NIC --> [PCIe/CXL] --> GPU Memory
<---intra-node (out of scope)--->|<------DUT (in scope)------->|<---intra-node (out of scope)--->
]]></artwork>
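        <t>As a non-normative reporting sketch (the function and field names below
are illustrative, not defined by this document), the separately labelled
three-segment decomposition can be expressed as:</t>
        <sourcecode type="python"><![CDATA[
def decompose_transfer_ms(e2e_ms, src_intra_ms, dst_intra_ms):
    """Split a GPU-to-GPU transfer time into the three segments above.
    The fabric (DUT) component is the residual after removing both
    intra-node PCIe/CXL segments, each reported as its own labelled value."""
    fabric_ms = e2e_ms - src_intra_ms - dst_intra_ms
    return {
        "gpu_to_nic_ms": src_intra_ms,  # intra-node, out of scope as DUT
        "fabric_ms": fabric_ms,         # NIC-to-NIC, the DUT
        "nic_to_gpu_ms": dst_intra_ms,  # intra-node, out of scope as DUT
    }

# Illustrative timings only (ms).
print(decompose_transfer_ms(12.0, 1.5, 1.5))
]]></sourcecode>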
        <ul spacing="normal">
          <li>
            <t><strong>Request Router:</strong> A network-layer or application-layer load balancer that
assigns incoming inference requests to prefill workers and subsequently routes
KV cache to the appropriate decode workers. KV-aware routing and prefix-aware
caching policies are under test.</t>
          </li>
        </ul>
      </section>
      <section anchor="dut-id">
        <name>Device Under Test (DUT) Identification</name>
        <t>The following table defines the DUT configurations tested in this document:</t>
        <table anchor="tab-dut">
          <name>DUT Configuration Definitions</name>
          <thead>
            <tr>
              <th align="left">DUT ID</th>
              <th align="left">Description</th>
              <th align="left">Components Under Test</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="left">DUT-S</td>
              <td align="left">Single Switch</td>
              <td align="left">Individual leaf or spine switch forwarding inference traffic. Measures per-hop latency, buffer absorption, ECN marking accuracy.</td>
            </tr>
            <tr>
              <td align="left">DUT-F</td>
              <td align="left">Complete Fabric</td>
              <td align="left">End-to-end fabric from prefill NIC egress to decode NIC ingress. Measures fabric-level KV cache transfer latency, throughput, and congestion behavior.</td>
            </tr>
            <tr>
              <td align="left">DUT-N</td>
              <td align="left">NIC Transport</td>
              <td align="left">NIC RDMA transport stack processing KV cache transfer operations. Measures RDMA verb completion latency, one-sided PUT bandwidth, QP scaling.</td>
            </tr>
            <tr>
              <td align="left">DUT-PD</td>
              <td align="left">Prefill-Decode Path</td>
              <td align="left">The complete data path from prefill GPU memory through NIC, fabric, NIC, to decode GPU memory. Measures end-to-end KV cache transfer including NVLink, PCIe, and fabric segments.</td>
            </tr>
            <tr>
              <td align="left">SUT-E</td>
              <td align="left">End-to-End System</td>
              <td align="left">Complete inference serving system including inference serving software, RDMA transfer libraries, fabric, and accelerators. Measures TTFT, ITL, TPS as functions of fabric performance.</td>
            </tr>
          </tbody>
        </table>
      </section>
      <section anchor="traffic-generator-and-workload-emulator-requirements">
        <name>Traffic Generator and Workload Emulator Requirements</name>
        <t>Tests in this document require one or both of the following traffic generation
modes. The mode used <bcp14>MUST</bcp14> be documented in all test reports.</t>
        <section anchor="hardware-traffic-generator-rt-minimum-requirements">
          <name>Hardware Traffic Generator (RT) - Minimum Requirements</name>
          <t>The hardware traffic generator <bcp14>MUST</bcp14> satisfy all of the following:</t>
          <ul spacing="normal">
            <li>
              <t>RDMA traffic generation supporting RoCEv2 and, where tested, UET transport;
configurable RDMA verb types (one-sided PUT, PUT-with-signal, two-sided
SEND/RECV).</t>
            </li>
            <li>
              <t>Configurable message sizes from 4 KB (minimum KV cache page) to 256 MB
(large KV cache block).</t>
            </li>
            <li>
              <t>Configurable QP counts from 1 QP to a minimum of 256 QPs per
source-destination port pair.</t>
            </li>
          </ul>
        </section>
        <section anchor="software-workload-emulator-we-minimum-requirements">
          <name>Software Workload Emulator (WE) - Minimum Requirements</name>
          <t>A software workload emulator runs on actual accelerators and generates realistic
inference workloads. The WE <bcp14>MUST</bcp14> support all of the following:</t>
          <ul spacing="normal">
            <li>
              <t>Configurable prompt length distributions: uniform, Zipf, and trace-replay
modes.</t>
            </li>
            <li>
              <t>Configurable output length distributions and configurable request arrival
rates: Poisson, bursty, and trace-replay.</t>
            </li>
            <li>
              <t>Disaggregated prefill/decode execution with actual RDMA-based KV cache
transfer between prefill and decode worker pools.</t>
            </li>
            <li>
              <t>MoE expert parallelism with actual AllToAll dispatch where MoE-specific tests
(<xref target="test-cat3"/>) are performed.</t>
            </li>
            <li>
              <t>Measurement instrumentation providing per-request TTFT and ITL with timestamp
accuracy &lt;= 1 millisecond.</t>
            </li>
          </ul>
          <t>When a software workload emulator is used, the complete software configuration
<bcp14>MUST</bcp14> be documented per <xref target="dut-id"/>, as framework version, RDMA library version,
and GPU driver version materially affect results.</t>
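          <t>As an illustrative sketch only (function names and distribution
parameters below are assumptions, not requirements of this document), a
workload emulator front end can generate the Poisson arrival process and a
Zipf-distributed prompt length mix called for above:</t>
          <sourcecode type="python"><![CDATA[
import random

def gen_requests(n, rate_per_s, zipf_s=1.2, max_prompt=8192, seed=7):
    """Generate (arrival_time_s, prompt_len_tokens) pairs with Poisson
    arrivals (exponential inter-arrival gaps) and a truncated Zipf-like
    prompt length distribution."""
    rng = random.Random(seed)
    # Zipf-like weights over prompt lengths 1..max_prompt.
    weights = [1.0 / (k ** zipf_s) for k in range(1, max_prompt + 1)]
    lengths = rng.choices(range(1, max_prompt + 1), weights=weights, k=n)
    t, out = 0.0, []
    for length in lengths:
        t += rng.expovariate(rate_per_s)  # Poisson process inter-arrival gap
        out.append((t, length))
    return out

# Example: 1000 requests at a mean offered rate of 50 requests/s.
reqs = gen_requests(n=1000, rate_per_s=50.0)
]]></sourcecode>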
        </section>
      </section>
    </section>
    <section anchor="kpi-framework">
      <name>KPI Framework and Metrics Taxonomy</name>
      <t>This section defines the Key Performance Indicators measured across all test
categories. KPIs are organized into four tiers: Primary Latency KPIs
(end-user-facing response time metrics), Primary Throughput KPIs (system-level
capacity metrics), Fabric-Level KPIs (network-specific measurements), and Fabric
Health Indicators (operational monitoring metrics).</t>
      <ul empty="true">
        <li>
          <t>NOTE: Where numerical reference values appear in the Target column of the KPI
tables below (including TTFT, ITL, and other latency targets), these values are
non-normative informational reference points reflecting current industry
observations for interactive inference workloads as of 2025-2026. They do NOT
constitute benchmarking acceptance criteria or performance requirements. Per the
BMWG charter, the definition of acceptance criteria or performance requirements
is explicitly outside the scope of this Working Group. Implementers <bcp14>MAY</bcp14> use
these values as contextual references when interpreting results; they <bcp14>MUST NOT</bcp14>
be used as pass/fail thresholds in vendor evaluations. Deployment-specific SLOs
will vary by application, model architecture, and operator requirements.</t>
        </li>
      </ul>
      <section anchor="primary-latency-kpis">
        <name>Primary Latency KPIs</name>
        <table anchor="tab-latency-kpis">
          <name>Primary Latency KPIs</name>
          <thead>
            <tr>
              <th align="left">KPI</th>
              <th align="left">Unit</th>
              <th align="left">Definition</th>
              <th align="left">Target (Interactive)</th>
              <th align="left">Measurement Point</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="left">TTFT</td>
              <td align="left">ms</td>
              <td align="left">Time from request arrival to first output token emission</td>
              <td align="left">&lt; 500 ms P99</td>
              <td align="left">SUT-E request/response boundary</td>
            </tr>
            <tr>
              <td align="left">ITL</td>
              <td align="left">ms</td>
              <td align="left">Time between successive output tokens</td>
              <td align="left">&lt; 50 ms P99</td>
              <td align="left">SUT-E token emission timestamps</td>
            </tr>
            <tr>
              <td align="left">TTFT_fabric</td>
              <td align="left">ms</td>
              <td align="left">Fabric contribution to TTFT (KV cache transfer latency)</td>
              <td align="left">&lt; 300 ms P99</td>
              <td align="left">DUT-PD NIC-to-NIC measurement</td>
            </tr>
            <tr>
              <td align="left">ITL_fabric</td>
              <td align="left">ms</td>
              <td align="left">Fabric contribution to ITL (EP dispatch latency per decode step)</td>
              <td align="left">&lt; 5 ms P99</td>
              <td align="left">DUT-F EP dispatch round-trip</td>
            </tr>
            <tr>
              <td align="left">E2E_latency</td>
              <td align="left">ms</td>
              <td align="left">End-to-end request latency from arrival to completion of all output tokens</td>
              <td align="left">Varies by output length</td>
              <td align="left">SUT-E request/response boundary</td>
            </tr>
          </tbody>
        </table>
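        <t>As a non-normative illustration, the latency KPIs above can be derived from raw SUT-E timestamps roughly as follows (function names and the nearest-rank percentile choice are illustrative assumptions, not part of this methodology):</t>
        <sourcecode type="python"><![CDATA[
```python
import math

def latency_kpis(arrival_ms, token_emit_ms):
    """Derive TTFT, per-token ITL samples, and E2E_latency for one request.

    arrival_ms: request arrival timestamp (ms); token_emit_ms: sorted
    emission timestamps of each output token (ms), taken at the SUT-E
    request/response boundary.
    """
    ttft = token_emit_ms[0] - arrival_ms
    itl = [b - a for a, b in zip(token_emit_ms, token_emit_ms[1:])]
    e2e = token_emit_ms[-1] - arrival_ms
    return ttft, itl, e2e

def percentile(samples, p):
    """Nearest-rank percentile; adequate for P50/P95/P99 reporting."""
    ranked = sorted(samples)
    rank = max(1, math.ceil(p / 100.0 * len(ranked)))
    return ranked[rank - 1]
```
]]></sourcecode>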
      </section>
      <section anchor="primary-throughput-kpis">
        <name>Primary Throughput KPIs</name>
        <table anchor="tab-throughput-kpis">
          <name>Primary Throughput KPIs</name>
          <thead>
            <tr>
              <th align="left">KPI</th>
              <th align="left">Unit</th>
              <th align="left">Definition</th>
              <th align="left">Measurement Point</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="left">TPS_input</td>
              <td align="left">tokens/s</td>
              <td align="left">Aggregate input (prefill) tokens processed per second across all workers</td>
              <td align="left">SUT-E prefill completion events</td>
            </tr>
            <tr>
              <td align="left">TPS_output</td>
              <td align="left">tokens/s</td>
              <td align="left">Aggregate output (decode) tokens generated per second across all workers</td>
              <td align="left">SUT-E token emission events</td>
            </tr>
            <tr>
              <td align="left">TPS_per_GPU</td>
              <td align="left">tokens/s/GPU</td>
              <td align="left">Output tokens per second normalized by number of decode GPUs</td>
              <td align="left">SUT-E per-worker counters</td>
            </tr>
            <tr>
              <td align="left">Goodput</td>
              <td align="left">GB/s or tokens/s</td>
              <td align="left">See the Goodput definition in <xref target="TERMINOLOGY"/><br/>Reports <bcp14>MUST</bcp14> use Inference_Goodput for token-rate measurements and Fabric_Goodput for byte-rate fabric measurements</td>
              <td align="left">SUT-E successful completion events</td>
            </tr>
            <tr>
              <td align="left">KV_BW</td>
              <td align="left">GB/s</td>
              <td align="left">Aggregate KV cache transfer bandwidth between prefill and decode pools</td>
              <td align="left">DUT-PD RDMA counters</td>
            </tr>
            <tr>
              <td align="left">Request_Rate</td>
              <td align="left">req/s</td>
              <td align="left">Maximum sustained request arrival rate meeting all latency SLOs</td>
              <td align="left">SUT-E admission control boundary</td>
            </tr>
          </tbody>
        </table>
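        <t>One way to operationalize the Inference_Goodput distinction in the table above is to count only output tokens from requests that met the latency SLOs; the sketch below assumes hypothetical per-request record fields, and its SLO defaults are the informational reference values of this document, not normative thresholds:</t>
        <sourcecode type="python"><![CDATA[
```python
def inference_goodput(requests, window_s, ttft_slo_ms=500.0, itl_slo_ms=50.0):
    """Output tokens per second, counting only requests that met both
    latency SLOs during the measurement window.

    The SLO defaults mirror the non-normative reference values in this
    document; they are parameters, not pass/fail criteria.
    """
    good_tokens = sum(r["output_tokens"] for r in requests
                      if r["ttft_ms"] <= ttft_slo_ms
                      and r["max_itl_ms"] <= itl_slo_ms)
    return good_tokens / window_s
```
]]></sourcecode>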
      </section>
      <section anchor="fabric-level-kpis">
        <name>Fabric-Level KPIs</name>
        <table anchor="tab-fabric-kpis">
          <name>Fabric-Level KPIs</name>
          <thead>
            <tr>
              <th align="left">KPI</th>
              <th align="left">Unit</th>
              <th align="left">Definition</th>
              <th align="left">DUT</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="left">KV_xfer_latency</td>
              <td align="left">us</td>
              <td align="left">One-sided RDMA PUT completion time for a single KV cache block transfer</td>
              <td align="left">DUT-N</td>
            </tr>
            <tr>
              <td align="left">KV_xfer_bandwidth</td>
              <td align="left">GB/s</td>
              <td align="left">Sustained unidirectional KV cache transfer throughput per NIC port</td>
              <td align="left">DUT-N</td>
            </tr>
            <tr>
              <td align="left">EP_alltoall_latency</td>
              <td align="left">us</td>
              <td align="left">Round-trip time for a complete MoE expert parallelism AllToAll dispatch</td>
              <td align="left">DUT-F</td>
            </tr>
            <tr>
              <td align="left">EP_alltoall_bandwidth</td>
              <td align="left">GB/s</td>
              <td align="left">Aggregate AllToAll bandwidth across all EP ranks during dispatch</td>
              <td align="left">DUT-F</td>
            </tr>
            <tr>
              <td align="left">Fabric_FCT</td>
              <td align="left">us</td>
              <td align="left">Flow completion time for a KV cache transfer flow through the fabric</td>
              <td align="left">DUT-F</td>
            </tr>
            <tr>
              <td align="left">Buffer_utilization</td>
              <td align="left">%</td>
              <td align="left">Peak switch buffer utilization during KV cache transfer bursts</td>
              <td align="left">DUT-S</td>
            </tr>
            <tr>
              <td align="left">ECN_marking_rate</td>
              <td align="left">%</td>
              <td align="left">Fraction of packets marked with ECN-CE during inference traffic</td>
              <td align="left">DUT-S</td>
            </tr>
            <tr>
              <td align="left">PFC_frame_count</td>
              <td align="left">frames</td>
              <td align="left">Number of PFC PAUSE frames generated per unit time</td>
              <td align="left">DUT-S</td>
            </tr>
            <tr>
              <td align="left">Link_utilization</td>
              <td align="left">%</td>
              <td align="left">Average and peak link utilization on fabric links carrying inference traffic</td>
              <td align="left">DUT-F</td>
            </tr>
            <tr>
              <td align="left">Packet_drop_rate</td>
              <td align="left">ppm</td>
              <td align="left">Packets dropped per million due to buffer overflow or transport error</td>
              <td align="left">DUT-F</td>
            </tr>
          </tbody>
        </table>
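        <t>As a non-normative illustration, the rate-style KPIs above can be derived from raw switch counters sampled over the trial window (the counter names below are hypothetical):</t>
        <sourcecode type="python"><![CDATA[
```python
def fabric_rate_kpis(counters):
    """Convert raw counters into ECN_marking_rate (%) and
    Packet_drop_rate (ppm, relative to offered packets)."""
    forwarded = counters["packets_forwarded"]
    dropped = counters["packets_dropped"]
    offered = forwarded + dropped
    return {
        "ecn_marking_rate_pct": 100.0 * counters["ecn_ce_marked"] / forwarded,
        "packet_drop_rate_ppm": 1e6 * dropped / offered,
    }
```
]]></sourcecode>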
      </section>
      <section anchor="fabric-health-indicators">
        <name>Fabric Health Indicators</name>
        <table anchor="tab-health">
          <name>Fabric Health Indicators</name>
          <thead>
            <tr>
              <th align="left">Indicator</th>
              <th align="left">Threshold</th>
              <th align="left">Description</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="left">CPU Utilization (switch)</td>
              <td align="left">&lt; 30%</td>
              <td align="left">Control plane CPU usage on switches under inference traffic load</td>
            </tr>
            <tr>
              <td align="left">Memory Usage (switch)</td>
              <td align="left">&lt; 70%</td>
              <td align="left">TCAM, buffer, and control plane memory usage</td>
            </tr>
            <tr>
              <td align="left">FEC Error Rate</td>
              <td align="left">&lt; 1e-12 post-FEC BER</td>
              <td align="left">Forward Error Correction effectiveness on fabric links</td>
            </tr>
            <tr>
              <td align="left">CRC Error Count</td>
              <td align="left">0</td>
              <td align="left">Layer 2 CRC errors on any fabric link</td>
            </tr>
            <tr>
              <td align="left">BGP/OSPF Stability</td>
              <td align="left">0 flaps</td>
              <td align="left">Routing protocol adjacency stability under inference load</td>
            </tr>
            <tr>
              <td align="left">NIC QP State</td>
              <td align="left">100% active</td>
              <td align="left">All RDMA Queue Pairs in active state (no error/reset)</td>
            </tr>
            <tr>
              <td align="left">GPU-NIC PCIe BW</td>
              <td align="left">&gt; 90% of theoretical</td>
              <td align="left">PCIe Gen5 x16 bandwidth utilization between GPU and NIC</td>
            </tr>
          </tbody>
        </table>
      </section>
    </section>
    <section anchor="test-category-1-rdma-kv-cache-transfer-benchmarks">
      <name>Test Category 1: RDMA KV Cache Transfer Benchmarks</name>
      <t>KV cache transfer between disaggregated prefill and decode workers is the
defining fabric workload for inference serving. Unlike training collectives
(AllReduce, AllGather) which are periodic and predictable, KV cache transfers
are event-driven (triggered by prefill completion) and bursty.</t>
      <section anchor="point-to-point-kv-cache-transfer-throughput">
        <name>Point-to-Point KV Cache Transfer Throughput</name>
        <t><strong>Objective:</strong> To determine the maximum sustained KV cache transfer throughput
between a single prefill worker NIC and a single decode worker NIC across the
DUT fabric.</t>
        <t><strong>Procedure:</strong> Configure a single RDMA connection (QP) between the prefill and
decode endpoints. Send a sequence of one-sided RDMA PUT operations with message
sizes corresponding to KV cache block sizes. The message size sequence <bcp14>MUST</bcp14>
include: 64 KB (single attention page), 256 KB, 1 MB, 4 MB, 16 MB, 64 MB,
256 MB (large prompt KV cache), and 1 GB. For each message size, transmit at
the maximum rate sustainable by the NIC for a minimum of 60 seconds per trial.
Repeat for 1, 4, 8, 16, 32, 64, and 128 concurrent QPs. The DUT is the fabric
path from NIC to NIC.</t>
        <t><strong>Measurement:</strong> Record throughput (GB/s), CPU utilization on both endpoints,
GPU memory-to-NIC transfer overhead, and NIC hardware offload utilization. The
test <bcp14>MUST</bcp14> be repeated a minimum of 20 times per configuration and the average
reported.</t>
        <t><strong>Reporting Format:</strong> Results <bcp14>SHOULD</bcp14> be reported as a multi-line graph with
message size (log scale) on the X axis and throughput (GB/s) on the Y axis.
Separate lines for each QP count. A reference line showing theoretical NIC line
rate <bcp14>MUST</bcp14> be included.</t>
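        <t>The message-size and QP-count sweep above can be expressed as an enumerated trial plan; the skeleton below is an illustrative harness sketch only (the actual RDMA data path is tool-specific and omitted):</t>
        <sourcecode type="python"><![CDATA[
```python
KIB, MIB, GIB = 2**10, 2**20, 2**30

# Message size sequence required by the procedure, in bytes.
MSG_SIZES = [64*KIB, 256*KIB, 1*MIB, 4*MIB, 16*MIB, 64*MIB, 256*MIB, 1*GIB]
QP_COUNTS = [1, 4, 8, 16, 32, 64, 128]
MIN_TRIALS = 20      # minimum repetitions per configuration
MIN_DURATION_S = 60  # minimum sustained-rate duration per trial

def trial_plan():
    """Enumerate every (message size, QP count, trial) combination."""
    return [{"msg_bytes": m, "qps": q, "trial": t, "duration_s": MIN_DURATION_S}
            for m in MSG_SIZES for q in QP_COUNTS for t in range(MIN_TRIALS)]
```
]]></sourcecode>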
      </section>
      <section anchor="kv-cache-transfer-latency">
        <name>KV Cache Transfer Latency</name>
        <t><strong>Objective:</strong> To determine the latency of individual KV cache block transfers
across the DUT fabric under varying load conditions.</t>
        <t><strong>Procedure:</strong> Using the same endpoint configuration as in
<xref target="point-to-point-kv-cache-transfer-throughput"/>, measure the
completion time of individual RDMA PUT operations. Latency is measured from the
initiation of the PUT verb on the prefill NIC to receipt of the completion
signal on the decode NIC (for PUT-with-signal) or to polling of the remote
completion queue. Measure latency under unloaded conditions (single outstanding
operation) and under loaded conditions (background traffic at 25%, 50%, 75%,
and 90% of fabric capacity). Message sizes <bcp14>MUST</bcp14> include 64 KB, 1 MB, 16 MB,
and 256 MB.</t>
        <t><strong>Measurement:</strong> Report latency at P50, P95, P99, and P99.9 percentiles. The
test <bcp14>MUST</bcp14> be repeated a minimum of 20 trials of at least 120 seconds each per
configuration. The difference between P99 and P50 (tail latency spread) <bcp14>SHOULD</bcp14>
be reported as a derived metric.</t>
        <t><strong>Reporting Format:</strong> Results <bcp14>SHOULD</bcp14> be reported as a table with columns for
message size, background load level, and latency at each percentile. A
complementary CDF plot of latency distribution for selected configurations
<bcp14>SHOULD</bcp14> be included.</t>
      </section>
      <section anchor="concurrent-kv-cache-transfer-scaling">
        <name>Concurrent KV Cache Transfer Scaling</name>
        <t><strong>Objective:</strong> To characterize how aggregate KV cache transfer performance
scales as the number of concurrent prefill-to-decode transfer pairs increases.</t>
        <t><strong>Procedure:</strong> Configure N concurrent prefill-decode endpoint pairs, where N
ranges from 1 to the maximum supported by the fabric (e.g., 1, 2, 4, 8, 16,
32, 64, 128 pairs). Each pair executes continuous KV cache transfers of 16 MB
messages (representative of a medium-length prompt). Measure aggregate
throughput and per-pair latency as N increases.</t>
        <t><strong>Measurement:</strong> Report aggregate throughput (GB/s), per-pair median latency
(us), per-pair P99 latency (us), Jain Fairness Index across pairs, and maximum
fabric link utilization observed. The test <bcp14>MUST</bcp14> be repeated a minimum of 20
times per value of N.</t>
        <t><strong>Reporting Format:</strong> Results <bcp14>SHOULD</bcp14> be reported as a dual-axis graph with N
(concurrent pairs) on the X axis, aggregate throughput on the left Y axis, and
P99 latency on the right Y axis. The JFI value for each N <bcp14>SHOULD</bcp14> be annotated.</t>
      </section>
      <section anchor="multi-tier-storage-transfer-characterization">
        <name>Multi-Tier Storage Transfer Characterization</name>
        <t><strong>Objective:</strong> To characterize KV cache transfer performance across the
memory/storage hierarchy: GPU HBM to GPU HBM (inter-node RDMA), GPU HBM to
remote CPU DRAM (offload), CPU DRAM to GPU HBM (reload), and GPU HBM to
NVMe/SSD (persistent cache).</t>
        <t><strong>Procedure:</strong> For each tier pair, measure unidirectional transfer throughput
and latency for message sizes of 1 MB, 16 MB, and 256 MB. Use zero-copy
transfers where supported (GDS for NVMe, GPUDirect RDMA for inter-node).</t>
        <t><strong>Measurement:</strong> Report throughput (GB/s) and latency (P50, P99) for each tier
pair and message size. Report the tier throughput ratio relative to GPU-to-GPU
RDMA as a derived metric.</t>
        <t><strong>Reporting Format:</strong> Results <bcp14>SHOULD</bcp14> be reported as a table with rows for each
tier pair and columns for throughput and latency at each message size.</t>
      </section>
    </section>
    <section anchor="test-category-2-prefilldecode-disaggregation-benchmarks">
      <name>Test Category 2: Prefill/Decode Disaggregation Benchmarks</name>
      <t>Disaggregated prefill/decode serving separates the two phases onto distinct
hardware pools to enable independent optimization and scaling. This section
benchmarks the fabric's ability to support the resulting KV cache transfer
traffic patterns and their impact on end-to-end inference metrics.</t>
      <section anchor="end-to-end-disaggregated-ttft">
        <name>End-to-End Disaggregated TTFT</name>
        <t><strong>Objective:</strong> To measure TTFT as a function of prompt length in a disaggregated
serving configuration, isolating the fabric contribution.</t>
        <t><strong>Procedure:</strong> Configure a disaggregated serving system (SUT-E) with a specified
xPyD ratio (e.g., 3P9D for a 12-node cluster). Submit inference requests with
prompt lengths of 128, 512, 1024, 2048, 4096, 8192, and 16384 tokens. For each
prompt length, measure the total TTFT and decompose it into: T_prefill (prefill
compute time), T_transfer (KV cache fabric transfer time, measured at DUT-PD),
and T_decode_init (first decode step time).</t>
        <t><strong>Measurement:</strong> Report TTFT (ms) and its decomposition at P50, P95, and P99
percentiles. The ratio T_transfer/TTFT (fabric fraction) <bcp14>SHOULD</bcp14> be reported as
a derived metric. The test <bcp14>MUST</bcp14> be repeated a minimum of 20 trials per prompt
length.</t>
        <t><strong>Reporting Format:</strong> Results <bcp14>SHOULD</bcp14> be reported as a stacked bar chart with
prompt length on the X axis and TTFT (ms) on the Y axis, with bars decomposed
into T_prefill, T_transfer, and T_decode_init. A table of numerical values <bcp14>MUST</bcp14>
accompany the chart.</t>
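        <t>Assuming the three components are measured independently per trial, the fabric-fraction derived metric follows directly; a minimal sketch:</t>
        <sourcecode type="python"><![CDATA[
```python
def decompose_ttft(t_prefill_ms, t_transfer_ms, t_decode_init_ms):
    """Return total TTFT and the fabric fraction T_transfer / TTFT for
    one disaggregated request, given its measured components (ms)."""
    ttft = t_prefill_ms + t_transfer_ms + t_decode_init_ms
    return ttft, t_transfer_ms / ttft
```
]]></sourcecode>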
      </section>
      <section anchor="xpyd-ratio-optimization">
        <name>xPyD Ratio Optimization</name>
        <t><strong>Objective:</strong> To determine the optimal prefill-to-decode resource ratio for a
given model, prompt distribution, and latency SLO, as limited by fabric transfer
capacity.</t>
        <t><strong>Procedure:</strong> For a fixed total number of nodes N (e.g., 12), iterate over
xPyD ratios: 1P11D, 2P10D, 3P9D, 4P8D, 6P6D, 8P4D, 10P2D, 11P1D. For each
ratio, submit a sustained request stream matching a target request rate with a
specified prompt length distribution (e.g., Zipf with alpha=1.0 over
[128, 8192] tokens). Measure TTFT P99, ITL P99, TPS_output, and Goodput for
each configuration.</t>
        <t><strong>Measurement:</strong> Report all four metrics for each xPyD ratio and request rate.
Identify the Pareto-optimal ratio(s) that maximize TPS_output while meeting
TTFT P99 &lt; 500 ms and ITL P99 &lt; 50 ms.</t>
        <t><strong>Reporting Format:</strong> Results <bcp14>SHOULD</bcp14> be reported as a multi-panel figure with
one panel per request rate, each showing xPyD ratio on the X axis and metrics
on dual Y axes (TTFT/ITL on left, TPS on right). The Pareto frontier <bcp14>SHOULD</bcp14> be
highlighted.</t>
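        <t>Identifying the best xPyD ratio reduces to filtering configurations against the stated SLO parameters and maximizing TPS_output; a minimal sketch over hypothetical per-configuration result records:</t>
        <sourcecode type="python"><![CDATA[
```python
def pareto_best(results, ttft_slo_ms=500.0, itl_slo_ms=50.0):
    """Return the SLO-compliant configuration with the highest TPS_output,
    or None if no configuration meets both SLOs.  Record fields are
    illustrative; SLO defaults are this test's parameters, not pass/fail
    criteria."""
    ok = [r for r in results
          if r["ttft_p99_ms"] < ttft_slo_ms and r["itl_p99_ms"] < itl_slo_ms]
    return max(ok, key=lambda r: r["tps_output"]) if ok else None
```
]]></sourcecode>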
      </section>
      <section anchor="heterogeneous-parallelism-configuration">
        <name>Heterogeneous Parallelism Configuration</name>
        <t><strong>Objective:</strong> To evaluate the fabric impact of using different parallelism
strategies on prefill vs. decode pools in a disaggregated configuration.</t>
        <t><strong>Procedure:</strong> Test the following parallelism configurations:</t>
        <ul spacing="normal">
          <li>
            <t>Prefill TP=8, Decode TP=8 (baseline, same parallelism)</t>
          </li>
          <li>
            <t>Prefill TP=8, Decode TP=4 with DP_Attention=2 (reduced TP, added DP)</t>
          </li>
          <li>
            <t>Prefill TP=4 with DP=2, Decode TP=2 with DP_Attention=4 (aggressive DP)</t>
          </li>
        </ul>
        <t><strong>Measurement:</strong> Report the number of concurrent RDMA flows, aggregate bandwidth
(GB/s), TTFT (ms), and ITL (ms) at P50 and P99 for each configuration.</t>
      </section>
      <section anchor="prefill-queue-depth-impact-on-transfer-latency">
        <name>Prefill Queue Depth Impact on Transfer Latency</name>
        <t><strong>Objective:</strong> To measure how queuing of prefill requests (due to compute
contention) affects KV cache transfer burstiness and fabric congestion.</t>
        <t><strong>Procedure:</strong> Oversubscribe the prefill pool by submitting requests at a rate
exceeding prefill capacity. Measure the resulting KV cache transfer burst
characteristics: burst size, burst duration, inter-burst gap, and peak fabric
bandwidth demand. Vary the oversubscription ratio from 1.0x (saturated) to 2.0x
in 0.25x increments.</t>
        <t><strong>Measurement:</strong> Report burst size distribution, peak and average fabric
bandwidth, KV transfer latency P99, and ECN/PFC event counts as functions of
oversubscription ratio.</t>
      </section>
    </section>
    <section anchor="test-cat3">
      <name>Test Category 3: MoE Expert Parallelism Benchmarks</name>
      <t>Mixture-of-Experts models distribute expert sub-networks across GPUs and route
tokens to the appropriate experts via AllToAll communication. This section
benchmarks the fabric's ability to support the resulting fine-grained,
latency-sensitive inter-GPU traffic patterns.</t>
      <section anchor="alltoall-dispatch-throughput">
        <name>AllToAll Dispatch Throughput</name>
        <t><strong>Objective:</strong> To determine the maximum AllToAll dispatch throughput for MoE
expert parallelism across the DUT fabric.</t>
        <t><strong>Procedure:</strong> Generate a synthetic MoE dispatch workload where each GPU sends token embeddings to the experts selected by a top-k routing function.
The dispatch payload per GPU per MoE layer is:</t>
        <t>T_dispatch = (B * k * H_model * P_bytes) / N, where B = batch size (tokens),
k = top-k routing count, H_model = hidden dimension, P_bytes = precision bytes
(BF16 = 2), and N = EP group size.</t>
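        <t>One non-normative reading of this formula: each GPU emits a total of B * k * H_model * P_bytes per MoE layer, of which roughly 1/N is destined to each EP rank under uniform routing. A small calculator under that assumption:</t>
        <sourcecode type="python"><![CDATA[
```python
def dispatch_bytes(batch_tokens, top_k, hidden_dim, precision_bytes=2,
                   ep_ranks=96):
    """Return (total per-GPU dispatch payload, per-EP-rank share) for one
    MoE layer, in bytes, assuming uniform token-to-expert routing."""
    total = batch_tokens * top_k * hidden_dim * precision_bytes
    return total, total / ep_ranks
```
]]></sourcecode>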
        <t><strong>Canonical MoE Test Matrix</strong></t>
        <table anchor="tbl-moe-matrix">
          <name>Canonical MoE Test Matrix</name>
          <thead>
            <tr>
              <th align="left">Config</th>
              <th align="left">E (experts)</th>
              <th align="left">k (top-k)</th>
              <th align="left">H_model</th>
              <th align="left">T_dispatch (B=128, BF16, N=96)</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="left">M1</td>
              <td align="left">8</td>
              <td align="left">2</td>
              <td align="left">4096</td>
              <td align="left">2.1 MB/GPU</td>
            </tr>
            <tr>
              <td align="left">M2</td>
              <td align="left">64</td>
              <td align="left">4</td>
              <td align="left">7168</td>
              <td align="left">29 MB/GPU</td>
            </tr>
            <tr>
              <td align="left">M3</td>
              <td align="left">256</td>
              <td align="left">2</td>
              <td align="left">7168</td>
              <td align="left">14 MB/GPU</td>
            </tr>
            <tr>
              <td align="left">M4</td>
              <td align="left">256</td>
              <td align="left">8</td>
              <td align="left">7168</td>
              <td align="left">58 MB/GPU</td>
            </tr>
            <tr>
              <td align="left">M5</td>
              <td align="left">(implementer-defined — report all parameters)</td>
              <td align="left"> </td>
              <td align="left"> </td>
              <td align="left"> </td>
            </tr>
          </tbody>
        </table>
        <t><strong>Measurement:</strong> Report aggregate bandwidth (GB/s), per-dispatch latency (us)
at P50 and P99, and GPU idle time waiting for dispatch completion. The test <bcp14>MUST</bcp14>
be repeated a minimum of 20 times per configuration.</t>
        <t><strong>Reporting Format:</strong> Results <bcp14>SHOULD</bcp14> be reported as a heatmap with EP group size
on the Y axis, batch size on the X axis, and throughput (GB/s) as the color
dimension. A companion latency table <bcp14>MUST</bcp14> be included. Reports <bcp14>MUST</bcp14> state which config row(s) were used. M5 <bcp14>MUST</bcp14> include E, k, H_model, P_bytes, and N in the results table.</t>
        <t>NOTE: If per-accelerator normalized throughput (BusBW) is reported alongside EP_alltoall_bandwidth, the algo_factor for AllToAll is (n-1)/n where n is the number of EP ranks. See the BusBW definition in <xref target="TERMINOLOGY"/>.</t>
      </section>
      <section anchor="routing-mode-and-dispatch-mode-comparison">
        <name>Routing Mode and Dispatch Mode Comparison</name>
        <t><strong>Objective:</strong> To compare fabric performance across dispatch modes and routing policies. Tests <bcp14>MUST</bcp14> cover Normal Dispatch and Low-Latency Dispatch.  Tests <bcp14>SHOULD</bcp14> additionally cover at least one alternative routing mode from <xref target="tbl-routing-modes"/>.</t>
        <t><strong>Routing Mode Taxonomy</strong></t>
        <table anchor="tbl-routing-modes">
          <name>MoE Routing Mode Taxonomy</name>
          <thead>
            <tr>
              <th align="left">Mode</th>
              <th align="left">Description</th>
              <th align="left">Traffic Impact</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="left">Standard Top-k</td>
              <td align="left">Each token is routed to the k highest-scoring experts</td>
              <td align="left">Fixed, uniform AllToAll dispatch volume</td>
            </tr>
            <tr>
              <td align="left">Expert Choice (EC)</td>
              <td align="left">Experts select tokens; ensures load balance</td>
              <td align="left">Non-uniform message sizes; tests HOL-blocking resilience</td>
            </tr>
            <tr>
              <td align="left">Top-k with Token Drop</td>
              <td align="left">Overloaded experts drop excess tokens</td>
              <td align="left">Lower peak traffic; unpredictable under load</td>
            </tr>
            <tr>
              <td align="left">Auxiliary Loss Top-k</td>
              <td align="left">Load-balanced top-k via training loss</td>
              <td align="left">Near-uniform AllToAll; lower hot-spot risk</td>
            </tr>
          </tbody>
        </table>
        <t><strong>Measurement:</strong> Measure dispatch latency, fabric bandwidth, and routing mode impact on AllToAll traffic distribution and fabric congestion per <xref target="tbl-routing-modes"/>. Results from different routing modes <bcp14>MUST</bcp14> be reported in separate result tables with the routing mode labeled.</t>
      </section>
      <section anchor="wide-expert-parallelism-scaling">
        <name>Wide Expert Parallelism Scaling</name>
        <t><strong>Objective:</strong> To characterize AllToAll dispatch performance as EP group size
scales beyond a single node (wide EP), requiring inter-node fabric communication.</t>
        <t><strong>Procedure:</strong> Scale the EP group from intra-node only (EP=8) to wide EP (EP=16, 32, 48, 64, 96 spanning 2, 4, 6, 8, 12 nodes). Use a fixed batch size of 128 tokens and at least one configuration from the canonical MoE test matrix <xref target="tbl-moe-matrix"/>.
The selected config row <bcp14>MUST</bcp14> be identified in the results.</t>
        <t><strong>Measurement:</strong> Report total dispatch latency (us), inter-node bandwidth
(GB/s), and latency decomposition (intra-node vs. inter-node fraction). Report
the scaling efficiency: (EP=8 latency) / (EP=N latency) * (N/8).</t>
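        <t>The scaling-efficiency formula above normalizes observed latency growth against linear scaling in EP group size; sketched:</t>
        <sourcecode type="python"><![CDATA[
```python
def ep_scaling_efficiency(lat_ep8_us, lat_epn_us, ep_n):
    """(EP=8 latency) / (EP=N latency) * (N / 8); 1.0 indicates latency
    growing exactly linearly with EP group size."""
    return (lat_ep8_us / lat_epn_us) * (ep_n / 8)
```
]]></sourcecode>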
      </section>
      <section anchor="expert-parallelism-and-kv-cache-transfer-contention">
        <name>Expert Parallelism and KV Cache Transfer Contention</name>
        <t><strong>Objective:</strong> To measure the mutual interference between EP AllToAll dispatch
traffic and KV cache transfer traffic when both share the same fabric links.</t>
        <t><strong>Procedure:</strong> On a shared fabric, simultaneously execute: (a) continuous KV
cache transfers at a sustained rate (e.g., 50%, 75% of fabric capacity), and
(b) periodic EP AllToAll dispatches (one per MoE layer forward pass).</t>
        <t><strong>Measurement:</strong> Report KV_xfer_latency P99 (us) and EP_alltoall_latency P99
(us) for the isolated and contended cases. Report the contention penalty as the
ratio of contended P99 to isolated P99 for each traffic class. Report ECN/PFC
event counts during contention.</t>
      </section>
    </section>
    <section anchor="test-category-4-congestion-management-benchmarks">
      <name>Test Category 4: Congestion Management Benchmarks</name>
      <t>Inference traffic patterns differ from training in their burstiness,
heterogeneity (mixed KV cache transfers and EP dispatches), and latency
sensitivity.</t>
      <section anchor="ecn-marking-under-inference-incast">
        <name>ECN Marking Under Inference Incast</name>
        <t><strong>Objective:</strong> To verify that ECN marking thresholds are correctly applied when
multiple prefill workers simultaneously transfer KV cache blocks to a single
decode worker (incast pattern).</t>
        <t><strong>Procedure:</strong> Configure M prefill workers (M = 2, 4, 8, 16, 32) to
simultaneously transfer 16 MB KV cache blocks to a single decode worker port.
Repeat for ECN marking thresholds of 100 KB, 500 KB, 1 MB, and 5 MB. The DUT
is the individual leaf switch (DUT-S).</t>
        <t><strong>Measurement:</strong> Report the ECN marking rate (fraction of marked packets), the
onset of marking, queue depth at marking onset, and aggregate throughput
achieved. Repeat a minimum of 20 times per configuration.</t>
      </section>
      <section anchor="pfc-behavior-under-bursty-kv-cache-transfers">
        <name>PFC Behavior Under Bursty KV Cache Transfers</name>
        <t><strong>Objective:</strong> To characterize PFC PAUSE frame generation and propagation under
bursty KV cache transfer patterns typical of disaggregated serving.</t>
        <t><strong>Procedure:</strong> Generate KV cache transfer bursts: N_burst concurrent transfers
(N_burst = 4, 8, 16, 32), each of size 16 MB, arriving within a window of
T_arrival (100 us, 1 ms, 10 ms). Vary the PFC threshold from 10 KB to 1 MB.</t>
        <t><strong>Measurement:</strong> Report PFC frame count, total PAUSE duration (us),
head-of-line blocking delay imposed on other traffic classes (us), and KV cache
transfer completion time.</t>
      </section>
      <section anchor="congestion-control-convergence-for-mixed-traffic">
        <name>Congestion Control Convergence for Mixed Traffic</name>
        <t><strong>Objective:</strong> To measure the convergence time of DCQCN (or UET congestion
control) when KV cache transfer traffic and EP AllToAll dispatch traffic share
fabric capacity.</t>
        <t><strong>Procedure:</strong> Establish a sustained KV cache transfer at 80% of fabric
capacity. Introduce EP AllToAll dispatch traffic on the same fabric links.
Measure the convergence time to stable rate allocation. Repeat with the roles
reversed.</t>
        <t><strong>Measurement:</strong> Report convergence time (ms) to within 5% of steady-state
rates, steady-state bandwidth allocation between traffic classes, packet loss
during convergence, and Jain Fairness Index of the steady-state allocation.</t>
      </section>
      <section anchor="pfc-storm-and-deadlock-resilience">
        <name>PFC Storm and Deadlock Resilience</name>
        <t><strong>Objective:</strong> To verify that the fabric does not enter a PFC storm or deadlock
condition under adversarial inference traffic patterns.</t>
        <t><strong>Procedure:</strong> Per the companion training document, generate a PFC storm
scenario by creating circular buffer dependency across multiple switches.
Simultaneously inject KV cache transfer traffic on all affected paths. Monitor
for PFC storm propagation, deadlock, and recovery time. The test duration <bcp14>MUST</bcp14>
be at least 300 seconds.</t>
        <t><strong>Measurement:</strong> Report whether PFC storm occurred (yes/no), deadlock occurred
(yes/no), maximum PAUSE propagation depth (number of hops), maximum
zero-throughput duration (ms), and recovery time (ms).</t>
      </section>
    </section>
    <section anchor="test-category-5-request-routing-and-load-balancing">
      <name>Test Category 5: Request Routing and Load Balancing</name>
      <t>Inference serving introduces application-layer routing decisions that interact
with fabric-layer load balancing (ECMP, flowlet, packet spray).</t>
      <section anchor="kv-aware-request-routing-efficacy">
        <name>KV-Aware Request Routing Efficacy</name>
        <t><strong>Objective:</strong> To measure the effectiveness of KV-aware request routing, where
the request router considers decode worker KV cache memory occupancy and fabric
path congestion when assigning requests.</t>
        <t><strong>Procedure:</strong> Configure a request router with KV-aware routing enabled. Submit
a sustained request stream at rates of 10, 50, 100, and 200 req/s. Compare
against round-robin routing (baseline).</t>
        <t><strong>Measurement:</strong> Report the coefficient of variation (CV) of decode worker
memory utilization, P99 TTFT, P99 ITL, KV cache eviction rate, and Goodput for
both KV-aware and round-robin routing.</t>
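        <t>The coefficient of variation above is the population standard
deviation divided by the mean; a minimal sketch:</t>
        <sourcecode type="python"><![CDATA[
import statistics

def coefficient_of_variation(samples):
    """CV = population standard deviation / mean. A lower CV indicates
    more even memory utilization across decode workers."""
    mean = statistics.fmean(samples)
    return statistics.pstdev(samples) / mean

# Two workers at 40% and 60% utilization: CV = 10 / 50 = 0.2
cv = coefficient_of_variation([40.0, 60.0])
]]></sourcecode>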
      </section>
      <section anchor="prefix-aware-cache-hit-rate">
        <name>Prefix-Aware Cache Hit Rate</name>
        <t><strong>Objective:</strong> To measure the fabric bandwidth savings achieved by prefix-aware
caching, where requests with common prefixes are routed to workers that already
hold the corresponding KV cache segment.</t>
        <t><strong>Procedure:</strong> Generate a request workload where P% of requests share a common
prefix of L tokens (P = 25%, 50%, 75%, 90%; L = 256, 512, 1024, 2048). Compare
against non-prefix-aware routing.</t>
        <t><strong>Measurement:</strong> Report cache hit rate (%), fabric bandwidth reduction (%),
TTFT reduction (ms), and TPS improvement (%) for each (P, L) combination.</t>
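        <t>As a first-order sanity check on the measured bandwidth reduction,
the expected savings can be estimated by assuming that KV cache bytes scale
linearly with token count, so a prefix hit avoids transferring the shared
segment. This model and the function name are illustrative assumptions, not
part of the normative methodology:</t>
        <sourcecode type="python"><![CDATA[
def expected_bandwidth_reduction(p_share, prefix_tokens, prompt_tokens):
    """First-order estimate of the fraction of KV cache transfer bytes
    saved when a fraction p_share of requests hit a cached prefix of
    prefix_tokens out of prompt_tokens total prompt tokens."""
    return p_share * min(prefix_tokens / prompt_tokens, 1.0)

# Example: 50% of requests share a 512-token prefix of 2048-token
# prompts, so roughly 12.5% of transfer bytes are avoided.
est = expected_bandwidth_reduction(0.50, 512, 2048)
]]></sourcecode>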
      </section>
      <section anchor="ecmp-and-dynamic-load-balancing-under-inference-traffic">
        <name>ECMP and Dynamic Load Balancing Under Inference Traffic</name>
        <t><strong>Objective:</strong> To evaluate fabric-layer load balancing effectiveness under
inference traffic patterns characterized by a mix of large KV cache flows and
small EP dispatch flows.</t>
        <t><strong>Procedure:</strong> Measure link utilization uniformity under: (a) KV cache transfers
only (large flows, 16 MB+), (b) EP AllToAll dispatches only (small flows,
&lt; 1 MB), (c) mixed KV cache and EP traffic.</t>
        <t><strong>Measurement:</strong> Report JFI, maximum link utilization (%), minimum link
utilization (%), and the oversubscription ratio for each scenario and load
balancing algorithm.</t>
      </section>
      <section anchor="jain-fairness-index-for-decode-worker-utilization">
        <name>Jain Fairness Index for Decode Worker Utilization</name>
        <t><strong>Objective:</strong> To measure how evenly the fabric distributes KV cache transfer
load across decode workers.</t>
        <t><strong>Procedure:</strong> With N_D decode workers (N_D = 8, 16, 32, 64), submit a
sustained request stream and measure per-worker KV cache receive rate, GPU
utilization, and output TPS.</t>
        <t><strong>Measurement:</strong> Report JFI for KV cache receive rate, GPU utilization, and
output TPS. Report the max/min ratio for each metric.</t>
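        <t>The JFI and max/min ratio referenced above can be computed as
follows (an illustrative sketch):</t>
        <sourcecode type="python"><![CDATA[
def jain_fairness_index(values):
    """JFI = (sum x)^2 / (n * sum x^2). Ranges from 1/n (all load on
    one worker) to 1.0 (perfectly even distribution)."""
    n = len(values)
    total = sum(values)
    sum_sq = sum(v * v for v in values)
    return (total * total) / (n * sum_sq)

def max_min_ratio(values):
    """Ratio of the most-loaded to the least-loaded worker."""
    return max(values) / min(values)
]]></sourcecode>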
      </section>
    </section>
    <section anchor="test-category-6-latency-benchmarks">
      <name>Test Category 6: Latency Benchmarks</name>
      <t>Inference latency is the primary user-facing quality metric. This section
defines benchmarks that isolate the fabric's contribution to end-to-end
inference latency.</t>
      <section anchor="ttft-under-varying-prompt-lengths">
        <name>TTFT Under Varying Prompt Lengths</name>
        <t><strong>Objective:</strong> To characterize TTFT as a function of prompt length, isolating
the fabric-dependent KV cache transfer component.</t>
        <t><strong>Procedure:</strong> Submit single requests (no concurrent load) with prompt lengths
of 128, 256, 512, 1024, 2048, 4096, 8192, and 16384 tokens. Measure TTFT and
decompose into T_prefill, T_transfer, and T_decode_init. For reference, the
following table provides model configuration profiles and the corresponding KV
cache sizes.</t>
        <table anchor="tab-conf-matrix">
          <name>Reference Configuration Matrix</name>
          <thead>
            <tr>
              <th align="left">Config ID</th>
              <th align="left">Model Profile</th>
              <th align="left">S_KV @ 4K ctx</th>
              <th align="left">S_KV @ 32K ctx</th>
              <th align="left">S_KV @ 128K ctx</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td align="left">CFG-A</td>
              <td align="left">Small: L=32, H_kv=8 (GQA), D=128, BF16</td>
              <td align="left">0.25 GB</td>
              <td align="left">2.0 GB</td>
              <td align="left">8.0 GB</td>
            </tr>
            <tr>
              <td align="left">CFG-B</td>
              <td align="left">Mid: L=80, H_kv=8 (GQA), D=128, BF16 (e.g., Llama-3 70B)</td>
              <td align="left">1.3 GB</td>
              <td align="left">10.5 GB</td>
              <td align="left">42.0 GB</td>
            </tr>
            <tr>
              <td align="left">CFG-C</td>
              <td align="left">Large MHA: L=96, H_kv=64 (MHA), D=128, BF16</td>
              <td align="left">12.3 GB</td>
              <td align="left">98.6 GB</td>
              <td align="left">&gt;300 GB</td>
            </tr>
            <tr>
              <td align="left">CFG-D</td>
              <td align="left">Mid INT8: L=80, H_kv=8 (GQA), D=128, INT8 (quantized)</td>
              <td align="left">0.67 GB</td>
              <td align="left">5.4 GB</td>
              <td align="left">21.5 GB</td>
            </tr>
            <tr>
              <td align="left">CFG-E (custom)</td>
              <td align="left">Implementer-defined: L=___, H_kv=___, D=___, P=___</td>
              <td align="left">Computed</td>
              <td align="left">Computed</td>
              <td align="left">Computed</td>
            </tr>
          </tbody>
        </table>
        <t><strong>Measurement:</strong> Report TTFT, T_transfer, and T_transfer/TTFT at P50, P95, P99
for each prompt length. The test <bcp14>MUST</bcp14> be repeated a minimum of 100 times per
prompt length.</t>
        <t><strong>Reporting Format:</strong> Results <bcp14>MUST</bcp14> specify the configuration ID (CFG-A through
CFG-E) or provide complete values for L, H_kv, D, C, and P_bytes for any test that
specifies KV cache message sizes. Results <bcp14>SHOULD</bcp14> be reported as a line graph with
prompt length on the X axis and TTFT (ms) on the Y axis, with separate lines for P50
and P99. The T_transfer component <bcp14>SHOULD</bcp14> be shown as a shaded region.</t>
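        <t>The S_KV columns in the table above can be cross-checked against the
common KV cache size model S_KV = 2 x L x H_kv x D x C x P_bytes, where the
factor of 2 accounts for keys and values; note that the table entries are
rounded and may use slightly different byte conventions. A minimal sketch:</t>
        <sourcecode type="python"><![CDATA[
def kv_cache_bytes(layers, kv_heads, head_dim, ctx_tokens, bytes_per_elem):
    """S_KV = 2 (keys and values) x layers x KV heads x head dimension
    x context length x bytes per element."""
    return 2 * layers * kv_heads * head_dim * ctx_tokens * bytes_per_elem

# CFG-B-like profile (L=80, H_kv=8, D=128, BF16) at a 4K context:
# 1,342,177,280 bytes, i.e. roughly 1.3 GB.
cfg_b = kv_cache_bytes(80, 8, 128, 4096, 2)
]]></sourcecode>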
      </section>
      <section anchor="itl-characterization-and-tail-latency">
        <name>ITL Characterization and Tail Latency</name>
        <t><strong>Objective:</strong> To characterize inter-token latency distribution and identify
fabric-induced tail latency during the decode phase.</t>
        <t><strong>Procedure:</strong> Submit a single long-output request (e.g., 2048 output tokens)
and record the timestamp of each emitted token. Repeat under: (a) unloaded
fabric, (b) loaded fabric (50% of capacity), and (c) heavily loaded fabric (90%
of capacity plus concurrent EP dispatches).</t>
        <t><strong>Measurement:</strong> Report ITL at P50, P95, P99, P99.9, and maximum for each load
condition. Report the number of tokens exhibiting ITL &gt; 100 ms (stall events).
The test <bcp14>MUST</bcp14> generate at least 10,000 ITL samples per condition.</t>
      </section>
      <section anchor="end-to-end-latency-under-multi-tenant-load">
        <name>End-to-End Latency Under Multi-Tenant Load</name>
        <t><strong>Objective:</strong> To measure inference latency when multiple models or model
instances share the same fabric.</t>
        <t><strong>Procedure:</strong> Deploy two or more model instances on separate worker pools
sharing the same fabric. Submit requests to both instances concurrently.</t>
        <t><strong>Measurement:</strong> Report per-instance TTFT P99, ITL P99, and the interference
penalty: (multi-tenant metric - single-tenant metric) / single-tenant metric *
100%.</t>
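        <t>The interference penalty formula above, expressed as code:</t>
        <sourcecode type="python"><![CDATA[
def interference_penalty_pct(multi_tenant, single_tenant):
    """Relative degradation of a latency KPI under multi-tenant load,
    as a percentage of the single-tenant baseline."""
    return (multi_tenant - single_tenant) / single_tenant * 100.0

# Example: TTFT P99 of 120 ms multi-tenant vs. 100 ms single-tenant
# yields a 20% interference penalty.
penalty = interference_penalty_pct(120.0, 100.0)
]]></sourcecode>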
      </section>
      <section anchor="latency-sensitivity-to-fabric-congestion">
        <name>Latency Sensitivity to Fabric Congestion</name>
        <t><strong>Objective:</strong> To establish the relationship between fabric congestion level and
inference latency degradation.</t>
        <t><strong>Procedure:</strong> Inject controlled background traffic on the fabric at levels from
0% to 95% of capacity in 5% increments. At each level, submit inference requests
at a fixed rate and measure TTFT and ITL.</t>
        <t><strong>Measurement:</strong> Report TTFT P99 and ITL P99 as functions of background traffic
level. Identify the inflection point at which latency begins to degrade
significantly. Report the latency degradation factor at 50%, 75%, and 90%
background load.</t>
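        <t>One simple, reproducible definition of the inflection point is the
first load level at which P99 latency exceeds a fixed multiple of the unloaded
baseline. The 1.2x default below is an illustrative choice, not a normative
threshold:</t>
        <sourcecode type="python"><![CDATA[
def inflection_load(loads_pct, p99_latencies_ms, factor=1.2):
    """Return the first background-load level (in %) at which P99
    latency exceeds `factor` times the unloaded (0% load) baseline,
    or None if latency never degrades past that multiple."""
    baseline = p99_latencies_ms[0]
    for load, latency in zip(loads_pct, p99_latencies_ms):
        if latency > factor * baseline:
            return load
    return None
]]></sourcecode>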
      </section>
    </section>
    <section anchor="test-category-7-throughput-benchmarks">
      <name>Test Category 7: Throughput Benchmarks</name>
      <t>Inference throughput determines the cost-effectiveness of the serving
deployment.</t>
      <section anchor="aggregate-tokens-per-second">
        <name>Aggregate Tokens Per Second</name>
        <t><strong>Objective:</strong> To determine the maximum sustained aggregate TPS achievable while
meeting latency SLOs.</t>
        <t><strong>Procedure:</strong> Increase the request arrival rate from 1 req/s to the point where
either TTFT P99 exceeds 500 ms or ITL P99 exceeds 50 ms. At each rate, measure
TPS_output, TPS_input, Goodput, and all latency KPIs.</t>
        <t><strong>Measurement:</strong> Report TPS_output, TPS_input, Goodput, TTFT P99, ITL P99, and
fabric utilization at the SLO-bounded throughput. Report the fabric utilization
at the SLO boundary as a key efficiency metric.</t>
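        <t>The SLO-bounded search above can be driven by a simple ramp loop.
In the sketch below, measure_at_rate is a placeholder for the actual trial
harness, and the single-step ramp is illustrative (a real harness might use a
binary search):</t>
        <sourcecode type="python"><![CDATA[
def slo_bounded_rate(measure_at_rate, max_rate=1000,
                     ttft_slo_ms=500.0, itl_slo_ms=50.0):
    """Increase the offered rate (req/s) until TTFT P99 or ITL P99
    violates its SLO; return the highest rate that met both SLOs.

    measure_at_rate(rate) -> (ttft_p99_ms, itl_p99_ms) is supplied by
    the test harness."""
    last_good = 0
    for rate in range(1, max_rate + 1):
        ttft_p99, itl_p99 = measure_at_rate(rate)
        if ttft_p99 > ttft_slo_ms or itl_p99 > itl_slo_ms:
            break
        last_good = rate
    return last_good
]]></sourcecode>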
      </section>
      <section anchor="batch-size-scaling-and-continuous-batching-impact">
        <name>Batch Size Scaling and Continuous Batching Impact</name>
        <t><strong>Objective:</strong> To measure the interaction between inference batch size,
continuous batching, and fabric transfer patterns.</t>
        <t><strong>Procedure:</strong> Configure the serving system with varying maximum batch sizes
(1, 4, 8, 16, 32, 64, 128). For each batch size, measure: (a) the number of
concurrent KV cache transfers, (b) aggregate fabric bandwidth consumed,
(c) TPS_output, and (d) TTFT P99. Enable continuous batching and repeat.</t>
        <t><strong>Measurement:</strong> Report TPS_output, TTFT P99, fabric bandwidth (GB/s), and peak
concurrent transfers for each batch size, with and without continuous batching.</t>
      </section>
      <section anchor="goodput-under-preemption-and-eviction">
        <name>Goodput Under Preemption and Eviction</name>
        <t><strong>Objective:</strong> To measure the Goodput loss when fabric congestion forces KV
cache eviction or request preemption.</t>
        <t><strong>Procedure:</strong> Oversubscribe the system beyond the SLO-bounded throughput (at
110%, 125%, 150%, and 200% of the rate found in Test 11.1). Measure the rate of
KV cache evictions, request preemptions, and the resulting Goodput reduction.</t>
        <t><strong>Measurement:</strong> Report Goodput, eviction rate (evictions/s), preemption rate
(preemptions/s), wasted fabric bandwidth (GB/s), and the Goodput/TPS_output
ratio (efficiency).</t>
      </section>
    </section>
    <section anchor="test-category-8-scale-and-autoscaling">
      <name>Test Category 8: Scale and Autoscaling</name>
      <t>Inference serving clusters must scale dynamically to match request demand.</t>
      <section anchor="fabric-scale-limits-for-inference-clusters">
        <name>Fabric Scale Limits for Inference Clusters</name>
        <t><strong>Objective:</strong> To determine the maximum inference cluster size supportable by
the DUT fabric while meeting performance requirements.</t>
        <t><strong>Procedure:</strong> Progressively scale the cluster from a minimal configuration
(e.g., 2 nodes, 16 GPUs) to the fabric's capacity (e.g., 1024 nodes, 8192
GPUs). At each scale point (following powers of two), measure KV cache transfer
throughput and latency, EP AllToAll dispatch latency, fabric control plane
convergence time, routing table size, and end-to-end TTFT and TPS.</t>
        <t><strong>Measurement:</strong> Report all KPIs at each scale point. Identify the scale limit
as the point where any KPI degrades by more than 10% from the
minimal-configuration baseline.</t>
      </section>
      <section anchor="dynamic-autoscaling-response-time">
        <name>Dynamic Autoscaling Response Time</name>
        <t><strong>Objective:</strong> To measure the time required for the fabric to accommodate
dynamic scaling of inference worker pools (adding/removing prefill or decode
workers).</t>
        <t><strong>Procedure:</strong> Starting from a stable serving state, trigger a scale-up event
(e.g., adding 4 decode nodes). Measure: (a) fabric convergence time, (b) time
from fabric convergence to first KV cache transfer on new nodes, (c) time to
reach steady-state throughput. Repeat for scale-down events.</t>
        <t><strong>Measurement:</strong> Report fabric convergence time (ms), first-transfer time (ms),
and time to steady-state (ms) for scale-up and scale-down events. Report any
packet loss or latency spikes during the scaling transition.</t>
      </section>
      <section anchor="link-failure-convergence-impact-on-serving">
        <name>Link Failure Convergence Impact on Serving</name>
        <t><strong>Objective:</strong> To measure the impact of fabric link failures on inference
serving performance and the convergence time to restore full service.</t>
        <t><strong>Procedure:</strong> During sustained inference serving at 80% of SLO-bounded
throughput, fail a single fabric link on: (a) a leaf-spine link carrying KV
cache traffic, (b) a spine-spine link, (c) a link on the decode worker's leaf
switch. Measure traffic disruption and recovery time. Repeat for dual link
failures.</t>
        <t><strong>Measurement:</strong> Report traffic disruption duration (ms), convergence time (ms),
TTFT degradation during convergence (ms above baseline P99), TPS reduction
during convergence (%), and time to full recovery (ms). The test <bcp14>MUST</bcp14> be
repeated a minimum of 20 times per failure scenario.</t>
      </section>
    </section>
    <section anchor="test-category-9-soak-and-stability">
      <name>Test Category 9: Soak and Stability</name>
      <t>Long-running inference serving deployments must maintain performance without
degradation over time.</t>
      <section anchor="hour-sustained-inference-load">
        <name>24-Hour Sustained Inference Load</name>
        <t><strong>Objective:</strong> To verify that the fabric maintains performance under continuous
inference serving load for 24 hours.</t>
        <t><strong>Procedure:</strong> Configure the SUT-E at 80% of the SLO-bounded throughput
determined in Test 11.1. Run a continuous request stream for 24 hours with a
realistic prompt length distribution. Sample the following metrics every 15
minutes: TTFT P99, ITL P99, TPS_output, KV_xfer_latency P99, fabric link
utilization, switch CPU/memory usage, NIC counters (RDMA retransmissions, QP
errors), and PFC/ECN event counts.</t>
        <t><strong>Measurement:</strong> Report the trend of all sampled metrics over the 24-hour
period. There <bcp14>SHOULD</bcp14> be zero NIC QP errors, zero routing flaps, and less than
1% variation in TTFT P99 over the test duration.</t>
      </section>
      <section anchor="kv-cache-memory-leak-detection">
        <name>KV Cache Memory Leak Detection</name>
        <t><strong>Objective:</strong> To detect memory leaks in the KV cache management subsystem that
may manifest as fabric performance degradation over time.</t>
        <t><strong>Procedure:</strong> Monitor GPU memory, CPU memory, NIC registered memory regions,
and RDMA memory region counts on all prefill and decode workers during the
24-hour soak test. Record the number of active KV cache pages, RDMA memory
registrations, and pinned memory at each sampling interval.</t>
        <t><strong>Measurement:</strong> Report the trend of each monitored metric. Flag any monotonic
increase as a potential leak. Report the maximum observed memory usage and the
usage at the end of the 24-hour period.</t>
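        <t>The monotonic-increase flag can be implemented as a simple heuristic
over the sampled series; the tolerance parameter below is an illustrative
addition to absorb measurement noise:</t>
        <sourcecode type="python"><![CDATA[
def looks_like_leak(samples, tolerance=0.0):
    """Flag a potential leak: every sample is at least the previous one
    (within tolerance) and the series grows overall."""
    nondecreasing = all(b >= a - tolerance
                        for a, b in zip(samples, samples[1:]))
    return nondecreasing and samples[-1] > samples[0]
]]></sourcecode>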
      </section>
      <section anchor="long-running-serving-stability">
        <name>Long-Running Serving Stability</name>
        <t><strong>Objective:</strong> To verify that fabric-dependent components remain stable under
continuous inference serving.</t>
        <t><strong>Procedure:</strong> During the 24-hour soak test, monitor: NIC QP state transitions,
switch buffer utilization trend, FEC error rate trend, BGP/OSPF adjacency
stability, and RDMA retransmission rate. At the 12-hour mark, trigger a
controlled perturbation (single link flap) and verify recovery.</t>
        <t><strong>Measurement:</strong> Report the count of any QP state transitions, maximum switch
buffer utilization, FEC error trend, adjacency flap count, and RDMA
retransmission count. Report the recovery time from the 12-hour link flap
perturbation.</t>
      </section>
    </section>
    <section anchor="reporting">
      <name>Reporting Format</name>
      <t>All test results <bcp14>MUST</bcp14> be reported following the conventions established in
Section 26 of <xref target="RFC2544"/>. In addition, the following
inference-specific reporting requirements apply:</t>
      <ul spacing="normal">
        <li>
          <t><strong>System Configuration Report:</strong> The report <bcp14>MUST</bcp14> include: model name and
parameter count, parallelism strategy (TP, DP, EP, PP configuration for both
prefill and decode pools), xPyD ratio, inference serving framework name and
version, KV cache transfer library name and version, accelerator type and
count, NIC type and firmware version, switch ASIC and software version, fabric
topology, and link speeds.</t>
        </li>
        <li>
          <t><strong>Workload Characterization Report:</strong> The report <bcp14>MUST</bcp14> include: prompt length
distribution (mean, P50, P99, distribution type), output length distribution,
request arrival rate and distribution, number of concurrent requests, and
prefix sharing percentage.</t>
        </li>
        <li>
          <t><strong>Results Reporting:</strong> For each test, results <bcp14>MUST</bcp14> include: the specific test
identifier (e.g., Test 5.1), the DUT/SUT configuration tested, the number of
trials, all measured KPI values with confidence intervals, and any anomalies
observed.</t>
        </li>
      </ul>
      <table anchor="tab-reporting">
        <name>Reporting Format Requirements</name>
        <thead>
          <tr>
            <th align="left">Report Element</th>
            <th align="left">Format</th>
            <th align="left">Required?</th>
          </tr>
        </thead>
        <tbody>
          <tr>
            <td align="left">System Configuration</td>
            <td align="left">Structured table per above</td>
            <td align="left">Yes (required)</td>
          </tr>
          <tr>
            <td align="left">Workload Parameters</td>
            <td align="left">Structured table per above</td>
            <td align="left">Yes (required)</td>
          </tr>
          <tr>
            <td align="left">KPI Summary Table</td>
            <td align="left">Table with all measured KPIs</td>
            <td align="left">Yes (required)</td>
          </tr>
          <tr>
            <td align="left">Latency Distribution Plots</td>
            <td align="left">CDF or histogram per test section</td>
            <td align="left">Recommended</td>
          </tr>
          <tr>
            <td align="left">Throughput vs. Scale Graphs</td>
            <td align="left">Line chart per test section</td>
            <td align="left">Recommended</td>
          </tr>
          <tr>
            <td align="left">Fabric Health Indicators</td>
            <td align="left">Table per Section 4.4</td>
            <td align="left">Yes (required)</td>
          </tr>
          <tr>
            <td align="left">Raw Data Appendix</td>
            <td align="left">Machine-readable format (CSV, JSON)</td>
            <td align="left">Optional</td>
          </tr>
        </tbody>
      </table>
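      <t>For the machine-readable raw data appendix, a per-test results record
might be serialized as follows. The field names and numeric values are
illustrative placeholders only, not a normative schema:</t>
      <sourcecode type="python"><![CDATA[
import json

# Illustrative per-test record; all values are placeholders.
record = {
    "test_id": "Test 5.1",
    "sut_configuration": "CFG-B, leaf-spine fabric",
    "trials": 20,
    "kpis": {"ttft_p99_ms": 212.0, "itl_p99_ms": 18.4},
    "confidence_intervals_95": {"ttft_p99_ms": [205.1, 218.9]},
    "anomalies": [],
}
print(json.dumps(record, indent=2))
]]></sourcecode>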
    </section>
    <section anchor="security-considerations">
      <name>Security Considerations</name>
      <t>This document defines benchmarking methodologies for controlled laboratory
testing. All tests <bcp14>MUST</bcp14> be conducted in isolated test environments that are not
connected to production networks or the public Internet. The security
considerations from <xref target="RFC2544"/> and <xref target="RFC6815"/> apply.</t>
      <t>Additionally, implementers <bcp14>SHOULD</bcp14> be aware that RDMA-based KV cache transfer
provides direct memory access between hosts; all RDMA connections in the test
environment <bcp14>MUST</bcp14> use authenticated QPs where supported. The test results
themselves may reveal performance characteristics that could inform
denial-of-service attack vectors; results <bcp14>SHOULD</bcp14> be treated as sensitive when
applicable.</t>
    </section>
    <section anchor="iana-considerations">
      <name>IANA Considerations</name>
      <t>This memo includes no request to IANA.</t>
    </section>
  </middle>
  <back>
    <references anchor="sec-combined-references">
      <name>References</name>
      <references anchor="sec-normative-references">
        <name>Normative References</name>
        <reference anchor="RFC1242">
          <front>
            <title>Benchmarking Terminology for Network Interconnection Devices</title>
            <author fullname="S. Bradner" initials="S." surname="Bradner"/>
            <date month="July" year="1991"/>
            <abstract>
              <t>This memo discusses and defines a number of terms that are used in describing performance benchmarking tests and the results of such tests. This memo provides information for the Internet community. It does not specify an Internet standard.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="1242"/>
          <seriesInfo name="DOI" value="10.17487/RFC1242"/>
        </reference>
        <reference anchor="RFC2544">
          <front>
            <title>Benchmarking Methodology for Network Interconnect Devices</title>
            <author fullname="S. Bradner" initials="S." surname="Bradner"/>
            <author fullname="J. McQuaid" initials="J." surname="McQuaid"/>
            <date month="March" year="1999"/>
            <abstract>
              <t>This document is a republication of RFC 1944 correcting the values for the IP addresses which were assigned to be used as the default addresses for networking test equipment. This memo provides information for the Internet community.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="2544"/>
          <seriesInfo name="DOI" value="10.17487/RFC2544"/>
        </reference>
        <reference anchor="RFC2889">
          <front>
            <title>Benchmarking Methodology for LAN Switching Devices</title>
            <author fullname="R. Mandeville" initials="R." surname="Mandeville"/>
            <author fullname="J. Perser" initials="J." surname="Perser"/>
            <date month="August" year="2000"/>
            <abstract>
              <t>This document is intended to provide methodology for the benchmarking of local area network (LAN) switching devices. This memo provides information for the Internet community.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="2889"/>
          <seriesInfo name="DOI" value="10.17487/RFC2889"/>
        </reference>
        <reference anchor="RFC6349">
          <front>
            <title>Framework for TCP Throughput Testing</title>
            <author fullname="B. Constantine" initials="B." surname="Constantine"/>
            <author fullname="G. Forget" initials="G." surname="Forget"/>
            <author fullname="R. Geib" initials="R." surname="Geib"/>
            <author fullname="R. Schrage" initials="R." surname="Schrage"/>
            <date month="August" year="2011"/>
            <abstract>
              <t>This framework describes a practical methodology for measuring end-to-end TCP Throughput in a managed IP network. The goal is to provide a better indication in regard to user experience. In this framework, TCP and IP parameters are specified to optimize TCP Throughput. This document is not an Internet Standards Track specification; it is published for informational purposes.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="6349"/>
          <seriesInfo name="DOI" value="10.17487/RFC6349"/>
        </reference>
        <reference anchor="TERMINOLOGY">
          <front>
            <title>Benchmarking Terminology for AI Network Fabrics</title>
            <author fullname="Fernando Calabria" initials="F." surname="Calabria">
              <organization>Cisco</organization>
            </author>
            <author fullname="Carlos Pignataro" initials="C." surname="Pignataro">
              <organization>Blue Fern Consulting</organization>
            </author>
            <author fullname="Qin Wu" initials="Q." surname="Wu">
              <organization>Huawei</organization>
            </author>
            <author fullname="Giuseppe Fioccola" initials="G." surname="Fioccola">
              <organization>Huawei</organization>
            </author>
            <date day="21" month="April" year="2026"/>
            <abstract>
              <t>This document defines benchmarking terminology for evaluating
Ethernet-based network fabrics used in distributed Artificial Intelligence (AI)
training and inference workloads. It provides a unified vocabulary
consolidating and extending terms from RFC 1242, RFC 8238, and the companion AI
fabric methodology documents, establishing precise, vendor-neutral definitions
for collective communication primitives, RDMA transport mechanisms (RoCEv2 and
Ultra Ethernet Transport), congestion control behaviors, AI-specific Key
Performance Indicators (KPIs), and fabric topology concepts.</t>
              <t>This document is a companion to
[I-D.calabria-bmwg-ai-fabric-training-bench] and
[I-D.calabria-bmwg-ai-fabric-inference-bench]. Those documents SHOULD NOT be
applied without first consulting the terminology defined herein. Where
definitions herein overlap with RFC 1242 or RFC 8238, the AI fabric context
definition in this document takes precedence.</t>
            </abstract>
          </front>
          <seriesInfo name="Internet-Draft" value="draft-calabria-bmwg-ai-fabric-terminology-01"/>
        </reference>
        <reference anchor="RFC2119">
          <front>
            <title>Key words for use in RFCs to Indicate Requirement Levels</title>
            <author fullname="S. Bradner" initials="S." surname="Bradner"/>
            <date month="March" year="1997"/>
            <abstract>
              <t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
            </abstract>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="2119"/>
          <seriesInfo name="DOI" value="10.17487/RFC2119"/>
        </reference>
        <reference anchor="RFC8174">
          <front>
            <title>Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words</title>
            <author fullname="B. Leiba" initials="B." surname="Leiba"/>
            <date month="May" year="2017"/>
            <abstract>
              <t>RFC 2119 specifies common key words that may be used in protocol specifications. This document aims to reduce the ambiguity by clarifying that only UPPERCASE usage of the key words have the defined special meanings.</t>
            </abstract>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="8174"/>
          <seriesInfo name="DOI" value="10.17487/RFC8174"/>
        </reference>
      </references>
      <references anchor="sec-informative-references">
        <name>Informative References</name>
        <reference anchor="RFC7432">
          <front>
            <title>BGP MPLS-Based Ethernet VPN</title>
            <author fullname="A. Sajassi" initials="A." role="editor" surname="Sajassi"/>
            <author fullname="R. Aggarwal" initials="R." surname="Aggarwal"/>
            <author fullname="N. Bitar" initials="N." surname="Bitar"/>
            <author fullname="A. Isaac" initials="A." surname="Isaac"/>
            <author fullname="J. Uttaro" initials="J." surname="Uttaro"/>
            <author fullname="J. Drake" initials="J." surname="Drake"/>
            <author fullname="W. Henderickx" initials="W." surname="Henderickx"/>
            <date month="February" year="2015"/>
            <abstract>
              <t>This document describes procedures for BGP MPLS-based Ethernet VPNs (EVPN). The procedures described here meet the requirements specified in RFC 7209 -- "Requirements for Ethernet VPN (EVPN)".</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="7432"/>
          <seriesInfo name="DOI" value="10.17487/RFC7432"/>
        </reference>
        <reference anchor="RFC6815">
          <front>
            <title>Applicability Statement for RFC 2544: Use on Production Networks Considered Harmful</title>
            <author fullname="S. Bradner" initials="S." surname="Bradner"/>
            <author fullname="K. Dubray" initials="K." surname="Dubray"/>
            <author fullname="J. McQuaid" initials="J." surname="McQuaid"/>
            <author fullname="A. Morton" initials="A." surname="Morton"/>
            <date month="November" year="2012"/>
            <abstract>
              <t>The Benchmarking Methodology Working Group (BMWG) has been developing key performance metrics and laboratory test methods since 1990, and continues this work at present. The methods described in RFC 2544 are intended to generate traffic that overloads network device resources in order to assess their capacity. Overload of shared resources would likely be harmful to user traffic performance on a production network, and there are further negative consequences identified with production application of the methods. This memo clarifies the scope of RFC 2544 and other IETF BMWG benchmarking work for isolated test environments only, and it encourages new standards activity for measurement methods applicable outside that scope. This document is not an Internet Standards Track specification; it is published for informational purposes.</t>
            </abstract>
          </front>
          <seriesInfo name="RFC" value="6815"/>
          <seriesInfo name="DOI" value="10.17487/RFC6815"/>
        </reference>
        <reference anchor="TRAINING-BENCH">
          <front>
            <title>Benchmarking Methodology for AI Training Network Fabrics</title>
            <author>
              <organization/>
            </author>
            <date year="2026"/>
          </front>
          <seriesInfo name="Internet-Draft" value="draft-calabria-bmwg-ai-fabric-training-bench"/>
        </reference>
        <reference anchor="UEC-SPEC">
          <front>
            <title>UEC Specification 1.0</title>
            <author>
              <organization>Ultra Ethernet Consortium</organization>
            </author>
            <date year="2024"/>
          </front>
        </reference>
        <reference anchor="VLLM">
          <front>
            <title>Efficient Memory Management for Large Language Model Serving with PagedAttention</title>
            <author initials="W." surname="Kwon">
              <organization/>
            </author>
            <date year="2023"/>
          </front>
          <refcontent>Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles</refcontent>
        </reference>
        <reference anchor="DISTSERVE">
          <front>
            <title>DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving</title>
            <author initials="Y." surname="Zhong">
              <organization/>
            </author>
            <date year="2024"/>
          </front>
          <refcontent>OSDI 2024</refcontent>
        </reference>
        <reference anchor="EP-COMM">
          <front>
            <title>DeepEP: An Efficient Expert-Parallel Communication Library</title>
            <author>
              <organization/>
            </author>
            <date year="2025"/>
          </front>
        </reference>
        <reference anchor="LMCACHE">
          <front>
            <title>LMCache: Hierarchical KV Cache Management for Inference</title>
            <author>
              <organization>LMCache Project</organization>
            </author>
            <date year="2025"/>
          </front>
        </reference>
        <reference anchor="SGLANG">
          <front>
            <title>SGLang: Efficient Execution of Structured Language Model Programs</title>
            <author>
              <organization/>
            </author>
            <date year="2024"/>
          </front>
        </reference>
        <reference anchor="K8S-INF">
          <front>
            <title>llm-d: Kubernetes-Native Distributed LLM Inference</title>
            <author>
              <organization/>
            </author>
            <date year="2025"/>
          </front>
        </reference>
      </references>
    </references>
    <?line 1295?>

    <section anchor="kpi-to-test-mapping-summary">
      <name>KPI-to-Test Mapping Summary</name>
      <t>The following table provides a cross-reference from each KPI defined in
<xref target="kpi-framework"/> to the test(s) in which it is measured.</t>
      <table anchor="tab-kpi-mapping">
        <name>KPI-to-Test Mapping</name>
        <thead>
          <tr>
            <th align="left">KPI</th>
            <th align="left">Primary Test(s)</th>
            <th align="left">DUT/SUT</th>
          </tr>
        </thead>
        <tbody>
          <tr>
            <td align="left">TTFT</td>
            <td align="left">6.1, 6.2, 10.1, 10.3</td>
            <td align="left">SUT-E</td>
          </tr>
          <tr>
            <td align="left">ITL</td>
            <td align="left">10.2, 10.3, 10.4</td>
            <td align="left">SUT-E</td>
          </tr>
          <tr>
            <td align="left">TPS_output</td>
            <td align="left">6.2, 11.1, 11.2, 11.3</td>
            <td align="left">SUT-E</td>
          </tr>
          <tr>
            <td align="left">TPS_input</td>
            <td align="left">11.1</td>
            <td align="left">SUT-E</td>
          </tr>
          <tr>
            <td align="left">Goodput</td>
            <td align="left">11.1, 11.3</td>
            <td align="left">SUT-E</td>
          </tr>
          <tr>
            <td align="left">KV_xfer_latency</td>
            <td align="left">5.2, 5.3, 6.1, 6.4</td>
            <td align="left">DUT-N, DUT-PD</td>
          </tr>
          <tr>
            <td align="left">KV_xfer_bandwidth</td>
            <td align="left">5.1, 5.3, 5.4</td>
            <td align="left">DUT-N, DUT-PD</td>
          </tr>
          <tr>
            <td align="left">EP_alltoall_latency</td>
            <td align="left">7.1, 7.2, 7.3, 7.4</td>
            <td align="left">DUT-F</td>
          </tr>
          <tr>
            <td align="left">EP_alltoall_bandwidth</td>
            <td align="left">7.1, 7.3</td>
            <td align="left">DUT-F</td>
          </tr>
          <tr>
            <td align="left">Fabric_FCT</td>
            <td align="left">5.2, 5.3</td>
            <td align="left">DUT-F</td>
          </tr>
          <tr>
            <td align="left">Buffer_utilization</td>
            <td align="left">8.1, 8.2</td>
            <td align="left">DUT-S</td>
          </tr>
          <tr>
            <td align="left">ECN_marking_rate</td>
            <td align="left">8.1, 8.3</td>
            <td align="left">DUT-S</td>
          </tr>
          <tr>
            <td align="left">PFC_frame_count</td>
            <td align="left">8.2, 8.4</td>
            <td align="left">DUT-S</td>
          </tr>
          <tr>
            <td align="left">Link_utilization</td>
            <td align="left">5.3, 9.3, 12.1</td>
            <td align="left">DUT-F</td>
          </tr>
          <tr>
            <td align="left">Packet_drop_rate</td>
            <td align="left">8.1, 8.2, 12.3</td>
            <td align="left">DUT-F</td>
          </tr>
          <tr>
            <td align="left">Request_Rate</td>
            <td align="left">11.1</td>
            <td align="left">SUT-E</td>
          </tr>
          <tr>
            <td align="left">Prefix Cache Hit Rate</td>
            <td align="left">9.2</td>
            <td align="left">SUT-E</td>
          </tr>
          <tr>
            <td align="left">JFI (Decode Worker)</td>
            <td align="left">9.4</td>
            <td align="left">SUT-E</td>
          </tr>
        </tbody>
      </table>
    </section>
    <section anchor="inference-serving-framework-capability-categories-informational">
      <name>Inference Serving Framework Capability Categories (Informational)</name>
      <t>This appendix describes the inference serving framework capability categories
relevant to AI fabric benchmarking. It is intended to guide documentation of
SUT-E configurations and is not normative. Implementers using a Software
Workload Emulator (SUT-E tests) <bcp14>SHOULD</bcp14> document which of the
following capabilities their serving framework supports.</t>
      <table anchor="tab-framework-caps">
        <name>Framework Capability Categories</name>
        <thead>
          <tr>
            <th align="left">Capability Category</th>
            <th align="left">Description</th>
            <th align="left">Relevance to Fabric Benchmarking</th>
          </tr>
        </thead>
        <tbody>
          <tr>
            <td align="left">Disaggregated Prefill/Decode (PD)</td>
            <td align="left">Physical separation of prefill and decode execution across different accelerator pools</td>
            <td align="left">Determines whether DUT-PD topology tests apply (Section 6)</td>
          </tr>
          <tr>
            <td align="left">KV Cache Transfer Protocol</td>
            <td align="left">Protocol and library used for prefill-to-decode KV state transfer (one-sided PUT, two-sided SEND/RECV, GPU-initiated)</td>
            <td align="left">Determines RDMA verb types under test and applicable frame formats (Appendix C)</td>
          </tr>
          <tr>
            <td align="left">MoE Expert Parallelism (EP) Support</td>
            <td align="left">Distribution of MoE expert sub-networks across GPUs and AllToAll dispatch mode support</td>
            <td align="left">Determines whether MoE EP tests apply (Section 7)</td>
          </tr>
          <tr>
            <td align="left">Continuous Batching</td>
            <td align="left">Dynamic request admission to active inference batches</td>
            <td align="left">Affects request arrival rate distributions and load balancing tests (Section 9)</td>
          </tr>
          <tr>
            <td align="left">Prefix / KV Cache Sharing</td>
            <td align="left">Reuse of KV cache segments for requests with common prefixes</td>
            <td align="left">Determines applicability of prefix cache hit rate test (Section 9.2)</td>
          </tr>
          <tr>
            <td align="left">RDMA Transport Support</td>
            <td align="left">Underlying transport(s) supported: RoCEv2, UET, or other</td>
            <td align="left">Must be documented; affects congestion management test interpretation (Section 8)</td>
          </tr>
          <tr>
            <td align="left">GPU-Initiated Networking (GIN) Support</td>
            <td align="left">Ability for GPU threads to directly initiate RDMA operations without CPU involvement</td>
            <td align="left">Affects RDMA primitive choice in MoE dispatch tests (Section 7)</td>
          </tr>
          <tr>
            <td align="left">Kubernetes / Orchestration Integration</td>
            <td align="left">Native support for container-based deployment and horizontal scaling</td>
            <td align="left">Relevant for autoscaling tests (Section 12.2)</td>
          </tr>
          <tr>
            <td align="left">Maximum Reported Scale</td>
            <td align="left">Maximum cluster scale at which the framework has been validated</td>
            <td align="left">Documents applicability of fabric scale tests</td>
          </tr>
        </tbody>
      </table>
      <t>NOTE: Implementers <bcp14>MUST</bcp14> document the specific framework name, version, and
configuration in all test reports. Results obtained with different frameworks
are not directly comparable; framework identity is a required reporting
parameter per <xref target="reporting"/>.</t>
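      <t>As a convenience for test reports, the capability categories above can be
summarized in a structured record. The sketch below is illustrative only: the
field names and values are hypothetical, not a normative schema, and any
equivalent structured representation may be used.</t>
      <artwork><![CDATA[
# Illustrative (non-normative) capability record for a SUT-E test report.
# All field names and values are hypothetical examples.
framework_report = {
    "framework_name": "example-serving-framework",  # hypothetical value
    "framework_version": "0.0.0",
    "disaggregated_pd": True,          # prefill/decode disaggregation
    "kv_transfer_protocol": "one-sided PUT",
    "moe_ep_support": True,
    "continuous_batching": True,
    "prefix_kv_sharing": True,
    "rdma_transports": ["RoCEv2"],
    "gin_support": False,              # GPU-initiated networking
    "orchestration": "Kubernetes",
    "max_reported_scale_gpus": 96,
}
print(len(framework_report))  # 11 documented capability fields
]]></artwork>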
    </section>
    <section anchor="kv-cache-transfer-frame-format">
      <name>KV Cache Transfer Frame Format</name>
      <t>This appendix defines the reference frame format for KV cache transfer
benchmarking over RoCEv2. The format follows the standard RoCEv2
encapsulation with one-sided RDMA WRITE (PUT) operations.</t>
      <figure anchor="fig-roce-frame">
        <name>RoCEv2 KV Cache Transfer Frame (One-Sided RDMA WRITE)</name>
        <artwork><![CDATA[
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Destination MAC Address (bytes 0-3)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Destination MAC (bytes 4-5)  |  Source MAC Address (0-1)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Source MAC Address (bytes 2-5)                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |   EtherType = 0x0800/0x86DD   |  DSCP  |ECN|       ...        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                IPv4/IPv6 Header (20 or 40 bytes)              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       Src Port (entropy)      |  Dst Port = 4791 (RoCEv2)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          UDP Length           |          UDP Checksum         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | OpCode=RDMA_WRITE(0x0A) |SE|M| Pad |TVer|        PKey         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    Reserved   |         Destination QP Number (24 bits)       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |A|   Reserved  |     Packet Sequence Number (PSN, 24 bits)     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                 RETH: Virtual Address (64 bits)               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Remote Key (R_Key, 32 bits)                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     DMA Length (32 bits)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |           KV Cache Payload (variable, up to MTU)              |
   |            (key/value attention state data)                   |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        ICRC (4 bytes)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
      </figure>
      <t>Notes: The UDP Source Port <bcp14>SHOULD</bcp14> use entropy-based values for ECMP load
distribution. The RETH (RDMA Extended Transport Header) carries the remote
virtual address, remote key, and DMA length for the one-sided WRITE operation.
For KV cache transfers, the DMA Length field indicates the size of the KV cache
block being transferred. The typical MTU for RoCEv2 is 4096 bytes; larger KV
cache blocks (e.g., 64 KB pages) are segmented into multiple packets by the
NIC. For PUT-with-signal operations, the final packet of the transfer carries
Immediate Data to signal completion to the decode worker (OpCode 0x09, RDMA
WRITE Last with Immediate, for segmented transfers; OpCode 0x0B, RDMA WRITE
Only with Immediate, for single-packet transfers).</t>
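      <t>The segmentation behavior described above determines per-block packet counts
and wire overhead. The following sketch illustrates the calculation; the helper
name is hypothetical, and the per-packet header sizes are approximated from the
frame diagram (IPv4 case, RETH counted once per WRITE).</t>
      <artwork><![CDATA[
import math

# Approximate per-packet header sizes from the frame diagram (IPv4 case):
# Ethernet 14 B, IPv4 20 B, UDP 8 B, BTH 12 B, ICRC 4 B; the 16 B RETH
# is carried only on the first packet of an RDMA WRITE.
ETH, IPV4, UDP, BTH, RETH, ICRC = 14, 20, 8, 12, 16, 4

def kv_block_packets(block_bytes, mtu=4096):
    """Return (packet count, payload efficiency) for one KV cache block
    segmented by the NIC at the given RoCEv2 MTU."""
    pkts = math.ceil(block_bytes / mtu)
    wire_bytes = block_bytes + pkts * (ETH + IPV4 + UDP + BTH + ICRC) + RETH
    return pkts, block_bytes / wire_bytes

# A 64 KB KV cache page at the typical 4096-byte MTU:
pkts, eff = kv_block_packets(64 * 1024)
print(pkts, round(eff, 3))  # 16 packets, ~0.986 payload efficiency
]]></artwork>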
    </section>
    <section anchor="moe-alltoall-communication-pattern">
      <name>MoE AllToAll Communication Pattern</name>
      <t>This appendix describes the AllToAll communication pattern used for MoE expert
parallelism dispatch and its fabric-level traffic characteristics. In a
Mixture-of-Experts model with M total experts distributed across N GPUs (each
GPU holds M/N experts), a single MoE layer forward pass generates an AllToAll
communication pattern where each GPU sends a variable-size payload to every
other GPU.</t>
      <table anchor="tab-moe-dispatch">
        <name>MoE Dispatch Traffic Characteristics by Mode</name>
        <thead>
          <tr>
            <th align="left">Parameter</th>
            <th align="left">Normal Dispatch (Prefill)</th>
            <th align="left">Low-Latency Dispatch (Decode)</th>
          </tr>
        </thead>
        <tbody>
          <tr>
            <td align="left">Batch Size</td>
            <td align="left">128 - 512 tokens</td>
            <td align="left">1 - 16 tokens</td>
          </tr>
          <tr>
            <td align="left">Payload per GPU pair</td>
            <td align="left">Variable (depends on routing)</td>
            <td align="left">Fixed (padded to max)</td>
          </tr>
          <tr>
            <td align="left">Shape Compatibility</td>
            <td align="left">Dynamic (symbolic)</td>
            <td align="left">Static (CUDA Graph)</td>
          </tr>
          <tr>
            <td align="left">QP Parallelism</td>
            <td align="left">24 QPs per connection</td>
            <td align="left">8 - 16 QPs per connection</td>
          </tr>
          <tr>
            <td align="left">RDMA Primitive</td>
            <td align="left">Two-sided SEND/RECV or one-sided PUT</td>
            <td align="left">One-sided PUT (GPU-direct RDMA, GIN)</td>
          </tr>
          <tr>
            <td align="left">GPU Initiation</td>
            <td align="left">CPU-initiated or GIN</td>
            <td align="left">GIN (device-initiated, GPU-to-NIC direct)</td>
          </tr>
          <tr>
            <td align="left">Typical per-dispatch size</td>
            <td align="left">1 - 10 MB aggregate</td>
            <td align="left">10 KB - 1 MB aggregate</td>
          </tr>
          <tr>
            <td align="left">Dispatch Frequency</td>
            <td align="left">Once per MoE layer (prefill)</td>
            <td align="left">Once per MoE layer per token (decode)</td>
          </tr>
          <tr>
            <td align="left">Latency Target</td>
            <td align="left">&lt; 1 ms per dispatch</td>
            <td align="left">&lt; 200 us per dispatch</td>
          </tr>
        </tbody>
      </table>
      <t>For a representative large-scale MoE configuration (M3: E=256, k=2, H_model=7168, EP=96 across 12 nodes, BF16), the inter-node traffic per MoE layer dispatch, computed as T_dispatch = (B * k * H_model * 2) / N (2 bytes per BF16 element), is approximately:</t>
      <ul spacing="normal">
        <li>
          <t>Normal Dispatch (prefill, batch=256): 256 * 2 * 7168 * 2 bytes / 96 GPUs
= ~76 KB per GPU pair, ~870 MB aggregate across all pairs.</t>
        </li>
        <li>
          <t>Low-Latency Dispatch (decode, batch=8): 8 * 2 * 7168 * 2 bytes / 96 GPUs
= ~2.4 KB per GPU pair, ~27 MB aggregate.</t>
        </li>
      </ul>
      <t>With 61 MoE layers and a decode iteration time target of ~30 ms, the decode
phase requires 61 AllToAll dispatches within each 30 ms iteration, yielding
~2,000 dispatches per second and consuming approximately 54 GB/s aggregate
inter-node bandwidth on the Low-Latency Dispatch path.</t>
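      <t>The per-pair payload figures above follow directly from the T_dispatch
formula; the short sketch below (hypothetical function name) reproduces them.
The aggregate figures depend on additional assumptions not modeled here.</t>
      <artwork><![CDATA[
def t_dispatch_bytes(batch, k, h_model, n_gpus, p_bytes=2):
    """Per-GPU-pair payload for one MoE AllToAll dispatch:
    T_dispatch = (B * k * H_model * P_bytes) / N, P_bytes = 2 for BF16."""
    return batch * k * h_model * p_bytes / n_gpus

# M3 configuration: k=2, H_model=7168, EP=96
prefill = t_dispatch_bytes(256, 2, 7168, 96)  # Normal Dispatch, batch=256
decode = t_dispatch_bytes(8, 2, 7168, 96)     # Low-Latency Dispatch, batch=8
print(round(prefill / 1e3, 1), round(decode / 1e3, 1))  # ~76.5 KB, ~2.4 KB
]]></artwork>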
    </section>
    <section anchor="model-architecture-parameters">
      <name>Model Architecture Parameters</name>
      <t>This appendix provides a sample calculation for the S_KV formula defined
earlier, based on a 70B-parameter model at FP16 with a 4K context length.</t>
      <artwork><![CDATA[
Parameter                      Symbol   Value   Source
Transformer layers             L        80      Published architecture
KV attention heads (GQA-8)     H_kv     8       H_total=64 / GQA_ratio=8
Per-head dimension             D        128     model_dim(8192) / H_total(64)
Context length                 C        4,096   Given
Precision                      P_bytes  2       FP16 = 2 bytes/element

Step-by-Step Calculation

S_KV = 2  ×  L  ×  H_kv  ×   D   ×    C    × P_bytes

 = 2  ×  80  ×   8   ×  128  ×  4,096  ×    2

Step 1:  2  × 80           =         160   (K + V tensors × layers)

Step 2:  160 × 8           =       1,280   (× KV heads)

Step 3:  1,280 × 128       =     163,840   (× head dimension)

Step 4:  163,840 × 4,096   = 671,088,640   (× context tokens)

Step 5:  671,088,640 × 2   = 1,342,177,280 bytes
]]></artwork>
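      <t>The step-by-step arithmetic above can be reproduced programmatically. The
sketch below (hypothetical function name) evaluates the same S_KV formula with
the table's parameter values.</t>
      <artwork><![CDATA[
def s_kv_bytes(layers, kv_heads, head_dim, context, p_bytes=2):
    """Per-request KV cache size: S_KV = 2 * L * H_kv * D * C * P_bytes.
    The leading factor of 2 accounts for separate key and value tensors."""
    return 2 * layers * kv_heads * head_dim * context * p_bytes

# 70B-class model at FP16 with 4K context (values from the table above)
size = s_kv_bytes(layers=80, kv_heads=8, head_dim=128, context=4096)
print(size, size / 2**30)  # 1342177280 bytes = 1.25 GiB
]]></artwork>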
    </section>
    <section numbered="false" anchor="acknowledgements">
      <name>Acknowledgements</name>
      <t>Contributions and review are solicited from the BMWG mailing list
(bmwg@ietf.org) and the broader AI networking community. The BMWG chairs and
Area Director are identified at https://datatracker.ietf.org/group/bmwg/about/.</t>
    </section>
  </back>
  <!-- ##markdown-source:
KyE0GRuPSGeYwmNXRSOGzVnfPSbYvk7/a21v/pY3W+y801U4gBz614O19+eD
I3O+0Jg8HuIateaEbGrYk6b5p6P/Rt8WIqyle6AKWAdQEOkfBhiaV7+xTmBU
28I9F+tnJvPDMlvCt5ZLCLmtHieV0KaFLGBN66uyrDT4wHoJHF6GBYnvzUsa
WswuBfRrmGzu5YbX2fS5+bKx/UDJSytWQMbDlmJgfyJuBb+jw1BHej4T4mru
i2VtZTa04kM96MyFg5RJQ7z9o02XGqPupRyeyISuTZS3kzY2S+qR+DDOPwVl
R30p0TJ5qHqo9BESc7DUCNX4AdvrplPnAmXCBbwh/HXZSkrnl9jwzQ2XdQiK
UHlWJml7YnVXm4OtpiA6Th2mXnpKkm+/da1vOLCvcN3ARIqe1KS+x6SNxG6I
k3kqAaBAGHZr2u9jOz1/Lcwe2wQJzRWjRpStRgNhptaHkfmxVJSV8EOQvh8G
bYciYZRskJVAGrDY/nqkOMjEbEE+FEl8NBRR6v+JlywRL9kQ5Aha7khrQ1Vk
Pr7Ldsny7jX/Us5K06jjXbMtDjddn89hY4dbhz1jb/c7ZsOc0M8t/rmxzb+2
+a9EHHLWHacOJFc4VujyBslgWigMIbLhvDq2Gj4J+YskxAdmsYoU7BnSeGrs
4LVPGhIP3va6an+iCXJR615CWleWiqq0QbPvmB3MvmOebWL6OrfNnbCTzg8D
BZ5m9HvhLPFRBdw0rMAvxppA6wfenNFE5lE0YAsiKMGiX6mIV2jZWocincRH
Hlhzjw+x0Bo/HUcYndO4uL5m4hIMzqvgrmnOiT1ncMCUGTk/18WcxYCLvDOu
VmIqIlViA9Z41WeuIvgRG7xl6WxEDdK/XIwbV9GXbCkOo+fy4RKPHuFpa1zc
SPnLtm1A9O+GUKKshlgKUO09f+V7esm5S6Bjj4wrfmXdwCh34o3yPBM0KtGC
nY5ZAbr4MuGxLAD11IyECjbUoxEl52maZ7WhuFz/A9obkWxHsIwnWMrnYUfm
kinYf18fv07OPpS2KmmJiuoW56pbXgrjfN7b6LjoR1vhOFRt4qk3EK+es6rl
gevLNQ3T+jG5LyOMxznKoIhpqZ62oGBx4FmEE1HCDuxjQRRV67oplJvtLIin
s5lYC06URfBruMr/gRzkAmLclgnUl1PAOwsh7ogoPBioPYRyGQ4cwkTl4YZH
Ucz+ho2qYWGKzed/RDF++vGCPrF3UwUr22NCnXQoDxTFUkSpJUzjLQ0X6s1j
CeluJmGsmLgw14crEVfqEH89yZG2A1yxFpF79MzGpififGYRjhFhp1BmX/ra
sV6YqXlKz9dJnIGDxk69nKFyUVspUlKjSLQfdEhtTZLfTNgkgJLZtTgdJb8u
5nTBJvNpZUerrQnrIG2XrlAlgqVYyU59EmX7B0ek4RQLKQAsz4VBFUz0ymws
WUiVBmd+ATEx63sW2FCiRKLxmuhaWGmG24elj1gPAydgohWOtdOOt9IGvFgJ
ABihHmo/kqon6HRXZg3Uzgtvp01DVmQzGc9GRZ0S1UfQpI0Y0kheL6jOdOuj
DC8bxU10czMQNhIrbEDQ4Ne0beYS/WHTdsRPqXWemprAXcvBtRjFdcpmUMI4
0OOO6TGdMRL9l/Dgs3dEyx55Gub2JgxaFzvJvMvTcXhYEtxi8DaTiMaWvFbe
ccNyZW0X+pm0ltG3OL32xfLVv6GewBF9x4o2aXbZFyu1605h1rohSahER+IV
e9PRbcfl3j1FlxIvCknJF7p6+pupAlhjl2UXL+0QdrVChGSMiCWdTjNU9Z5x
dr1QcUfqxYXg03vmqPpoZSJe/b8dHeuKnEh0Gsw7nU4LaPhKDaSQCKfQn2st
OkcMgtLQEsPzFFV4lA6EypjIvWu2+t2t9j1fcfE682b/BCfRfmwFlWchfRDW
+LsSzWWBwH1wtkd3q4SsQjhfCwebZ/qtjSDScU4/nmRr5+cHpoWceNQF5DQy
qDZ1muMUHITp8M56EapiUG/SbEM+wKUFIo4OChCqXwEDRylR8/dsXnSHxWyV
eKoh5MzTq9brg3MeGstieGm9TxbfXLwJA7X9yKGvS+Hh3FsqK7xsB7VnCSIJ
H3Y+uMHKen5QqUMQji7VoLkYCGic7JjmAHHp0t+Lic+ReW9nn7j9VCujY/Cm
QkerbDxaaN0otblbTVEOwhdBwULb1KORjS72O8peXnwuJOsT9kYppcKpyr6K
gLjqUKhYdOywVoGWsPWqoAvKD4PfkqCDk2eG/0LwVHMkKgFqEKoI2tiERg9F
Um0CbhXQfO4L34Vh+LV2gEK9gpD6GGicYNVAruwhlbBIoIKNdZdCDWFQbENR
dFcSPRS1OrbvmSpdDV3hHjc5NTeX0vj/Fns6tYNQ6ivYSfV3OTYqj3BJdjGY
bGwKvdQKJyQZnC+vYHppSHpizTxautCgTRJrnm+QTLOxvklSzeb6Fl3YWn+5
TQLPxstNNapsP9vZUpe5N/zEw0XqpRafcnGpNguOMJJb4BS75uLS1ay3Pe9t
BjuYNtHui0vfJsYhl62UH7b06wShmAt1U2ufs4tLOVGX0E9JgeR4pCA2Rt71
CG2UCJ+J0sR8UVYaEka6lKpRSVWJ0h30C1qTYV0Okfjn2s2kLKlRxK8XgKxi
Nstspc1Etus3U1VOHoKwjFLiiBtsQK0GS4+HY2Th0W70NJgHLKE9h8g6DAlR
QYAcbSuXTLalIn1MqcYMsn0UZexQG0sEfJ620BbfXcG8D+jj0yYf2+CgrtTY
qi+66XxUkxs2zXM8oOt7UG9pEIQ/cKC07aBxtarivYvnbRZbUq1EXu3hI20b
Tp1ug5ZvCOrk8J87GtbTm3LXbAw2Ng6IJgw21g+E8BBpGOzQz+3BNv3cGWwd
gHAMNvGL7j4IiAMP0kFWJRuDG6JAaPVZOkE0t6REprZxuuu4h2kJSUx8Uc+H
kxrsspDQoM+NiV++2uity+r+9p9M70DY/vZftsSi16cYR9kegtA6/uCjplSW
9OE6ia0PERg0HlGsxmOJ+bbVUp0sFZD4NIiuw+J7iWZ8Ct4OiMUTnkW9NVqo
FsaN7LW+fhjo9fk2H7sAm8Quz8dd2pQBe5Gu/YO2YDpjGS1UmB4TBuRtydWg
h5D2fuT1W3ttAIc6+bD9gdkFT2sH9YDWjDWtYQnIRCRNSrLb0BkFKlNbKKXA
Der/lEU/t4QE5Tu4pL7VlN6E1Zmi2iVR0lsTfdBw4SwUD6yYcy3tz8MCXkEX
Su1JIqUsnJ30jnhHFIFVF1UasC8iBSyexplyYfxNbEvi9CFbfkYLEmmhHfoD
Ns2SK+Z0xOAc9gR87MktOYkHg0tXpuTVJjQ0OECJkg86KGpNnw4G1XHco682
wxE3G0bcMi0Gi4QAy1APajsP2KdEa0KRrlBjd37txJpBHCvruAMkAgKLAlYI
8Ae8ukkcjSqLFOf8QTaDw9uJxF/jgLDCFmx0MG1XqpA5ua+lESsqW0kfwqma
sDn+ocFAJd7hnK01Qcarzz2uo9p7JOcsr6QPfVxhCBW1iIMJJ9DgfJ0dGqAw
LUh8cT7nsbYcztHnJ/QNmXXijRVIqSM2xpetzZY/j7xQzwqyXL1JZx0fZmS7
ibqwhhF6PIx6CF4Wclz4JUvUjTJ8tjL21r+YVpkulhz2JLmOdI0kG7Pe23z+
RaxxNvD/IVz1U69ICzxF9otreFR1uhwrUI1H98b+w/7pGqK0OHLAZlRWkoKT
5gU2KL7PdjnEr6Ejkdd4zU9/8HlvSdLQekmaLgUdlZqbLomBicN+mV+iUkOi
YcOP9FZCnZ4H2ir9kxTgsKFqU2NWwTXuv1BRjIUouNnZPjW/Ke6iHlRZ6bdD
O5U0BGM2OiXr5/y17cKXom0D3Y2eAFGrHBcaI+YqJoHc5YFbgNno8KtshNPu
tsxuk3N0cLnGRTHrfnLlNCxy9mzBN3nhLF3x+2bapA6/fW8qLlJM+oK9+5Vp
7ZtvzSf6/82lJOh8a6sUt82aOe3pxPfpVumhII5sFRc79OirysT4/HQSO94r
c5sTS5uGTR5tHeSwerdWRkZJ7leQxE/p28OBFIHllwL6/XRaTFmhwZr43J2g
SeWXb79FNKCIJY+UR2yuCmgOSVgWkLe/6v5PAACtGRF7dp33JoBra/8VC9dY
DS3l1cttCSdrTCx66p/5lc+F9/vPT44icYkbvxZ6ApGdX3n/ZvAZBha92oMZ
mHMgGh/DBDcbv3ryhdtbv+7+reDzi43tHZ3gS/PkBJ/9tgnC3v2r7m+c4MbW
0xP8VZD47RPcCT77CT7feXqCz3/bBFu5z5rs2lYVKLA291qn6/2KaqXR0/VP
D71IIjavxt1JkXW5R+4XG7X5IH1C2ObTbkYvX4VexlpKG/yISSxge+dOPhpr
VvTnNF/Y7gBuDB8EUjGcJb8hiOo368e39JpJOtOI+4jKVyxiAdOp+hEbQ6bU
5z4sxigBbFmONoLSgvA+8xl2sloAlIkSmiQ4WOJRZfFwoMDS8DnThuo9oGwU
lXJIjLFjWYPjdxrcZhO5NStXpoHOg5wNfnzNmx5UyAiTvsIF7y/L/R+5hKgH
LvoHltIWsyFhRHKa0/FNcXktbS2iJn85kt67G+21qXL9qY0S9CqizSDpuUwx
nsbjeWIi0tkA7xOorqk4MAQp+Uqf2yvlpaJVze0q3ZcaivO4vpd2uInrx2pl
EltgTPrq6M4OuUhlpd8jP9bUJxGNIPhZxWlf6J/bOmEoF+0DM086hjQrvj07
DemrCJXop59AQPQ6l1gvGUx0nEIo2SINIt7wpd9GG+PSZbaGj2rZv1kyqYkV
NSFDykHZHoAXLCj+mnkfuuaeottwqf1PPS41CwWqHEpZBSsx35sj2Ho7tqBL
gwZwBxenJE2ojta/LVBhroVWgV89sUhEV/Ppd4Z+cIWpsB4fsoGKadfOKHJ9
fyelUMyb9++6HBmp6fo5+hcMZZYCNSaV3D7XHJA+93WzhCFCI/IshJDIw9X/
y9LnIL8LOmgKbnxHEAyi7YPoPp7T3vILTZHTi3H0vn5j8a501FXQjFR3gD7q
0gPGGJFglqXzbnUbv9Nmn7cFSg8UC0ME41PAkqMTZbkyeHHjsWrkytbAUmW7
trJYWM0tpDF8uL0r1yGe1W8j43yjIUnLwzRQBtNzLFUaXTjTafj2MvSACUPI
g7Ybwm9skRCpjHNbIU624qZQbGmwXDdkfG3MXP3wRWS7rLB+DZu7ylZFmOQg
ZTM/C1drhyWZgzAZB8y4RXRFYcfEhWu5NzM8g/qc3Hm4dTh4tcN2Kn0tX7Hh
9fAKI+qN1BbXZFrC4rYlMs62k5boFet/CkUZdja7dvbTUcw74rBl39Imki5Z
clPxU5DGy6PgJZDuKpGSkFy8vKP1M23ZSyeQPBYcwy60RoG0E+5G3U4cevRi
p3HYPBRm/nBTrS/YymSJVE6RPhOZNqsbrnZlw3yliTW+cOovfGtap2s7bY2h
qGN0KoUPK7GhfWcffszqzAanJVdhkaamlQBewp7aQXDhIGljxUXXjQalXDiN
Qjo7uQj3MDmvwf7MKUS33KzSlkMsczilUvblEIZrbCZBLm3H8ZlJNT4zrXgs
OVlOnIs2frspbFsC+FpXbZ/C1QQJrblXMVSFTdQfi0io5pbD1wBsFLtuQ442
IhL4BtvgUFsLjVySJSKE6BPHhoZuEu8soLmS2LdYqbKRqJ/uOngc8/B9iyo+
ELu7wzFaxNt3qBU6iazQmobs391gcN7aBaJaHnLiW8KGEVbHtVxUF4gkzETJ
jGXCQhLy0PvRSXy7Fth+WxMmaw0RvQL8YItjApBYM3Buk+5QfvZE6zdJMV0/
3+Mp7UWjzZeEG/EGp4uogG1QwEiqss2ls6/vz0DHKpnYnsPV+seVk+IOZbUy
ftD0Ke7Sw6W3QMsVxA3hlD4Y6qT2/taJ9CQMc7rAi5KHJsZRk49Nr1bPDxEf
QQLZA8ADl1pf5/yK5/p7w8VnPufoTM0mS3LbUiSuWKyZ/C1ObH886jKLZiFE
5jrI1dccfU3Zl2plCerhLOy3OZpxc2YLulQh4GHhxuMbtSZsQ/xxgkLU2Z0q
/xlTvK+1gMCDedSnwyYFiBV/9zmltM5TyieFpkq9gLDeqaTAFrNUQyhZKE+u
3Kuqocj2hGsbZy4U0xR594hn46FSC7vm9FIccoHT2Gd1tey3r2I07vjml1ir
jfdFCRRsk+u+pu0Hi+vk4tIWSGkBGZclkHCCn4jQaAcuSADOYa96HoG1OAkb
jycD4VGBtngvVNCRbbAOUhFzEqQpwlPH+XVOZxtlxLgg/iNkC/K/FNmLSD1Y
3dLSQleO0wG2koXmclcsZbeFB2xbDRBH9mExGVad/ilBZRg8bHPdDvo/0NFD
axDUyvX6SKI1B9oihzwspii9b/C26Q0siyQVCaGOdYdI9Cd57DYSOOrvpfO5
E+aK+QAw4hbaivHxCaktsUGWOnkMWNL5m0u5stvPtTt3dCNQq9AfbZ5xk4nR
I7hXewnHT7DqwYdBpKtyQWi36rI1khM36QSE18J6L74Ju0vfjtGwo2SUVe3E
yxh2HoKhTYkqmlEYvTmAgqOGyKwQqfqA7uSUzzNn1XiKlwcxQ6MiQ0ukhWGr
PqEFBi95cFi1dezEpRuqnSIdAexovjduqMIReJorCKilIB9rHtpx9VvC2ZDu
SjIhiblw1SKzSFyh+Xy4HKdzWyLFxp8jil5slk4IsaVDesl5zOXzKQD1yOEr
pGa3xK8wh1zcoti51BxNOE3UAS3gHh0HPrVjZGzFXAnx8c4BR/6sl8Cpqqjl
p0mNj+A30Q4mhcHODZlhjExrlZVr06Ltp+K+S/x31pkv1Djkf8LmW946fVvM
Sv9EwnkjgcncU3IXsBStmq83SNjPd6N2gLabBSxZZp8tWWwNeawxbEPPDmt4
GakLvBTst2VREyYltjVCtcsHHmwd9k8GHY7SGkO60UNdzubpqm3zuLVDYXX+
h8Cd9PFwKg5FiMvCXAd9PWz0ooyoCYaJGBL8VyIxcf/bsiKHOpzWajbcLzS1
HYfCygSBjYxZkXQ2CaOnHs83qEyIIVtrUCJpIiObOJA8EqabStCmisgQjyGP
rGvmEh0LLvvWU59GlhDConi1lpacF1f51L3WxRI+JR8PC2vxYJGXe4sKNvc/
toMigALdxJYI8kmDnLikxYDxiQsCu01AmxUb2JTVA3zZCuFgpobP6mKCgL4v
ingi/r7JF1yY6Cl0q5pYiUPfcWSMFdBdGRfbRUZ7yNj81ijHg82BNor0i/aX
8X4Eq3DxsUvHSKZeJSw8CrjDYiS+WZ02RH0sGMhiSyX6Z8Cc3M1QTDqpTjKR
SeKOd9Yw2BpAE4wS5ZEk/x3dgevbtWyVdh3lUME5BFiwVQ9KJLzQ21yDzlt/
bNdt34ZjVgX96HuJpw6uOQKLEGQSi+dEZaX39h+D1LkWka93sD9NrrQdgbUH
nAxEelhN0wm9Nya1NRvBI8Kvi0R+jJLGVE7Uqoflhkhf0+CsiWxdpXEDB9Cy
HaycaP09J4ryd3U0cjUZqhm/6gpxVbDEdNdQBEks2DITjeBlPetPtCMwxz1g
hZPHZJ7yWPJn1pvw2LBtKtYelfltu6GHsenfjo49D6+tipHLqtr4Nql9ayu1
PBRiapHJiWCpFiFI/AbD5z0nijAR/GoSbTFO3AszKAT3VNQxjHbjKGPeR242
xBQnjH/Wbx13zqphxI+cWX15UK3G1cK1V3Hdn7ZPMkke5l6+YVtYJNbN0jbo
FEaAdNSIiXDRbsmpoMP9+M4zVB8e2FQHToKBQ95H+LM2yWtb7pJiqxLb9q6r
0NJsBbX2YDVczbQ0athj4H+W6dg3CagExtq2CFGALKQ37VQfRctWi1X7NM+A
yoQN3yWsXqjcRy2CM5BUn3eSr/ikGekr8j2DNM7Ez7frs2TrSofr3dvgWxO8
U4ujj7mfFqGJiFPRhTnHCZiJTcBs4mtPZWFGSUu2PpnmWP7K7LmS+Td9zZQ/
ShdhzR+mTmk8wlq9Czo9PrAO7hOOByXYXCPh6Nf8uzfnlwTzv5itt2a4+BJf
e7bZcJEgFlytxXH82tDRp55tHK/5JVIy8uh1dy9aH9gLnc1XIFdvLj/dIZnm
9Q+ocHDgQ2abYYNUAfN6P7q22VuPL3HkYf2im81+eONJPsJcdtYfm4u6ut6N
00nafWZerO8jdHCj96zyDhSd7D2vXdxqmqJOph/e+I4Z9smbPUwJyM5T2t4y
Lbr2KHzozZvV6dyblzu97dpsvofO/sBsDiqgMcenFzuPwgc3mBbRyekCklBb
n13vbb+ozuZ5b6u+UZsbNYC52RyS0EEMrJgA2sf1yM5dQzO7vLzUqfGnA/k1
wC/3Eu1FPwpfXL/WfDEozwmzfyXa88zVWoubwD0d8ik6WJ0UxSnR1dJUieN7
Een86gxo2M+dHyNOUn4qkFNjITnvdGVto8GSifq15LCruSWRTeSKZEoufQls
zUbGct7J/tHedUxfk8Y1O4CThZGlLIl76cLlvbJE5WwHQTiVD5N5KPK0Wp/v
H0vVLusV+WjHEltBjDcmyNx33DOYHjI+p5pNfsuRWvPsxilCyKerFqWReaEQ
2CMpcZEsICEVEkbXWFWL0/k1u1bN9F207uEIrbDkmNqKg1p00nn9IWnAeSAR
mNpV4c6Ko0pZweDjrhjtxJrm5iOtnaI9SYDGfAIypM+xGk8POOt7oBjZKnaJ
DYCA6qOxcLao1XNxIsQhC6zs3GbpXU4CfeUBUr6T4AEzGy/LULqJnd6PiMbY
13rlOa46F9V/8qIuKzTOzh0Jx97+qXaD7MttfiXB33jR93zyJ6jit4B2J+0R
2hIi5OmGt2u7qnXrnXV6EmOUKc6u84DqLKolSazMLZKrVloirYxAAw3+MS2q
JgiLpc8ZyDUhDrWD8CmBcSPlfkeNoTF1jJSmRVw3hgeZ65jGj1QE0XJh77sE
r4gqTOpLLJqH7ZTZUObH9NgxXj2CENDD7ENNufdWCQ4jjBKNQyGEl4TzhYBa
FBbT1bMXX0ZwVNN1822CotmypXYfz32gBpamRam9T7LR4OKceGIKHktG9W0+
c+6oeuSjtP5NmxQiIjREr0cPRPQdi3tEPZVjDrOrlZtU0q2vZfS+AzLBR5wQ
DaClvXwekQIjjrcgL9Xsad0jrWxYPlRdJuHoAQn5k+MUqNthc8KnCq3Ywo+2
KEG1TW19oQnPrWeiQgk0w7GWdJZ6gOlC0xksgK+I4Uy1SzNAnXHVUW6tzVgb
0pqGTTGaS0DjejOl1vNMKnUhG1T1F7thQ5IHYpYCH45N9LSJHuWiW3NS8DkV
HwypgrZZmWaWuviPC6GWcPqdsxvrtxT09uEk3EqYzdRSawv1JhLb0CVs5tKE
xVKN0ITek6gxjJZrlHYymiYq2ylulyxnJ5tDHEkhL21xC9oei0b+G1S4cGgt
NhnF0yQs9eFaNnWsQ0ADaYIeNWjX8Bg+PzFcM8GzAQOhRVDdwwTGLje+iXJi
IkytP5z4h33XHJa8PmWrIJw0MCv9weyzsfYcYpSGPfPi+z5kct/WbJF0iqdc
HK7tYeCf91TEBwl3kiAs80rfETXdrgX6POYEC46ELbTFIqytsWwx20+gTFqN
lcVJ9WsHFc+DGdt1igQWiSbJMCrCWrFZi3Tmj1LN2wAX4nKCVHJIZ9VCNC1S
PS0K9Uge4fPXAD31+EJU/FpkdYhZm1IY2YzMiXCFPhTyuglMUo5nKh1n6FVN
cxXssx44EagG8yybzJzEfqgeu6cwzg7CiRUsVdUZMM1zmAXhv84bWPg6NTP3
+q8peaFIptH8Dx9awpVFsrEBzrHBvq4N5iLqSbWtNpQKMiMh3swsZGOjt9Gu
VMTg4k3XSc2rWXYallF6ucpXLrDQcs6sR1DFEbDIeUqKjX2rJHH6bePyHq1g
BnzD5xStyh9HsWAj1zyGJrYcniNeTfELO7ua+YCB9paLorQJHPWQBa2fh9AU
eAz4sZF44TjVjXjPRNokKji1FEjYqkde9g4Fu+QE+Nf0dfivZ7WeNurUtOeD
lJ3Q7glJXKohLvb0cO/XegDQvLCFc2itpUsXsa+W9o5iVEnHlYbPVp+V7A/2
uqEyR9vya+8PsFKmLT22vrllH4KJO+HHPG+WeQi7bwXli5AKJfLOZw6U0YPQ
UHyysZRnpzlGrprvFHUeSqrhah0XyCB7IRQObwmKWTqx93GHESQK6RFdX3hF
qpVvuCZcojm/gTjEdiP0rlOBlttrssq3uE2nBO8/+mL9upfd2KBlgzIEr60b
Ojg7sDRJP070Xn2KBHN0kaLeyOUeWD5eGK7JR+ooyIOeNpfnwt0IwvbIVi0l
ysl1Q9ZQEfguLBdU2J6niToLG0LQzxepWPsUpzWw0ckICxYItX8OvgfEu8uZ
mA8ssssMzJY1B9nEp5NQEvCIVEEdcH18SngOTbfZtrl1VxRt0jT7bI8NxAIN
0UzmgjthjGJFRrSB77KmESxwYhR5BDkfWISGOvAsu1ElTvmGTVk+eDSYE1sV
/SwIsrYQbTwldzqmqySI28Qm+04A+aesDK1zFnl4RoGxBo3l4PkeAzPDSGJf
cetcFaen5FhXzi2sUX4tQ7MtxWGtqyIbpf9NbZxNPciWBljgtF4vCZn54WGD
jfFAlut1sVrvqSBSOBA+AnLY4Ql7M2W4lGIq2JtyLkOXYDzVsAzXVC/MlgIH
VjnW8L3BE4KfqR03tJ/KASWmgJckEgYayDQ+eXS+9JJfJVYzwGjJvUAchd2J
x6LK6qNXQiSbsV2ifUITQD2GGHea9Iqm6Wgp182WooROuGqIPg6iPhQdGA3c
mjlIs+bzSL6ieIWCxEWKNIhKL0lUKrScmOtTlyTvYL6eL6eaG1VFM29jULlp
QggJpIwwXqX9JAQcFw7w4f6bW903KIzpu5V60ekhC+oDodN2CmU0B4mP9vpG
Ul+Ma+VGQgnNtykkJVYrtTOtO2oPy/qJE/FiKZ5QeDnlUDinB1ViVsL52DKo
9MWY69o9Uga1Z87Zcl1x5dvCoxlj1MZzSAEI1dl9qu5pQ8JhJ6QacaiM5j/1
Bx/Wwo6NHe7g49oGt7jY4jzTbmHc3pdY2g+DRBoq6mkYHPXXkCMVJgg+ETJK
sEOwjjQeFxO+Kx+qqEd3EdIBronkafLRop31LiqEUdsGizKjjlxztcjQwlFT
/LiuAAlZycYfgxBVbLY1Tbn3RuHllb5T2iDzHUoSHBDSPKjojvhLG0JMRPRT
afOZfXyxz4mEjiqKKXsUJ+kK3+bXbHArm6qLPHRaq3F7Em9vfJMzacdgPwN+
8OxBjeBN4PmKr68UKYHRIPrCpoFqjP8jrRk96090O00JOgYY93zbttBZpN0v
HZTQFo92MZhFIhPWmqhq6sinUz9/J6gDt1xK/h0a1H0dYkoYl8AuqOl9NE5v
WIinr4oFst4T27FFLHazghNiJdHwUzVWTKi/tkiJm6Wq2JHoX/KITiY4DJq0
rEITqP+ZUn8Vj0Lu8DhNroVVOTcwqBzItJW9JfA0IIL1ZpoPCUHh1N3Gdyxk
d+3xVWnYCYWEeQ93W+ZN6nBrWWnjO5eH+arr5uratCauTatgSgNVk0LO0Gsx
341NmS/M/YGmkQTOHCTqL+dXthmv+pJZ0CSaI2neCmsrHzwZRb+U+HlgVyNE
vJGfIZPUIRPCRMHhm9ViYjaN0IIhqYBB2/UF04rzUFzJBwsit+QkhAjLL9Ww
DfPTH+b20s9JwiVIOPdBIyNqNUKCADcrjk/FyeRcecKuz476yebzrS14S3hD
NreRcOdKIXUqTNZhb1cDN4bGzSwJ7TCcHLPimsvffnsu5DkOqZFF8uG6tVOP
Sm7tqjN3Cu8sfAfG13qzuxFW+NQy0yRNoujywQC2kI4ZDKq1N2iH4dDFcHXi
y1o4cWdfrbvTIBxyYinodDg5mEsZZHXFdpxfzeGZsHf7e8OiYIvVzI6ly+Ne
gnoV+uiEw/3dw3rO9861iWxZXC/iOzTjxqAoTzEubvQcM+7RDmaS5kVb9KPN
a6iFpnzFPsVxR6ZStJ40TOSpaJedTvwtVkfg1liRBmmvQ+M1us9406KiwY1V
r60Pt2NRSHIyrPtfW1kQ41BA2GgjdwbjVklMgqNz56DAOro9FbiR3uYqssyt
YdC2q5R0dxg410jYruAonkbJq9jfYrTbRYclB9cTBEYxDb/SDBkaasT4alm3
MnrQx3RaoPJcVtJwruMYgl5tzQqJypO+4qA995xoBiPXv1b6rvO/+9qHrvRd
bzz0qBw2JzWRJy4cElqcKJX35q/SKU5eJ/VVHWIOXJ3H3zAKYHS+nHBA+AU/
cK+/tY9CDM+yeZSggJzH4MG44Cpl6HNIaHJL3xUkXk6ksy/rO0pb71lom0yk
ngjXAfNGXNTJESP7a4SycQkvKNjSg+Qrxnqovbhb6Uyc4vz4Vm+reYln6Wdz
kC5SszeDbEMn5d6ccD5W1kU2FY90LZjR6p9/7Jh/O39/ioDO9zOpnRcEWDrW
4MMrK4ztLGAZ0gedZkjiz2IFxOEUQ+0EmXCsvk3ZNbVgfVUBb4sR6FyuoXuB
3DEm9GA6u+Ken5C8jOWknoUiZmA51DpbrtYLgz6b3uXzYirsTRLMUFCxWCTa
YltS0GacHspQdkW91Tg8WxLrHSKhPZvTd2LxKHXByTBasC0qSBwaDPrnn/kE
89/bOxvP8Tc4LB3dvaBsYccEFVPDkElJFONZQ3rpwoDTkIefaGAnSsjMAyUM
bKosnXP7tiCYfcfHptJj3GlqTAADkAmElxD2l/T9FI2LAbEfBrXGb4ElSCkt
fEGTMhvf0cSg3yEHPx1HOl2lLL4sldjomG2IyOcmkkjUE3Ue1PyI/uHp8BNx
SwS70HrmtUhTWCo00NQXOudaM5r8K3U+/2CO9073mlEWELRMAonvjpsRruAp
ehyx9YiqwUBEfeBi0UKzsxnrJUK5Eo7uqyQvGLdjqeHko67vF80oxHxLHCfX
aqVJfvrp0yzvOimGkEn9WQA6CqHmUw0oQjyU74fMjAJj3ZuBptdc6BP3jpVZ
JlFjFcEVJn6wHtybbbRu3u5xcgg+0s9noPBsgcJ9MNpwJL7c8ox/bkW3BD1h
7nWsDR5rQz8/q93O8SoYl26MvrTO4ns/Rvx01V6ESHh6y3PMTBezJeDonna0
XVf0oPcG30vjan70+QNPNZW8ujcv8NwLvPcFHn7hHj6qPRS+Th97Ft0snOPy
qH8RrCW6Y5/1pctQk0ROBo2109vUO8/lxf3TSyXHlyymufueRfcNjvqXjH+X
or3htk3cthXdBt9G5bUMrJeMBpu8dX6aA3alXKIeZuXlmx1JqAjv1vz5yzO5
s4YIkvNcyXZGFgYv2d+HnLhWlGbY5rtCFLUcEcduoqdaeWLDgRdO6M3E1j5x
5NSOfjqz/RrUzg2O1zqeCmtmbtBWApRaTj7KJJbDFnZ6WKsZ+uGHbnjS78bZ
HcJLiVjsHbvYhoADa0ade2NesggqJuPC3CyJUCWWh7tu6gKkuFuPxK+X5vT9
hRRLBuXthVkjpfQcSlJzbtUeJyseTpZjVqmk8aDweNd+zgkRQuLESpR4uupW
nwuo8nkDiJRblZIwVtuOVa1C75lATzygKqzth9JLg3BdJ6ONt1QeoAnFjSQr
XTtbgwNg6OB2VXINKQ2QdvmENaVYKguyp8rWZbbFSkP9VTzY92zc1XhOWylE
6ZnVQlXoYgmG9kiFUu2oUC/aOJgXi2JYjJnt6EdRY0WtRsVuFvfq3eposMAc
xL0WC/jxci7p9+Gig0AL/fP88PRg7eyw/5ETWbtc/Fq620RrYomHNOwrVl41
t1zEFVaynGCg9afkSNLpdCJ1Xxb6QE+Z1uGAUFVbsNzHygbtD576msYx9SgQ
Lgdb+oHr28QzGjTvzguZdFOU5L2LpnCa+sjaxTgSQvvDRBGRGVBlT1s0NWr4
oXZfuhTwIMdf5ulm+LId0u01j0jnquzjEEIC5ZoncfEHURceLzURgcxus5x6
e26+VCstMFr4GfY2Vc0CDjF+82b43eaowPHK+flxGeKVE453zVnRP7wjhvbh
kLCXJi1lyQid4KUkmdWSt2z0nWuBFYQFBp4TnhzbCGjySo/dXHdkpjgJx/Yk
mFPBNi5y8vr4NMTTPQXFtXpMULctHUkMeq7VGu2RkvUXM6fs2KhJuFfy6V0x
vrNWCIsh/ARyukUGH0qBb5JTo+Y8FYxQnH27vGKFi7ZtzbyfA/cWNsuNVn/j
jBOnUtzdnhGrQcJtO1eVyfuFGSNviTX+HbeMXXyGI/UyQBoEGFWmRyKJ4sOJ
2qfPrAVXTAH+CxcoJ0F/lnexddbxpNsUKhrpZ3SE8hFvGOGsokMDyioH15g4
npsXVdywXeKHrvD2EyIIJBdtehByatb9HNuNDGWxKbUTWEannBwVGI9ycZmp
Yijc12UHFlfqXeej6xmUG79MVF336CgNCECqvwvmITa7BRcSSH2El7dzeyu0
VPf2tnlui9DAwBhqavOoS2XXLvUhVN1C1hGVXHC6emT7YD+mkAbRn8MRVGks
tdKb1u6Xu0lJxwYvJalHwOdZJB+7H8+OLw5JbPhw0Q4OLa31l19+SZBTu27q
/5raDjV1+nmmI2zQt8/Mlnluts0Ls2Ne/pprGONP3X/wv8RUe8UcsKFIU4D3
+mZvNEIoqWlJOut691m1vcD9P3Mm1dfra7e6z9v8/bk04o1mtt7daP/zZ1L5
1/Rimdsmz+13hIk5BLu7gFPklVn/sr6zvr62/mVn++BAYXbeH9AH0kN12r1e
7/eZSPTveHC3tUY/tmGAhTzY2lwHc95aN9p67XcECW/JfGgGYFqtDCbP2Urf
CJAQveSvXpmtFy83TEsOfvt3A8mHg4EWNQmXHH/fv82Gn0ribL8HSN7P+iTr
vgLxumTi1SJM2SNWe354fwI7ATHGi4/Z3M5p8DZb/Z5IEh7kHwbmVFw6rc0t
c5WHHeru9yA8aIzD74WtYiUx55B1wWrsbAbnpx1Tm9LvfG5+5b/75kHODi/e
7JqP+ZxL9juStN20mAcH+Q0z+d1gcpZNCpRqJrRsnV3SL+RqNS7m998dSAB6
mlsPTuL/k5l8/b/aFjuZbKDdNFscyUaiX8csZ1BSTi4+NBHpaCatT9lqjX2t
8B1o6wCxL5C4nTaB5f8HyMb/jvtnJGBsNXKrf/5MIDhC0SDpvovQJ9E2nJuQ
+dODcnTrPYmn5xXxtM2qRyEBn/QMeIyKKcz71PYH9V8ZpCp0QeUSrlHIVRni
eFOMBxKjYZ2HX9Sk6RV4YfptDiPPnTCPM5zcKVVKhSp19DoSVMUrjyE18MFm
kXj5W0RvJ3T3kqMmVaDUYAJ/Uq/zjD1f7AXWCdk+NWEgZcLlzklrdBYHGm7O
PjitME/nguelm0KqC3f3ZDz5TqojzivDlTbYgQjw232JQmyzt1StLeyFQtqZ
a9ggPQCQ04PZnR73OR01IZWjC42ki/R1lLt32kdHc9e5JwMzNOt4tMjifG7p
VCK2BJas4BxPJtmILRHs7G6JwAB5cp9zu/R1Yf32Igjx1xwc1vZgf3C2tn7Y
rIhoDSfxPm6Hb27abBOAvW3T2/2SMOppFHa54+Q8LYrJ5RdclfDYOyoxXg82
qRYYnWjJfNdmzJU/dHUOT8Xc2IKPERluRrpMnKyd2qcQ6GwzMZp7wbgiJbxR
FhpJMzQaey6nxtLyLqO4bZiMsnwIwEvEQkYPsLHeBZLA4FNpFdhSUzlMvk0t
A62jp12x1j9iln/KYs/+NZ+Rfs9tpLool+ebuW3QhY1t9zcvotIUOs2xno8K
CNOS2FSOM9aYbqyJW+mZ1oyIkbhkJukXsUCd36Yz7di4yNWu4+26rXI1uULH
xTYH36SI0G/1PxzsSbSKDEGSbWjHvocwCfe+FpiZutCVHVlO03fWNjpwlr57
c1G30LPdM7TjQ+iP/m7BcKlBDBixY9heqSZNoyZNmU8/tPZjaLqVLuMnwRGh
Av5r8Q0sii5i82R8GdXSy6i/a6lbigWvo6WLT4i/11YWXa7IGn4h/hsZ4Ggu
YvqKFzisNlVqzTy+NnzNYUNcI6o18njr4pguQLxhZP0zN+Dg293UcXWT23NU
rjsLIZqTecurb8zn+7sr9elXYjOIyqOiIjg22BmMbLSOUnyCdyCySMLEULEB
sHXybNccvuJykp9ebbp2rK/QjBjhnq+IMSlp2nA5uyivp9F2QSMyVwE4Aphb
DvsWzVN91Te5o7oR4j4vvuQTguwY4eN1yuLKVbL3A4to73IH5r/ROPjBHZX5
DzGlrKEbHegriVyvzC8vtpmXBue9Y37ZeVHBKl29NEbOkejz7QOETBDCTmeH
JrPztVPZ7G01zGXzRTQVorVc5nZ7w8NXG+NZv2K+sG1wJC1M0JEklF+erXND
mIDjclEya40tMWpTxWNtsKGPryAEYR9/2eSqV/5GxFxrowPBbplQuchmHa1M
wbmG4a6a56h3uFYG0Pb4lPgoCyvENYIdte9VbAAK7c2HtzmyXZB45SMcqxJD
EOMjKT8kbY2H1mRrX8gFRGHwpS9s8fOgsOnx4l9KIzIvPLnmX16s7wcR1YLT
6cIcDYg2swSAmqVoU/Zl8S9aoosFd89AG/+dM6ugDx9ZUbLWwkSkeJoeAqIF
G8J/7+yHHbUmD5Y2UD0NoIQ6EF73umUXE0pYdndEaUHpQRlGh3tzyWIMym6u
GbrxklHu1U4yoK3D88b1lI7m42pngh/jH0OACMKkhYR+nHwdurW91U76Aigr
yVf/ubqgWx1Iz8a8Rm2jhMQN6RTRDEpbN9EZznlvXtlzuZaJiyVJzglxu1er
Ln6T0uRwg74BUuAJ83//Hwxk/BIg4RMvkz/IFOmjvjRJ/GPYEr5pR+9mmOCD
LkdG2JSJmI1do0/uhJ6BV+7Txjaut96aP5mPhvayRJwq3S5o0dZhNnflTozT
MMxGZ5OHb9H3tEZGBfvos137PX1pN9A+urH9rLOzZR+NUcAOsLXr76Ob7K69
MtsvNjrrOzudbTeAHhFX6FAGeE4DhPfSjZs8wEbn2dZmZ+PFC56egBqnitiq
hHpno1ffXKfjMpMwoL3hpym6gYzEaUv70g9qT2tTbBJSss+iYXFfbE4CsRkn
+yc/vkbuKLsgkV6ZtK4mn2/+kmeL614xv2m7xO2recHG671jG7sqibwsiqMX
04UdjhSKXMh5skeEhgjcPJOSYfOoDynRk9vFYlburq3BQkJcdwjFyb56jTu2
rmE6a+kVyalrveT/Baw4ocxMHwEA

-->

</rfc>
