Internet-Draft IS-IS Aggregated SNP Hash Packets July 2026
Przygienda, et al. Expires 2 January 2027 [Page]
Workgroup:
Network Working Group
Internet-Draft:
draft-prz-lsr-ash-packets-00
Published:
Intended Status:
Experimental
Expires:
Authors:
T. Przygienda
HPE Juniper Networking
T. Li
HPE Juniper Networking
J. Halpern
HPE Juniper Networking

IS-IS Aggregated SNP Hash Packets

Abstract

The document presents an optional new type of database synchronization packet called an Aggregated SNP Hash (ASH). When feasible, it compresses traditional SNP exchanges into a dynamic Merkle tree-like structure, which speeds up synchronization of large databases and adjacency numbers while reducing the load from regular CSNP exchanges during normal operation. Just like CSNPs and PSNPs, ASH packets come in two flavors, called Complete ASH (CASH) and Partial ASH (PASH).

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 2 January 2027.

Table of Contents

1. Introduction

The document introduces an optional new type of database synchronization packet called the Aggregated SNP Hash (ASH). It compresses and enhances traditional CSNP and SNP exchanges with a structure somewhat similar to a dynamic Merkle tree [MERKLE], enabling faster synchronization of large databases and adjacency information while reducing the overhead of regular CSNP exchanges during steady state operation. By combining parallel flooding, SNP and ASH exchanges, database resynchronization can be accelerated because fewer packets (and not the entire CSNP set) are used to fix inconsistencies. Using ASHs also reduces unnecessary flooding and communication overhead, while still detecting mismatched fragments more efficiently than full CSNP exchanges.

To maintain the Merkle analogy, we initially treat each SNP entry for an LSP as equivalent to an "as good as perfect" - though rather long - Merkle hash. In other words, the LSP ID, sequence number, and checksum act as a "collision-free fragment hash". We include the fragment's PDU length in this perfect hash to add more entropy in the next step. Lifetime is not used since it is not compared except in purge cases. The SNP entries then serve as the conceptual "bottom" of the Merkle tree.

In the following step we compute a [SIPHASH] over the SNP entry data to create a shorter Merkle hash for each fragment. Somewhat surprisingly, these per-fragment hashes are never transmitted in packets, since doing so would effectively duplicate SNP functionality. However, the hash that aggregates all fragment hashes of a node becomes a "node Merkle hash". Groups of node hashes can then be summarized into a "node range Merkle hash" by combining the individual node hashes together. For the node and node range hashes, we switch from [SIPHASH] to a different hashing method, as explained in Section 4.2.

The resulting hierarchy of hashes enables large LSDB synchronizations using far fewer packets than CSNPs alone. Although the hashes summarize ranges recursively, the actual exchange resembles a [SKIP]-like process rather than maintaining a fixed tree structure, since the packets and ranges exchanged vary depending on where mismatches are found.

This document limits itself to LSDB sizes on the order of 1E6 fragments and addresses considerations necessary to prevent overly garrulous exchanges of hashes covering progressively smaller sets of fragments. More details on the targeted IS-IS scale envelope can be found in Section 9.1, and further considerations are summarized in Section 9.

2. Example

Before diving into precise technical details, an example will help introduce the concepts involved and demonstrate the behavior of an exchange.

The exchange consists of initial CASH packets (analogous to CSNPs but carrying node range hashes) followed by PASH packets (analogous to PSNPs) that omit matching ranges and exchange hashes over progressively smaller ranges. Once a hash covers a single node only, PSNPs or direct flooding is used to synchronize the databases in the traditional IS-IS manner. ASH operates at the resolution of a system ID (including its pseudonodes), not individual fragments.

The details of packet formats are provided later in Section 7 and Section 6, and the procedures are defined in Section 5.4, but the example can be understood intuitively without those details. For the sake of readability the example encompasses only a database of roughly 3000 fragments over 100 nodes with about 10% of the database not synchronized on start. Details of PSNP packets exchanged are omitted for the same reason. Note that the fragment counts shown in the examples are informational annotations only and are not carried in the wire format of ASH entries.

Initially, both nodes generate and exchange a CASH packet. Node 1 sends to Node 2 the CASH shown in Figure 1, containing node range hashes, some of which will disagree with the peer.

Node1->Node2 (initial CASH) -> 1 CASH packet(s)
Pkt   1 [1010.0000.00.00 - 1010.0063.00.00] (  35 hashes):
1010.0000.00.00 - 1010.0001.FF.FF:   47 frags, hash=$ED6174261B52A218
1010.0002.00.00 - 1010.0002.FF.FF:  150 frags, hash=$BA8EC0FA6F86557A
1010.0003.00.00 - 1010.0006.FF.FF:   87 frags, hash=$2A7D5B4045D54F14
1010.0007.00.00 - 1010.0008.FF.FF:   34 frags, hash=$B15068FC7C4F044B
1010.0009.00.00 - 1010.000A.FF.FF:   85 frags, hash=$4396112BEAB62D20
1010.000B.00.00 - 1010.000E.FF.FF:   88 frags, hash=$CCA15999A8BAA5E8  PEER MISMATCH
1010.000F.00.00 - 1010.0014.FF.FF:   91 frags, hash=$04CA3A21161464F9  PEER MISMATCH
1010.0015.00.00 - 1010.0017.FF.FF:   79 frags, hash=$82039367F94F611C  PEER MISMATCH
1010.0018.00.00 - 1010.001A.FF.FF:   61 frags, hash=$65D4C194A433D2A7
1010.001B.00.00 - 1010.001C.FF.FF:   92 frags, hash=$5F8701B08AE39C2C
1010.001D.00.00 - 1010.001F.FF.FF:   75 frags, hash=$0123DD370E42720C  PEER MISMATCH
1010.0020.00.00 - 1010.0021.FF.FF:   39 frags, hash=$9CFF63C97F8C86D0
1010.0022.00.00 - 1010.0022.FF.FF:  155 frags, hash=$D1513B30C1F1D21D
1010.0023.00.00 - 1010.0026.FF.FF:   91 frags, hash=$51B443F4FE1B20DA  PEER MISMATCH
1010.0027.00.00 - 1010.0029.FF.FF:   83 frags, hash=$44CEC7E1452B1855
1010.002A.00.00 - 1010.002E.FF.FF:   86 frags, hash=$8F1B907E974094F3
1010.002F.00.00 - 1010.0032.FF.FF:   89 frags, hash=$E2A31EC91DB1C4C3  PEER MISMATCH
1010.0033.00.00 - 1010.0036.FF.FF:   85 frags, hash=$E4D0AF41BA0BBDF1  PEER MISMATCH
1010.0037.00.00 - 1010.003B.FF.FF:   82 frags, hash=$13A40B0A950D7694
1010.003C.00.00 - 1010.003C.FF.FF:   23 frags, hash=$DA7335371FB231CD
1010.003D.00.00 - 1010.003D.FF.FF:   99 frags, hash=$D14266119B1003CD
1010.003E.00.00 - 1010.003E.FF.FF:   34 frags, hash=$91D8522BD2EB0523
1010.003F.00.00 - 1010.003F.FF.FF:   91 frags, hash=$9F206FAF8432046E
1010.0040.00.00 - 1010.0040.FF.FF:   88 frags, hash=$79441DDEE60A1F89
1010.0041.00.00 - 1010.0044.FF.FF:   87 frags, hash=$ED7F3899858D61EE
1010.0045.00.00 - 1010.0046.FF.FF:   22 frags, hash=$382CF99A031E42DE
1010.0047.00.00 - 1010.0047.FF.FF:  101 frags, hash=$10B9E7D282B7B8EF
1010.0048.00.00 - 1010.004B.FF.FF:   83 frags, hash=$AC0C721830474E1C
1010.004C.00.00 - 1010.004D.FF.FF:   25 frags, hash=$501B59AE9A65ABC9  PEER MISMATCH
1010.004E.00.00 - 1010.004E.FF.FF:  166 frags, hash=$4EF46668F1DD1A7B
1010.004F.00.00 - 1010.0054.FF.FF:   90 frags, hash=$64362751D60995DA
1010.0055.00.00 - 1010.0059.FF.FF:   87 frags, hash=$2771634B6A78AACE
1010.005A.00.00 - 1010.005C.FF.FF:   91 frags, hash=$E3F65D19CC13C7ED
1010.005D.00.00 - 1010.0060.FF.FF:   84 frags, hash=$C3205BB177BB816E
1010.0061.00.00 - 1010.0063.00.00:   52 frags, hash=$D3500D0875EC4B66
Figure 1

Analogously, Node 2 sends an initial CASH with 33 hashes as shown in Figure 2. It may seem counter-intuitive that the ranges do not align, but this is natural since the databases themselves differ. Section 3 addresses this effect in more detail.

Node2->Node1 (initial CASH) -> 1 CASH packet(s)
Pkt   1 [1010.0000.00.00 - 1010.0063.00.00] (  33 hashes):
1010.0000.00.00 - 1010.0001.FF.FF:   47 frags, hash=$ED6174261B52A218
1010.0002.00.00 - 1010.0002.FF.FF:  150 frags, hash=$B312CA61CDDAF08F
1010.0003.00.00 - 1010.0006.FF.FF:   87 frags, hash=$2A7D5B4045D54F14
1010.0007.00.00 - 1010.0008.FF.FF:   34 frags, hash=$B15068FC7C4F044B
1010.0009.00.00 - 1010.000A.FF.FF:   85 frags, hash=$134574F1DF91FBE5
1010.000B.00.00 - 1010.000E.FF.FF:   88 frags, hash=$41DA43916CC90C52  PEER MISMATCH
1010.000F.00.00 - 1010.0014.FF.FF:   91 frags, hash=$866969306096FD29  PEER MISMATCH
1010.0015.00.00 - 1010.0017.FF.FF:   79 frags, hash=$C2CEDFBA88080A3B  PEER MISMATCH
1010.0018.00.00 - 1010.001A.FF.FF:   61 frags, hash=$65D4C194A433D2A7
1010.001B.00.00 - 1010.001C.FF.FF:   92 frags, hash=$0BA9276510EE03CB
1010.001D.00.00 - 1010.0021.FF.FF:   92 frags, hash=$4F5A632B673697BA  PEER MISMATCH
1010.0022.00.00 - 1010.0022.FF.FF:  155 frags, hash=$6EDDFD41D8C4E0A6
1010.0023.00.00 - 1010.0026.FF.FF:   91 frags, hash=$2C2B89EF96A7C7C8  PEER MISMATCH
1010.0027.00.00 - 1010.0029.FF.FF:   83 frags, hash=$878C86D658B81FAD
1010.002A.00.00 - 1010.002E.FF.FF:   86 frags, hash=$8F1B907E974094F3
1010.002F.00.00 - 1010.0034.FF.FF:   81 frags, hash=$2684E1CAA3FB8EB7  PEER MISMATCH
1010.0035.00.00 - 1010.0038.FF.FF:   90 frags, hash=$6D1FAC454620D6BB
1010.0039.00.00 - 1010.003C.FF.FF:   67 frags, hash=$ADB300A6365F6360
1010.003D.00.00 - 1010.003D.FF.FF:   99 frags, hash=$CDF517B289695626
1010.003E.00.00 - 1010.003E.FF.FF:   34 frags, hash=$0365ED0461CA3EDC
1010.003F.00.00 - 1010.003F.FF.FF:   91 frags, hash=$1EA9199FB3A12017
1010.0040.00.00 - 1010.0040.FF.FF:   88 frags, hash=$761D4D297B4BFA45
1010.0041.00.00 - 1010.0044.FF.FF:   87 frags, hash=$1C97513E233AD431
1010.0045.00.00 - 1010.0046.FF.FF:   22 frags, hash=$382CF99A031E42DE
1010.0047.00.00 - 1010.0047.FF.FF:  101 frags, hash=$022CA678714C04D5
1010.0048.00.00 - 1010.004B.FF.FF:   83 frags, hash=$AC0C721830474E1C
1010.004C.00.00 - 1010.004D.FF.FF:   25 frags, hash=$AA9C89D8384E152E  PEER MISMATCH
1010.004E.00.00 - 1010.004E.FF.FF:  166 frags, hash=$D82B8A40F2F6B75D
1010.004F.00.00 - 1010.0054.FF.FF:   90 frags, hash=$64362751D60995DA
1010.0055.00.00 - 1010.0059.FF.FF:   87 frags, hash=$2771634B6A78AACE
1010.005A.00.00 - 1010.005C.FF.FF:   91 frags, hash=$10DAD5C8C452DDCB
1010.005D.00.00 - 1010.0060.FF.FF:   84 frags, hash=$9316E89831EE599B
1010.0061.00.00 - 1010.0063.00.00:   52 frags, hash=$D3500D0875EC4B66
Figure 2

At this point both nodes take the range mismatches they detected and split them into smaller ranges. 13 of those splits yield a single system ID as range and hence trigger CSNP, PSNP, or full flooding to synchronize the database. This synchronization follows standard IS-IS behavior and proceeds in parallel, independent of any further ASH exchanges. In this example, one additional packet is sent with ranges large enough to warrant ASH, as shown in Figure 3.

Node1 -> Node2 (after cuts) -> 1 PASH packet(s)
1010.001D.00.00 - 1010.001F.00.00:   75 frags, hash=$98997F9A286EA1FD  PEER MISMATCH
1010.0020.00.00 - 1010.0021.FF.FF:   39 frags, hash=$9CFF63C97F8C86D0
1010.002F.00.00 - 1010.0031.00.00:   61 frags, hash=$498F595B45397172  PEER MISMATCH
1010.0032.00.00 - 1010.0034.FF.FF:   61 frags, hash=$17E6C25A2881B26E  PEER MISMATCH
Figure 3

Splitting the remaining mismatched ranges on Node 2 will lead to CSNP, PSNP, or full flooding for the affected nodes, since no ranges large enough to warrant further ASH exchange remain.

Assuming a PSNP strategy for single-system-ID mismatches, the example generates 7 PSNPs from both sides (since PSNPs MUST cover all fragments of a system ID that ASH discovered to be mismatched), plus 2 CASHes and 3 PASHes. This compares favorably to the 62 CSNPs that would normally be required on startup.

3. Node Ranges

To discuss node range hashes meaningfully, we refer to hashes covering a wider range of nodes as "less specific" and those covering only a subset of that range as "more specific".

When nodes advertise node range hashes to each other, it would obviously be useful for cache-based implementations to agree on which ranges to use, as this would make caching much more effective. However, while this seems viable in theory, implementing it across many interfaces would effectively require a global synchronization protocol - something impractical in a network where nodes constantly add, remove, and update fragments asynchronously. These ongoing changes continually shift what the "optimal" hash ranges are, and with enough churn, such a range negotiation protocol might never converge.

Alternatively, providing a fast way to reconstruct the Merkle hash for a "mismatched" range reduces the need for perfect range alignment. Since hashes are built on system ID boundaries, maintaining a Merkle hash per system ID allows a node to quickly recompute the required hashes whenever received ranges differ from cached ones.

As a second consideration, even though a fully stable network could in theory be represented by a single hash covering the entire LSDB, doing so is neither desirable nor beneficial. Since an ASH packet must be sent anyway, it is much better to fill it with as many range hashes as possible. This limits the work required when a collision occurs within one of the ranges and also reduces the risk of hash collisions, as discussed in Section 9.3.

In summary, a node should avoid compressing beyond the point where a single ASH covers the entire database. Ideally, one ASH should contain at most about (MTU / Node Range Hash Entry Size) hashes to keep collision probabilities low, as described in Section 9.3. Due to the efficiency of recomputing node range hashes on the fly, there is no need to negotiate or agree on ranges with peers.

4. Hash Functions

4.1. Hash Function for a Fragment

Each fragment generates a 64-bit siphash-1-3 [SIPHASH] from its full SNP description. The salt key is given in Appendix A.

If a fragment hash computes to zero, the value MUST be replaced with the constant 1.

To validate implementation correctness, a reference hash is provided in Figure 4.

Hash Variant: Siphash-1-3:64Bits
Fragment ID: 0101:0101:0000.01-01
Seq# $0001
Csum: $0001
Len: 0512
Hash $6EB348F808C9AE4E
Figure 4

4.2. Fast, Incremental, Self-Inverse Hashing Function for Ranges

Since large-scale deployments must compute significant numbers of hashes over sets of frequently changing fragments, it is highly desirable to use a specialized hash function that supports fast incremental updates when fragments are added, removed, or when their checksums change.

Once hashes are built over sets of fragments, it is desirable to support very fast splitting and merging of such sets, especially when two hashes differ in which fragments they contain.

Further discussion of such hashes can be found in [HASHES], but the design space is simplified here since cryptographic security is not a concern.

The hash for a set of fragments is computed by XORing their individual fragment hashes. This makes it straightforward and very fast to update the hash when a fragment is added, removed, or its checksum changes. As a result, less specific ranges can quickly derive their hash by XORing the hashes of all included more specific ranges, when those are available.

If a node range hash computes to zero, the value MUST be replaced with the constant 1; otherwise the range would assume the semantics of "no fragments present for those nodes".

5. Procedures

5.1. ASH Support Negotiation

IIHs of nodes supporting this extension MUST include a new TLV indicating support for reception of ASHes. All nodes on the adjacency MUST advertise this TLV in their IIHs; otherwise ASHes are not used on that adjacency. Note that a node may choose to only receive and process ASHes while always responding with CSNPs or PSNPs, although this is obviously less beneficial than fully supporting both sending and receiving of ASHes.

The ASH Capability TLV is a zero-length TLV (type TBD) included in IIH PDUs. Its presence indicates that the advertising node is capable of receiving and processing ASH packets. No value fields are defined at this time; future extensions may add capability flags within this TLV.

5.2. Advertising and Receiving CASHes

Advertising of standard CSNPs is extended or replaced with CASH advertisements when this feature is supported. Analogous to CSNPs, CASHes are sent periodically at the same interval as configured for CSNP advertisement. Since both CSNPs and CASHes carry range information in their headers, they can be freely mixed, depending on which level of fragment "compression" best fits the situation. In practice, sufficiently specific range mismatches will naturally fall back to SNP exchanges or flooding to resolve remaining differences.

5.3. Advertising and Receiving PASHes

Where PSNPs are used, PASHes may extend or substitute the exchange as long as the rules of Section 5.4 are followed.

5.4. Refinement Rules

ASH exchanges are per-level, analogous to CSNP/PSNP exchanges. CASH and PASH packets are scoped to the IS-IS level (Level 1 or Level 2) of the adjacency on which they are sent, just as CSNPs are.

Fragments in purge state (remaining lifetime of zero) are NOT included in ASH hash computations, consistent with the treatment of purges in CSNP exchanges. Purges are handled through normal IS-IS flooding procedures independently of ASH.

When a node receives an ASH where any of the contained hashes does not match after recomputation or comparison, it MUST send PASHes with Merkle hashes covering the mismatched ranges. These new hashes MUST be more specific than the range where the mismatch occurred.

Alternatively, instead of more specific PASH hashes, a node MAY choose to send corresponding SNPs directly. Sending SNPs immediately may be preferable when the mismatch affects only a small number of LSPs.

In case the mismatched hash has no matching fragments in the local database, the node MUST fall back to CSNPs or CASHes, or send a PASH with hash set to zero to request the remote node to either refine the range or use normal IS-IS procedures to synchronize the database. The semantics of a zero hash are consistent in this regard: zero indicates that a node is not present in the system or has no fragments, which for ASH purposes is a distinction without a difference. A zero hash consistently indicates that the range is not covered by ASH compression and MUST be resolved through SNP exchanges or flooding, regardless of whether fragments are actually present.

If there is a mismatch - or no computation available - for a hash covering just a single system ID (including its pseudonodes), then SNPs for all fragments of that node and its pseudonodes MUST be sent, or the fragments flooded directly. Since this behavior is initiated from both sides of the adjacency, both nodes will ultimately end up with a full set of fragments belonging to the node described by the hash, even when using PSNPs. Other techniques MAY also be used, such as sending a CSNP covering the range from the smallest possible fragment of the system ID to the largest possible fragment.

6. CASH PDU Format

The CASH PDU format closely follows the CSNP format. Instead of CSNP entries, it carries the corresponding Merkle hashes - which cover exactly the same fragments that would appear in CSNP packets. The Start and End System IDs exclude pseudonode bytes, as those are implicitly included within the ranges.

                                       No. of Octets
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Intradomain Routing Protocol        |     1
| Discriminator                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length Indicator                    |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version/Protocol ID Extension (1)  |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ID Length                           |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| R R R   PDU Type                    |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version (1)                         |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved                            |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Max Area Addresses                  |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| PDU Length                          |     2
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source ID                           |  ID Length + 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Start System ID                     |  ID Length
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| End System ID                       |  ID Length
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Node Range Hash Entries             |  Variable
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Start and End System IDs use the standard ID length and indicate the range of fragments covered by the ASH, just like CSNPs do. The key difference is that all pseudonodes of the systems within this range are implicitly included (implying as well that all fragments of the range are included). Both the Start and End System IDs are inclusive, meaning fragments from both endpoints are part of the range. All Node Range Hash Entries MUST fall within the bounds defined by the header's Start and End System IDs.

The variable length fields are a sorted sequence of Node Range Hash Entries in the following format.

                                       No. of Octets
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Range Start System ID               |  ID Length
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Range End System ID                 |  ID Length
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Merkle Hash                         |     8
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The range of fragments included in the hash. Range Start and Range End System IDs are inclusive, i.e. fragments of both the start and the end system ID are contained within the range.

The Merkle hash field is 8 octets containing the 64-bit computed hash of all fragments covered by the range.

With the standard ID Length of 6, each entry is 6 + 6 + 8 = 20 octets, so about 70 entries fit into a standard 1500-byte MTU.

The ranges in a CASH MUST be sorted by Range Start System ID. The ranges MUST NOT overlap. A receiver encountering overlapping ranges MUST treat the union of those ranges as a single range with a zero hash. Any range exceeding the CASH header bounds MUST be clamped to fit within the CASH range and treated as a zero hash. Any range whose end is equal to or smaller than its start MUST be discarded. A node SHOULD log such events since they most likely indicate protocol implementation problems or attacks.

Any node IDs that are not covered by the ranges in a CASH - either because there are gaps between the advertised ranges, or between those ranges and the ASH's Start and End System IDs - MUST be treated as missing. Consequently, if a node detects that it holds Merkle hashes for LSPs that are not covered by a received CASH, it MUST behave as it would in the same situation with a CSNP, namely by flooding the missing LSPs.

As with CSNPs, a CASH whose first range covers the first node in the database MUST use 0000.0000.0000 as the start system ID in its packet header so that missing nodes can be detected. The same rule applies at the other end: a CASH whose range covers the last node MUST indicate this so that any missing trailing nodes are detectable.

7. PASH PDU Format

The PASH PDU format closely follows the CASH format while, analogous to PSNP, omits the range of the packet.

                                       No. of Octets
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Intradomain Routing Protocol        |     1
| Discriminator                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Length Indicator                    |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version/Protocol ID Extension (1)  |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ID Length                           |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| R R R   PDU Type                    |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version (1)                         |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Reserved                            |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Max Area Addresses                  |     1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| PDU Length                          |     2
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source ID                           |  ID Length + 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Node Range Hash Entries             |  Variable
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Variable Length Fields carry the same Node Range Hash Entry format as in CASHes.

PASHes are equivalent to PSNPs in that entries are treated independently: gaps between entries have no semantics, entries need not be sorted, and may overlap. Any range whose end is equal to or smaller than its start MUST be discarded. A node SHOULD log such events since they most likely indicate protocol implementation problems or attacks.

8. CASH Example

A more detailed example will clarify CASHes further. Consider an LSDB with nodes having system IDs of 1000.0000.00<2 digits node-id>, with node-id in the range 0 - 512. Each node holds 32 fragments numbered 0 - 31. We assume only nodes with even identifiers are present, intentionally creating "holes" in the numbering and leaving us with 257 nodes. The pseudonode byte is treated simply as part of the system ID since it does not affect the scheme.

In a stable state, reasonable compression can deliver 129 "first-order" leaves - each containing fragments from 2 systems (64 fragments total) - requiring roughly 129 / 70 ~ 2 packets. The first of these "first-order" packets would look approximately like this:

        ...

+--------------------------------------------+
|  Start System ID: 0000.0000.0000           |
+--------------------------------------------+
|  End System ID:   0000.0000.00A0           | // 80 ranges covering 160 nodes
+--------------------------------------------+
+--------------------------------------------+
|  Start System ID: 1000.0000.0000           |
+--------------------------------------------+
|  End System ID:   1000.0000.0002           | // 64 fragments over 2 systems
+--------------------------------------------+
|              Merkle Hash                   |
+--------------------------------------------+
..
|  Start System ID: 1000.0000.008E           |
+--------------------------------------------+
|  End System ID:   1000.0000.00A0           |
+--------------------------------------------+
|              Merkle Hash                   | // 64 fragments over 2 systems
+--------------------------------------------+
Figure 5

Based on local decisions, a node can further compress ASHs until - in the most extreme case - it sends just one packet full of hashes. In our example with 256 nodes, this divides them across 70 hashes (assuming equal fragment counts for simplicity), resulting in about 4 nodes per hash (equivalent to 4 * 32 fragments). The resulting packet would look like this:

        ...

+--------------------------------------------+
|  Start System ID: 0000.0000.0000           |
+--------------------------------------------+
|  End System ID:   FFFF.FFFF.FFFF           |
+--------------------------------------------+
|  Start System ID: 1000.0000.0000           |
+--------------------------------------------+
|  End System ID:   1000.0000.0010           |
+--------------------------------------------+
|              Merkle Hash                   |
...
+--------------------------------------------+
|  Start System ID: 1000.0000.01F0           |
+--------------------------------------------+
|  End System ID:   1000.0000.0200           |
+--------------------------------------------+
|              Merkle Hash                   |
+--------------------------------------------+
Figure 6

9. Further Considerations

9.1. IS-IS Scale Envelope

ASHs provide negligible benefit in small networks with only tens of nodes and hundreds of fragments. Perfect flooding would theoretically suffice at any scale (though history shows this is too optimistic), and CSNPs under such assumptions represent protocol overhead only - yet they have significantly contributed to IS-IS stability in real deployments.

As networks grow larger, the costs of link flaps, node restarts, and periodic CSNP exchanges increase substantially. ASHs can significantly extend IS-IS scalability in these scenarios.

To push IS-IS to its practical limits, we must account for flooding rates driven by fragment refreshes and LSDB synchronization. Path computation impacts can be deferred - barring pathological scenarios - through dampening and parallelization, so we focus on realistic scenarios on the order of 50,000 nodes and 1 million fragments. We assume an outer envelope of 16K peers, which provides some margin over the largest deployments realized today.

At maximum configured lifetime of 65K seconds, this generates ~15 packets/second per interface from refreshes (1M/65K fragments/second), plus periodically ~10,000 CSNP packets for full LSDB sync.

Achieving sync within a modest 120 seconds requires ~80 CSNPs/second, and across 16,000 interfaces this represents a peak of 1.5 million packets/second - far beyond current fast-flooding capabilities. We disregard here techniques like [ID.draft-ietf-lsr-distoptflood-11], especially since they do not improve CSNP scale.

With maximum ASH compression, however, sync overhead drops to roughly one additional packet beyond refresh flooding, leaving ~250,000 packets/second (15 * 16K) to be handled by fast flooding and flood reduction techniques - a challenging but feasible target.

The above makes it clear that combining fast flooding, flood reduction, and ASH will be essential for extending IS-IS scalability as deployments continue to grow.

9.2. Maximum Advisable Hash Coverage

Although the mechanism can theoretically use a single Merkle hash to represent an arbitrarily large database, such an approach is not advisable. Instead, it is far preferable to limit hash coverage so that at minimum one full ASH packet is required.

In practice, limiting compression so that a maximum of about a dozen ASH packets covers the entire database is usually sufficient. For example, a single maximally compressed ASH packet for a 10,000 - fragment database covers ~140 fragments per hash. Allowing for more ASH packets (e.g., 10 instead of 100 CSNPs) still provides a 10x compression factor, reduces disaggregation needs during LSDB changes, and further lowers the already negligible collision risk (which in 10K sized LSDB is however vanishingly small with a single hash already).

9.3. Hash Collision Probabilities

Even with CSNPs or PSNPs, IS-IS has a corner case where LSDB synchronization can fail - particularly during node restarts. In simple terms, if a new fragment has the same sequence number and different content but an identical 16-bit Fletcher checksum, the collision goes undetected until the fragment expires.

Assuming Fletcher checksums are uniformly distributed (even with minor content changes), the collision probability for that case is 1/2**16 ~ 0.0015%. This baseline enables meaningful comparisons with ASH Fletcher collision probabilities.

ASH uses 64-bit [SIPHASH] over what is essentially PSNP data for a fragment plus its IS-IS PDU length. The key concern is the probability that two fragments covered by the same hash produce identical hashes simultaneously. This would cause both fragments to "cancel out" due to the XOR nature. Collisions between fragments in different node range hashes are irrelevant.

One might argue that XORing different sets of hashes could produce the same result, but the probability of two distinct sets having identical modulo-1 sums across all 64 bits is vanishingly small. This scenario is not considered further.

Collision probability analysis is complex for the general case, though the birthday paradox gives a rough estimate of 0.18% collision likelihood for 48-bit hashes in a 1M-sized set. For 64-bit hashes this reduces to ~0.000,002,7%. Rather than relying solely on statistical assumptions, we use extensive simulations that mirror real-world conditions.

These simulations model 50,000-node networks with 1M fragments, assuming node IDs differ by only 3 bytes, maximum fragment lifetimes, random protocol checksums on fragment refresh, and packet length changes in just 5% of refreshes to reflect network stability. Results are derived from 32 networks running for 2 years each.

Across 64 years of simulated time and the resulting 36E9 fragment refreshes, we observed 142 collisions for the 48-bit variant of [SIPHASH] across the whole set, or roughly ~0.000,000,4% - much lower than the birthday paradox prediction. However, assuming a single maximum-size ASH packet covering the entire database, only 3 of those collisions matter - a probability of ~0.000,000,008%, or roughly 1 occurrence per 20 years. These collisions have an average lifetime of about 10 hours. The rates are orders of magnitude lower than birthday paradox predictions, likely because node IDs act as a consistent "salt" that effectively pre-partitions the probability space. This is arguably "good enough" by a wide margin.

Nevertheless, further investigation using the standard 64-bit SipHash 1-3 variant over the same scenario produces *no* detectable collisions. Measuring the CPU cost of the 48-bit variant vs. the 64-bit variant (or even 64-bit traditional Fletcher checksum) shows negligible differences of low single-digit percent with modern implementation techniques. Thus, 64-bit [SIPHASH] 1-3 has been chosen and should provide a very safe margin even for much larger databases.

Ultimately, a highly conservative (not to say paranoid) implementation can simply monitor the LSDB for colliding fragment hashes and exclude them from the same ASH hash. This forces receiving nodes to use separate collision-free hashes instead. Such an approach completely eliminates any risk of synchronization misses when using ASHs.

Other techniques are possible, such as slowly walking the database and sending CSNPs. However, for a 1M-fragment database that generates 10,000 such CSNP packets, the chance of this detecting a collision during its 10-hour window is rather unlikely.

9.4. Impact of Packet Losses

Hashes covering large numbers of fragments are more vulnerable to packet losses, as each lost packet affects a much larger portion of the LSDB during synchronization. Implementations can choose ASH node ranges freely, but should balance maximum compression against "good enough" compression that reduces both collision risk and vulnerability to possible packet drops.

9.5. Decompression and Caching/Comparison Optimizations

As mentioned earlier, nodes can use various strategies to accelerate decompression. For example, LSPs missing from CASHes (those not covered by any ranges) are clearly absent and can be immediately reflooded. Similarly, small mismatched ranges can trigger immediate CSNPs, PSNPs, or direct flooding.

Caching of hashes can be applied at many levels. The simplest and most useful approach is maintaining a hash for all fragments of a node and its pseudonodes, though other granularities work equally well. Even when adjusting for changes - such as receiving a range < A - B > while having cached < A - B & next-after-B > - the cached hash can be quickly updated by simply XORing out the next-after-B node Merkle hash.

10. Security Considerations

ASH security relies on the same mechanisms protecting IS-IS PDU integrity.

11. IANA Section

TBD

12. Contributors

TBD

13. Acknowledgement

People have been talking about "compressing CSNPs" for a very long time, reportedly going back to when Radia Perlman and an insomniac Dave Katz were walking the halls discussing it. Recent attempts to scale the protocol much further have made it worthwhile to turn this idea into a standardized, practical engineering solution.

Les Ginsberg identified several unresolved issues and contributed alternative ideas to the draft.

Job Snijders initiated the discussion of [SIPHASH] as a likely better solution than traditional Fletcher checksumming of fragments.

Claude (Anthropic) [CLAUDE] assisted in reviewing the draft specification, identifying ambiguities in zero-hash semantics, overlap handling, and PDU header-to-entry range relationships.

14. Informative References

[CLAUDE]
Anthropic, "Claude: AI Assistant", , <https://www.anthropic.com/claude>.
[HASHES]
Phan, C.-W., "Security considerations for incremental hash functions based on pair block chaining", .
[ID.draft-ietf-lsr-distoptflood-11]
White et al., R., "IS-IS Distributed Flooding Reduction", , <https://www.ietf.org/id/draft-ietf-lsr-distoptflood-11.txt>.
[MERKLE]
Merkle, R.C., "A Digital Signature Based on a Conventional Encryption Function", .
[SIPHASH]
Aumasson, J.-P. and D. J. Bernstein, "SipHash: A Fast Short-Input PRF", Lecture Notes in Computer Science Vol. 7668, INDOCRYPT 2012, pp. 489-508, , <https://131002.net/siphash/>.
[SKIP]
Pugh, W., "Skip lists: A probabilistic alternative to balanced trees", .

Appendix A. Reference Implementation of SIP Fragment Hashing

<CODE BEGINS>
pub fn fragment_hash(
    fragmentid: &SharedFragmentID,
    fragmentcontent: &FragmentContent,
    variant: Option<ASHFragmentHashVariant>,
    size: Option<ASHSize>,
) -> ASHHash {
    let nid = fragmentid.node.node_id().0;
    let pnodebe = fragmentid.pnode.0.to_be_bytes();
    let seqnrbe = fragmentcontent.seqnr.0.to_be_bytes();
    let fragmentnrbe = fragmentid.fragmentnr.0.to_be_bytes();
    let csumbe = fragmentcontent.IS-IS_checksum.0.to_be_bytes();
    let lenbe = fragmentcontent.IS-IS_pdu_length.0.to_be_bytes();

    let mut rotate_in_primary = nid.iter().chain(
        csumbe.iter().chain(
            seqnrbe.iter().chain(
                fragmentnrbe
                    .iter()
                    .chain(lenbe.iter().chain(pnodebe.iter())),
            ),
        ),
    );

    let size = size.unwrap_or(ASHSize::LIBRARY_ASH_SIZE);
    let variant = variant.unwrap_or(ASHFragmentHashVariant::LIBRARY_ASH_FRAGMENT_HASH);

    match variant {
        ASHFragmentHashVariant::Siphash => {

            let key = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16u8];
            let mut hasher = SipHasher13::new_with_key(&key);
            let mut sl = [0u8; 16];
            rotate_in_primary
                .map(|v| *v)
                .collect_slice_checked(&mut sl[..]);
            hasher.write(&sl);
            let r = match hasher.finish() {
                v if v == 0 => 1,
                v if v != 0 => v,
                _ => unreachable!(),
            };
            match size {
                ASHSize::_64Bits => {
                    r.into()
                },
                ASHSize::_48Bits => {
                    let hin = r ^ (r >> 48);
                    (hin & 0xffff_ffff_ffff).into()
                }

<CODE ENDS>

Authors' Addresses

Tony Przygienda
HPE Juniper Networking
Tony Li
HPE Juniper Networking
Joel Halpern
HPE Juniper Networking