Internet-Draft YANG Message Keys July 2026
Graf, et al. Expires 3 January 2027 [Page]
Workgroup:
NMOP
Internet-Draft:
draft-ietf-nmop-yang-message-broker-message-key-02
Published:
Intended Status:
Standards Track
Expires:
Authors:
TG. Graf
Swisscom
AE. Elhassany
Swisscom
AHF. Huang Feng
Deutsche Telekom
BC. Claise
Everything OPS
PL. Lucente
NTT

YANG Message Keys for Message Broker Integration

Abstract

This document specifies a mechanism to define a unique Message key for a YANG to Message Broker integration and a topic addressing scheme based on YANG-Push subscription type and YANG Schema Node Identifier. This enables YANG data consumption of a subset of subscribed YANG data, either per specific YANG data node, identifier or telemetry message type, by indexing and organizing in Message Broker topics. It helps to index the information by using data taxonomy and organizes data in partitions and shards of Message Brokers and time series databases.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 3 January 2027.

Table of Contents

1. Introduction

Nowadays network operators are using machine and human readable YANG [RFC7950] to model their configurations and monitor YANG operational data from their networks according to [Mar24].

Most network analytic use cases require real-time data and the delivery of near real-time analytical and actionable insights. This imposes high scalability, resilience and low overhead in the data processing pipeline. Accessing the right data for the right use case with minimal overhead and in the shortest period of time is therefore crucial.

Network operators organize their data in a Data Mesh [Deh22] according to [Bod24] where a Message Broker, such as Apache Kafka [Kaf11] or Apache Pulsar [Pul16], facilitates the exchange of Messages among data processing components in topics and subjects. Typically, data is being stored in Message Broker topics for several hours or days to facilitate resilience in the data processing chain and addressed in Subjects depending on Schema, enabling a data consumer to address and re-consume previously consumed data again if previously lost.

Dimensional data is structured information in a data store. It uses a model of dimension tables to organize business metrics and their descriptive context. This model, developed by Ralph Kimball [Kim96], simplifies data analysis and reporting by creating denormalized, easy-to-understand structures for quick querying. It is optimized for online analytical processing (OLAP) and data warehouses by using the data taxonomy to scale in partitions and shards. YANG [RFC7950] as a data modelling language based on hierarchical tree-based structures facilitates the modelling of dimensional data. This is best shown with YANG Tree Diagrams [RFC8340].

An Architecture for YANG-Push to Message Broker Integration [I-D.ietf-nmop-yang-message-broker-integration] specifies an architecture for integrating YANG-Push with Message Brokers for a Data Mesh architecture. Section 4.5 of [I-D.ietf-nmop-yang-message-broker-integration] describes how the notification messages at a YANG-Push Receiver are being transformed to the Message Broker while Section 3 of [I-D.ietf-nmop-message-broker-telemetry-message] specifies to a Message Schema to contextualize telemetry data. However, neither of these documents addresses how these messages should be indexed in a Message Broker, nor define how topics, partitioning and sharding must be used.

Due to this missing dimensional indexing for Message Broker stored YANG data, all YANG data is stored in one single Topic. This leads to a round robin distribution across multiple Partitions where each YANG Schema ID is defined as a subject within that topic. Therefore, the entire Topic from all Partitions needs to be consumed first before data selection can be applied. This leads to avoidable data processing overhead which in turn impairs scalability and real-time capabilities, required for certain Network Analytics use cases.

YANG telemetry data can be used for several network analytic use cases. Importantly, depending on the use case, only a subset of the subscribed YANG data might be necessary (in time or space). For example, for specific use cases, it is more important to know the current network state, as opposed to have the full series of the state changes over time. In other use cases, instead of consuming data for all network nodes, only a specific network node or network node component requires the YANG monitoring and hence subscription.

This document defines how YANG Messages [I-D.ietf-nmop-message-broker-telemetry-message] should be indexed and organized in Message Broker topics by leveraging the network node hostname, the YANG-Push subscription identifier, and concrete XPath data node instances derived from the YANG schema path for indexing. Then, a YANG-Push subscription type and YANG Schema name for a Message Broker topic naming scheme is defined to better organize YANG data.

Network node hostname and subtree and xpath filters are part of "ietf-yang-push-telemetry-message" structured YANG data defined in Section 3 of [I-D.ietf-nmop-message-broker-telemetry-message]. The Message Key is derived through a three-phase algorithm that normalizes subscription filters against the YANG schema path and extracts concrete data node instances from each notification message.

2. Conventions and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2.1. Terminology

The following terms are used as defined in [I-D.ietf-nmop-terminology]:

  • Network Telemetry
  • Network Analytics
  • Value
  • State
  • Change

The following terms are used as defined in [I-D.ietf-nmop-yang-message-broker-integration]:

  • Message Broker
  • YANG Message Broker Producer
  • YANG Message Broker Consumer
  • YANG Schema Registry
  • YANG Schema ID
  • YANG Data Consumer

The following terms are used as defined in Apache Kafka [Kaf11] and Apache Pulsar [Pul16] Message Broker:

  • Subject: Corresponds to a unique Schema Path within a Schema Registry and is used to identify Messages within a Topic.
  • Topic: A communication channel for publishing and subscribing messages with one or more subjects and partitions.
  • Topic Compaction: The act of compressing messages in a topic to the latest state. As used with Apache Pulsar. Apache Kafka uses the term Log Compaction with identical meaning.
  • Partition: Messages in a topic are spread over hash buckets where a hash bucket refers to a partition being stored within one message broker node. Message ordering is guaranteed within a partition.
  • Shard: The same as Partition but distributed among multiple message broker nodes. In this document, the term Partition is being used primarily but the described indexing concept equally applies also to Shards.
  • Message: A piece of structured data sent between data processing components to facilitate communication in a distributed system
  • Message Key: Message Key: Metadata associated with a message to facilitate deterministic hash bucketing and indexing for instantiated YANG data.

The following terms are used as defined in The Log-Structured Merge-Tree [One96] scientific paper:

  • LSM Tree: Log-Structured Merge-Tree is a data structure with performance characteristics that makes it attractive for providing indexed access to files with high insert volume. LSM trees, like other search trees, maintain key-value pairs.

The following terms are used as defined in Confluent Schema Registry Documentation [ConDoc18]:

  • Schema: A formalized, documented structure that defines the shape and content of the messages exchange.
  • Schema ID: A unique identifier of a schema associated to a Message Broker subject.
  • Schema Registry: A system where schemas are registered, compared and retrieved.

The following terms are used as defined in [RFC8641]:

  • Periodic Subscription
  • On-change Subscription
  • Sync-On-Start
  • Xpath Filter
  • Subtree Filter

The following terms are used as defined in [I-D.ietf-netconf-notif-envelope]:

  • Notification
  • Hostname

The following terms are used as defined in [RFC8342]:

  • Datastore

The following terms are used as defined in [RFC7950]:

  • Schema Node Identifier
  • Data Node: Such as "container", "leaf", "leaf-list", "list", "choice", "case", "augment", "uses", "anydata", and "anyxml" elements.

The following terms are used as defined in this document:

  • Schema Path: A specific route to a single, leaf in a schema tree. As oposed to a schema tree which represents the entire hierarchical structure of a schema, showing how all these individual paths branch out and relate to each other as a whole.

3. Solution Design

To identify which network node produced which YANG data instance into which Message Broker Topic, Partition and Subject, YANG Message Keys and Indexes (Section 3.1) are being introduced. These keys enable a deterministic distribution of YANG messages across Topics and Partitions enabling applications to consume only the needed data from specific topics and partitions.

In order to facilitate Message Broker Topic Compaction, a YANG-Push subscription type based topic naming scheme (Section 3.2) is defined. This segregates statistical (Value), State and State change YANG metrics and facilitates a YANG Message Broker Consumer to use the Topic wild card consumption method to select based on YANG-Push subscription type.

3.1. YANG Message Keys and Indexes

For topics that carry YANG telemetry messages as defined in [I-D.ietf-nmop-message-broker-telemetry-message], a Message Key MUST be used. If no Message Key is defined then the Messages are distributed in a round robin fashion across partitions. If a Message Key is defined, then the value of the Message Key is being used as input for the Message Broker Producer hash function to distribute across Partitions. Therefore, Message Keys facilitate Message deterministic distribution.

The Message Key is not only used for Message indexing at the Message Producer but also at the Message Broker for topic compaction.

For YANG, the network node hostname, the YANG-Push subscription identifier, and concrete XPath data node instances are used to generate the Message Key. The Message Key MUST be derived through the three-phase algorithm described in Section 3.1.2 to guarantee deterministic and unambiguous keys.

The following sections describe the Message Key format, the derivation algorithm, and how Message Keys are used in both Message producers and Message consumers.

3.1.1. Message Key Format

The Message Key is a UTF-8 encoded byte string consisting of exactly three fields separated by a single newline character (LF, U+000A):

  • Line 1: node-name. The managed device identifier, typically a hostname or FQDN, as defined in "ietf-yang-push-telemetry-message" Section 3 of [I-D.ietf-nmop-message-broker-telemetry-message].
  • Line 2: subscription-id. The YANG-Push subscription identifier as a decimal string.
  • Line 3: One or more concrete XPath expressions, sorted lexicographically and deduplicated, joined by " | " (space, pipe, space). Each XPath uniquely identifies one data node instance within the notification.

The key MUST NOT contain a trailing newline after the XPath line. This ensures byte-identical keys for identical inputs, which is the invariant required for Message Broker topic compaction.

Figure 1 shows a Message Key for a single interface instance on node "router-nyc-01" with subscription identifier "1042".

router-nyc-01
1042
/ietf-interfaces:interfaces/interface[name='eth0']
Figure 1: Message Key for a Single Interface Instance

Figure 2 shows a Message Key where a single notification carries two interface instances. The XPaths are sorted lexicographically and joined with " | ".

=============== NOTE: '\' line wrapping per RFC 8792 ================
router-nyc-01
1042
/ietf-interfaces:interfaces/interface[name='eth0']\
 | /ietf-interfaces:interfaces/interface[name='eth1']
Figure 2: Message Key for Multiple Interface Instances

Figure 3 shows a Message Key for a container target (no list ancestor). The XPath is the path to the container itself with no key predicates.

router-nyc-01
1042
/ietf-system:system/clock
Figure 3: Message Key for a Container Target

This line-delimited format guarantees deterministic serialization without the ambiguities of structured encodings such as JSON (where key ordering and whitespace may vary across implementations). The key is a plain byte string suitable for direct use as a Message Broker record key.

3.1.2. YANG Message Broker Producer

The Message Key is derived through a three-phase algorithm. Phase 1 is only required when the YANG-Push subscription uses a subtree filter (Section 6 of [RFC6241]); XPath subscriptions skip directly to Phase 2. Phase 2 runs once at subscription creation time. Phase 3 runs for every notification.

+-------------------+       +--------------+
| Subtree Filter    |------>| Phase 1:     |---> Normalized XPath(s)
| (if applicable)   |       | Normalize    |
+-------------------+       +--------------+
                                    |
                                    v
+-------------------+       +--------------+     +---------------+
| Subscription Path |------>|              |     | Key Templates |
| (XPath, possibly  |       |  Phase 2:    |---->| + Extraction  |
|  from Phase 1)    |       |  Schema      |     |   Specs       |
+-------------------+       |  Resolution  |     +---------------+
| YANG Schema Path  |------>|              |
+-------------------+       +--------------+
                                    |
                                    v
+-------------------+       +--------------+     +--------------+
| Parsed Data Path  |------>|              |     | Message Key  |
+-------------------+       |  Phase 3:    |---->| (line-       |
| Node Name         |------>|  Data Walk   |     |  delimited)  |
| Subscription ID   |------>|              |     |              |
+-------------------+       +--------------+     +--------------+
Figure 4: Three-Phase Message Key Derivation Algorithm
3.1.2.1. Phase 1: Subtree Filter Normalization

Section 3.6 of [RFC8641] defines how YANG data nodes can be subscribed with subtree and xpath selection filters. When a subtree filter is used, the XML representation MUST first be normalized into one or more equivalent XPath expressions before proceeding to Phase 2.

The normalization follows the element classification defined in Section 6.2 of [RFC6241]. Each child element in the subtree filter is classified based on its content:

  • Content match node: The element has text content and no child elements. It becomes an XPath predicate of the form [module:key='value'] on the parent path step.
  • Selection node: The element has no text content and no child elements. It becomes a terminal XPath branch.
  • Containment node: The element has child elements (with no text content or only whitespace). It becomes an intermediate path step and the algorithm recurses into its children.

An element whose text content consists only of whitespace (spaces, tabs, newlines) MUST NOT be treated as a content match. It is classified as a selection or containment node depending on whether it has children.

Each path segment in the output carries the YANG module-name prefix. Duplicate XPath branches MUST be deduplicated. Multiple branches are joined with " | ".

For example, the subtree filter shown in Figure 5 normalizes to the XPath expression shown in Figure 6.

<interfaces xmlns=
    "urn:ietf:params:xml:ns:yang:ietf-interfaces">
  <interface>
    <name>eth0</name>
    <oper-status/>
  </interface>
</interfaces>
Figure 5: Subtree Filter Example
/ietf-interfaces:interfaces/ietf-interfaces:interface
    [ietf-interfaces:name='eth0']
    /ietf-interfaces:oper-status
Figure 6: Normalized XPath from Subtree Filter

In this example the "name" element is classified as a content match (it has text content "eth0"), producing the predicate [ietf-interfaces:name='eth0'] on the "interface" path step. The "oper-status" element is classified as a selection node, becoming the terminal branch.

3.1.2.2. Phase 2: Key Template Derivation

Given a subscription XPath (from Phase 1 or directly from an XPath subscription) and a compiled YANG schema context, Phase 2 resolves each union branch to its target schema node, walks the ancestor chain from root to target, and builds a key template.

The subscription XPath is first split on top-level "|" into individual branches (respecting brackets, quotes, and parentheses). Each branch is processed independently.

For each branch, the algorithm resolves the XPath to its target schema node using the YANG Schema Path. It then walks the ancestor chain from the schema root to the target node, building the key template path. For each node in the path:

  • The path segment uses minimal-prefix style: the YANG module name appears only on the first segment or when the module changes from the previous segment.
  • If the node is a YANG list (Section 7.8 of [RFC7950]), key predicates are appended for each schema-defined key leaf in schema order. If the subscription XPath contains a matching predicate with a literal value, that value is embedded in the template (pinned key). Otherwise, a placeholder marker is used indicating that the value must be extracted from the notification data at runtime (open key).
  • If the target node is a YANG leaf-list, a value predicate [.='...'] is appended, either as a pinned literal or as an open placeholder.
  • YANG choice and case nodes are skipped as they are not data nodes.

The following predicate normalizations are applied during template derivation:

  • No predicate in subscription: open placeholder + extraction spec.
  • [key='value']: literal value embedded as-is (pinned).
  • [module:key='value']: module prefix stripped, literal value embedded.
  • [key="value"]: double-quote normalized to single-quote.
  • [N] (positional predicate): treated as open (positional subscripts do not identify a specific instance by key).

Figure 7 shows the key template derived from the XPath subscription "/ietf-interfaces:interfaces/interface". Because the "interface" list has a key leaf "name" and the subscription does not pin it, the template contains an open placeholder for the "name" key.

Template:    /ietf-interfaces:interfaces/interface[name='%s']
Extraction:  /ietf-interfaces:interfaces/interface/name
Figure 7: Key Template for an Open Interface Subscription

Figure 8 shows the key template when the subscription pins the outer list key but leaves the inner list open: "/ietf-interfaces:interfaces/interface[name='eth0'] /ietf-ip:ipv4/address".

Template:    /ietf-interfaces:interfaces/interface[name='eth0']/ietf-ip:ipv4/address[ip='%s']
Extraction:  /ietf-interfaces:interfaces/interface[name='eth0']/ietf-ip:ipv4/address/ip
Figure 8: Key Template with Pinned Outer Key and Open Inner Key

Phase 2 produces one key template per union branch, together with extraction specifications that describe which key leaf values must be extracted from the data tree at runtime to fill each open placeholder. Each extraction is expressed as an absolute XPath that identifies the key leaf in the data tree. The path mirrors the template path from the root to the owning list (preserving any pinned predicates on ancestor lists) and appends the key leaf name without a predicate:

/MODULE:CONTAINER/.../LIST/KEY-LEAF-NAME

When evaluated against a notification data tree, this XPath selects the key leaf value(s) of the matching list instance(s). For leaf-list targets, the extraction is simply "." (the data node's own value).

3.1.2.3. Phase 3: Runtime Key Production

For each notification, the parsed data tree is walked and matched against the branch templates from Phase 2. For each matching data node instance:

  1. The open placeholders in the key template are filled with the actual key leaf values from the data tree by evaluating the extraction XPath from Phase 2. Each extraction XPath selects the key leaf value(s) of the matching list instance(s) in the notification data.
  2. For leaf-list targets, the data node's own value fills the [.='%s'] placeholder.

As an optimization, implementations need not invoke a full XPath evaluator for each extraction. Because the extraction path always leads to a key leaf of an ancestor list, it can be rewritten to an equivalent "ancestor-or-self" axis expression evaluated relative to the matched data node. For example, the extraction "/ietf-interfaces:interfaces/interface/name" becomes "ancestor-or-self::ietf-interfaces:interface/name". This reduces evaluation to a simple upward tree walk from the matched data node to the first ancestor whose schema node matches the list name, followed by reading a direct child. This yields O(d) complexity where d is the depth of the data tree, which is typically small (3 to 8 levels for common YANG models).

All filled XPath expressions (concrete XPaths) are collected, deduplicated, and sorted lexicographically. The final Message Key is then composed as defined in Section 3.1.1: the node-name and subscription-id on separate lines, followed by the concrete XPaths joined by " | " on a third line.

Figure 9 shows the complete derivation for a notification carrying two interface instances.

=============== NOTE: '\' line wrapping per RFC 8792 ================
Subscription XPath:
  /ietf-interfaces:interfaces/interface

Phase 2 Template:
  /ietf-interfaces:interfaces/interface[name='%s']

Notification data (XML):
  <interfaces xmlns=
      "urn:ietf:params:xml:ns:yang:ietf-interfaces">
    <interface>
      <name>eth0</name>
      <oper-status>up</oper-status>
    </interface>
    <interface>
      <name>eth1</name>
      <oper-status>down</oper-status>
    </interface>
  </interfaces>

Concrete XPaths (sorted):
  /ietf-interfaces:interfaces/interface[name='eth0']
  /ietf-interfaces:interfaces/interface[name='eth1']

Message Key:
  router-nyc-01
  1042
  /ietf-interfaces:interfaces/interface[name='eth0']\
        | /ietf-interfaces:interfaces/interface[name='eth1']
Figure 9: End-to-End Example: Subscription to Key

When the subscription targets a YANG container (with no list ancestor), there are no open placeholders and no instances to match. In that case, the key template itself is used as the single concrete XPath in the Message Key.

3.1.2.4. Subtree Filter End-to-End Example

Figure 10 illustrates the complete three-phase derivation starting from a subtree filter subscription that spans two YANG modules. The filter selects the "oper-status" leaf of interface "eth0" from "ietf-interfaces" [RFC8343] (concrete branch, key pinned in the filter) and the "serial-num" leaf of every hardware component from "ietf-hardware" [RFC8348] (non-concrete branch, no key value in the filter). Phase 1 produces a union of two branches. Phase 2 derives one fully-concrete template (key pinned) and one template with an open placeholder for the component "name" key. Phase 3 evaluates the open placeholder against the notification data, expanding the single template into one concrete XPath per matching component instance.

=============== NOTE: '\' line wrapping per RFC 8792 ================
Subtree filter:
  <interfaces xmlns=
      "urn:ietf:params:xml:ns:yang:ietf-interfaces">
    <interface>
      <name>eth0</name>
      <oper-status/>
    </interface>
  </interfaces>
  <hardware xmlns=
      "urn:ietf:params:xml:ns:yang:ietf-hardware">
    <component>
      <serial-num/>
    </component>
  </hardware>

Phase 1 (Normalize):
  /ietf-interfaces:interfaces/ietf-interfaces:interface[ietf-interfaces\
        :name='eth0']/ietf-interfaces:oper-status
  | /ietf-hardware:hardware/ietf-hardware:component/ietf-hardware\
        :serial-num

Phase 2 (Two branch templates):
  Branch 0 (leaf target, key pinned):
    /ietf-interfaces:interfaces/interface[name='eth0']/oper-status
    (fully concrete)

  Branch 1 (leaf target, key open):
    /ietf-hardware:hardware/component[name='%s']/serial-num
    Extraction:
      /ietf-hardware:hardware/component/name
    (open placeholder: name)

Notification data (XML):
  <interfaces xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
    <interface>
      <name>eth0</name>
      <oper-status>up</oper-status>
    </interface>
  </interfaces>
  <hardware xmlns="urn:ietf:params:xml:ns:yang:ietf-hardware">
    <component>
      <name>chassis</name>
      <serial-num>SN-12345</serial-num>
    </component>
    <component>
      <name>fan-1</name>
      <serial-num>SN-67890</serial-num>
    </component>
  </hardware>

Phase 3 (Key production):

Concrete XPaths (sorted):
  /ietf-hardware:hardware/component[name='chassis']/serial-num
  /ietf-hardware:hardware/component[name='fan-1']/serial-num
  /ietf-interfaces:interfaces/interface[name='eth0']/oper-status

Message Key:
  router-nyc-01
  1042
  /ietf-hardware:hardware/component[name='chassis']/serial-num | /ietf-hardware:hardware/component[name='fan-1']/serial-num | /ietf-interfaces:interfaces/interface[name='eth0']/oper-status
Figure 10: End-to-End Example Starting from Subtree Filter

When the Message is being produced to the Message Broker, the network node hostname is used from the structured YANG data defined in "ietf-yang-push-telemetry-message" Section 3 of [I-D.ietf-nmop-message-broker-telemetry-message]. The concrete XPath expressions are derived from the subscription filter and the YANG Schema Path as described above, and then instantiated with values from each notification data path. Figure 11 represent the Message Key and and the Message Broker headers with the Schema ID and contend type of the Message and Figure 12 the Message itself.

NOTE: The Key is a two-line byte string; the line break between the
      node hostname and the YANG path is a literal LF (U+000A).

Key:
  router-nyc-01
  1042
  /ietf-interfaces:interfaces/interface[name='eth0']

Headers:
  schema-id:     1
  content-type:  application/yang-data+json
Figure 11: Telemetry Message Broker Key and Header Example
{
 "ietf-telemetry-message:message": {
   "network-node-manifest": {
     "name": "pe1",
     "vendor": "open source",
     "software-version": "1.1.2"
   },
    "telemetry-message-metadata": {
      "node-export-timestamp": "2024-02-14T12:10:10.10+01:00",
      "collection-timestamp": "2024-02-14T12:10:10.12+01:00",
      "session-protocol": "yang-push",
      "notification-event": "log",
      "export-address": "192.168.1.100",
      "export-port": 123,
      "collection-address": "192.168.1.1",
      "collection-port": 9991,
      "ietf-yang-push-telemetry-message:yang-push-subscription": {
        "id": 89,
        "xpath-filter": "/ietf-interfaces:interfaces",
        "datastore": "ietf-datastores:operational",
        "transport": "ietf-udp-notif-transport:udp-notif",
        "encoding": "ietf-subscribed-notifications:encode-json",
        "module": [
          {
            "module": "ietf-interfaces",
            "revision": "2018-02-20",
            "version": "2.0.0"
          }
        ],
        "yang-library-content-id": "abc"
      }
    },
    "data-collection-manifest": {
      "name": "collector-1",
      "vendor": "open source",
      "software-version": "2.1.0"
    },
    "network-operator-metadata": {
      "labels": [
        {
          "name": "platform-name",
          "string-value": "name"
        }
      ]
    },
    "payload": {
      "ietf-yang-push:push-update": {
        "id": 1042,
        "datastore-contents": {
          "ietf-interfaces:interfaces": {
            "interface": [
              {
                "name": "eth0",
                "oper-status": "down"
              }
            ]
          }
        }
      }
    }
  }
 }
}
Figure 12: Telemetry Message Example

3.1.3. YANG Message Broker Consumer

The consumer hashes the Message Key, applies modulo with the number of partitions, and determines the partition from which it should consume messages bearing that Message Key.

To parse the Message Key, the consumer splits the byte string on newline (LF) characters. The first line is the node-name, the second is the subscription-id, and the third line contains the concrete XPath expression(s) joined by " | ".

At a YANG data store, such as a Time Series database or stream processor, the YANG data could then be ingested into tables according to topic names and indexed per Message Key. If Topic Compaction is enabled, only current state is consumed.

3.1.4. Time Series Database

Depending if the YANG Data Consumer knows the Message Key from the YANG Message Broker Consumer or the YANG Schema from the YANG Schema Registry the network telemetry messages can be indexed in a Time series database. The Message Key could serve as the primary key, while the individual fields (node-name, subscription-id, concrete XPaths) can be reflected in the indexing scheme using primary and secondary keys in a time series database. Implementation examples can be found under Section 5.

3.2. YANG-Push Message Broker Topic Naming

Each YANG-Push subscription requires a deterministic, human-readable Message Broker topic name. The topic name MUST satisfy the following requirements:

  • Deterministic: The same subscription, regardless of syntactic form (XPath vs. subtree filter, redundant module prefixes, single-quoted vs. double-quoted predicates), MUST always produce the same topic name.
  • Unique: Two subscriptions targeting different YANG schema nodes MUST NOT share a topic name.
  • Human-readable: An operator inspecting a topic listing SHOULD be able to identify which YANG data the topic carries.
  • Stable under schema evolution: Augmenting the YANG schema with new nodes MUST NOT change existing topic names. Any optimization that depends on the current set of schema siblings (e.g. dropping zero-entropy wrapper containers or using shortest-unique-prefix abbreviation) is therefore unsafe and MUST NOT be used.
  • Within Message Broker limits: Topic names MUST contain only characters permitted by the Message Broker (for Apache Kafka: [a-zA-Z0-9._-], maximum 249 characters).

The topic name is derived from the Phase 2 key template (see Section 3.1.2.2) through a purely mechanical transformation of the YANG schema DATA path. No additional schema resolution is needed beyond Phase 2.

3.2.1. Topic Name Derivation Algorithm

The input is the Phase 2 key template for one branch of the subscription. For union subscriptions with multiple branches, each branch produces its own topic name independently.

The derivation proceeds in five steps:

  1. Strip Predicates: Remove all predicate expressions ([...]) from the key template. This yields the schema DATA path, the structural identity of the subscription target, independent of any specific instance. For example, "/ietf-interfaces:interfaces/interface[name='%s']" becomes "/ietf-interfaces:interfaces/interface".
  2. Replace Module Names with YANG Prefixes: Walk the path segments. Wherever a segment carries a "module-name:local-name" prefix, look up the module in the YANG schema context and substitute its Section 6.5 of prefix statement [RFC7950]. YANG module prefixes are short by convention (2-6 characters), unique within any loaded schema context, and immutable once a module is published (Section 4.1 of [RFC7950]). For example, "/ietf-interfaces:interfaces/interface" becomes "/if:interfaces/interface".
  3. Flatten to Topic Name: Apply three mechanical substitutions: (a) remove the leading "/", (b) replace every ":" with "-", (c) replace every "/" with "-". For example, "if:interfaces/interface" becomes "if-interfaces-interface". All resulting characters ([a-z0-9-]) are valid in Message Broker topic names.
  4. Prepend Organization Prefix (Optional): When an organization, team, or project prefix is configured, it is prepended with a "-" separator. For example, "if-interfaces-interface" becomes "netops-if-interfaces-interface". The prefix MUST contain only Message Broker safe characters. When no prefix is configured, this step is a no-op.
  5. Handle Overflow: If the resulting name exceeds the maximum topic name length (configurable, default 255), it is truncated at the last "-" boundary that keeps the name within the budget, and an 8-character hexadecimal hash suffix (FNV-1a 64-bit of the full Schema Path) is appended for uniqueness. In practice, overflow rarely triggers — the longest realistic YANG paths produce topic names of 50-80 characters.

Figure 13 shows the derivation for three subscription paths.

=============== NOTE: '\' line wrapping per RFC 8792 ================
 Subscription XPath                    | Topic Name
---------------------------------------+--------------------------------
/ietf-interfaces:interfaces/interface  | if-interfaces-interface
/ietf-interfaces:interfaces/interface\ | if-interfaces-interface-oper\
/oper-status                           | -status
/ietf-system:system/clock              | sys-system-clock
/ietf-system:system/dns-resolver/server| sys-system-dns-resolver-server
Figure 13: Topic Name Derivation Examples

Figure 14 shows the same paths with an organization prefix "netops".

 Subscription XPath                    | Topic Name
---------------------------------------+--------------------------------
/ietf-interfaces:interfaces/interface  | netops-if-interfaces-interface
/ietf-interfaces:interfaces/interface\ | netops-if-interfaces-interface\
/oper-status                           | -oper-status
/ietf-system:system/clock              | netops-sys-system-clock
Figure 14: Topic Names with Organization Prefix

3.2.2. Properties

The topic naming algorithm has the following properties:

  • The mapping is injective (one-to-one): given a topic name, the original Schema Path can be reconstructed by reversing the substitutions. Different Schema Paths always produce different topic names.
  • The topic name depends only on the subscription's own path segments and their YANG module prefixes. It does not depend on what other nodes exist at the same schema level. Augmenting the schema adds new topic names but never changes existing ones.
  • The "-" separator creates a natural hierarchy for pattern matching. For example, "^if-interfaces-interface-.*" matches all interface leaf topics, and "^if-.*" matches all topics from the ietf-interfaces module.

3.2.3. YANG Message Broker Producer

The YANG Message Broker Producer derives the topic name from the YANG-Push subscription's xpath or subtree filter by running Phases 1 and 2 (as described in Section 3.1.2) and then applying the topic name derivation algorithm above. The subscription type ("periodic", "on-change", or "on-change" with "sync-on-start") MAY be encoded in a separate topic hierarchy level, depending on the deployment's naming policy. Where "periodic" is encoded as "stats", "on-change" as "state-change", "on-change" with "sync-on-start" as "state" and "on-change" with "sync-on-start" whre topic compaction is enabled as "current-state".

3.2.4. YANG Message Broker Consumer

The consumer can subscribe to multiple topics using wildcard or regex patterns. For example:

  • All interface data: "^if-interfaces-interface.*"
  • All data from a specific module: "^if-.*" (ietf-interfaces) or "^sys-.*" (ietf-system)
  • All data from an organization: "^netops-.*"
  • All current-state data from an organization: "^netops-current-state-.*"

The YANG data is then ingested into tables according to topic names and indexed per Message Key. If Topic Compaction is enabled, only the current state is consumed.

4. Message Broker Implementations

Topic, Partitioning and Message Keys are generic concepts of Message Brokers. There are two known Message Broker implementations supporting all features described in this document.

4.1. Apache Kafka

Apache Kafka supports Message Keys, Partitioning and Log Compaction.

With the following example from the Apache Kafka admin client API https://kafka.apache.org/41/javadoc/org/apache/kafka/clients/admin/Admin.html a new compacted Topic can be created.

Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

try (Admin admin = Admin.create(props)) {
 String topicName = "my-topic";
 int partitions = 12;
 short replicationFactor = 3;
 // Create a compacted topic
 CreateTopicsResult result = admin.createTopics(Collections.singleton(
  new NewTopic(topicName, partitions, replicationFactor)
   .configs(Collections.singletonMap(TopicConfig.CLEANUP_POLICY_CONFIG,/
   TopicConfig.CLEANUP_POLICY_COMPACT))));

 // Call values() to get the result for a specific topic
 // KafkaFuture<Void> future = result.values().get(topicName);

 // Call get() to block until the topic creation is complete or has
 // failed if creation failed the ExecutionException wraps the
 // underlying cause. future.get();
}

The most important configuration items from https://kafka.apache.org/41/configuration/topic-configs/ are "topicName" defines the Topic name, "partitions" the amount of partitions, "replicationFactor" how many times the partition is being replicated.

With "compact" in "cleanup.policy" the log compaction can be turned on per topic. With "min.cleanable.dirty.ratio" and "delete.retention.ms" how often and when Log Compaction should occur per topic. Where with "retention.bytes" and with "retention.ms" the topic specific compaction configurations can be limited how often the topics are compacted.

The topic names are constrained to 249 character length and the following characters: "a-z", "A-Z", "0-9", ".", "_" and "-". Topics can be created on the fly by producing into a new Topic when "auto.create.topics.enable" has been configured prior. Topics should be deleted at the end of the lifecycle through the "kafka-topics.sh" command.

The Partition count for a given Topic can be increased but not decreased. Consumer groups are automatically re-joined and partitions are being rebalanced on Message Broker nodes when Partition count changed.

4.2. Apache Pulsar

Apache Pulsar supports Message Keys, Partitioning and Topic Compaction.

With "brokerServiceCompactionThreshold" when Topic Compaction should occur is being configured.

The topic names allow all characters except: "/". Topics can be created on the fly by producing into a new Topic when "allowAutoTopicCreation" has been configured prior. Topics should be deleted at the end of the lifecycle through pulsar-admin or pulsarctl tools.

The Partition count for a given Topic can be increased but not decreased. Consumer groups are automatically re-joined and partitions are being rebalanced on Message Broker nodes when Partition count changed.

5. Time Series Database Implementations

Tables, partition and keys are generic concepts of time series databases. With ClickHouse, this document provides examples of how YANG message keys can be obtained from the Message Broker and used for indexing.

5.1. ClickHouse

5.1.1. Data Model

Unlike other realtime analytics databases, ClickHouse does not (necessarily) rely on partitioning data by timestamp. ClickHouse represents data in the MergeTree format, which is similar to a LSM tree:

A table consists of data parts sorted by primary key.

When data is inserted in a table, separate data parts are created and each of data part is lexicographically sorted by primary key. For example, if the primary key is ("MessageKey", "Date"), the data in the part is sorted by "MessageKey", and within each "MessageKey", it is ordered by "Date".

Data belonging to different partitions are separated into different parts. In the background, ClickHouse merges data parts for more efficient storage. Parts belonging to different partitions are not merged. The merge mechanism does not guarantee that all rows with the same primary key will be in the same data part.

Each data part is logically divided into granules. A granule is the smallest indivisible data set that ClickHouse reads when selecting data. ClickHouse does not split rows or values, so each granule always contains an integer number of rows. The first row of a granule is marked with the value of the primary key for the row. For each data part, ClickHouse creates an index file that stores the marks. For each column, whether it's in the primary key or not, ClickHouse also stores the same marks. These marks let you find data directly in column files.

Thus, it is possible to quickly run queries on one or many ranges of the primary key.

5.1.2. Message Broker Integration

ClickHouse integrates with Message Brokers through Integration Table Engines.

Reading (selecting) data through Kafka Table Engine follows Apache Kafka semantics of advancing the offset, so subsequent reads will start at the offset the previous read left off.

It is the responsibility of the data model designer to transfer data to a regular table:

  • Use the engine to create a Kafka consumer and consider it a data stream.

Example:

          CREATE TABLE queue (
              timestamp UInt64,
              level String,
              message String
          )
          ENGINE = Kafka
          SETTINGS kafka_broker_list = 'localhost:9092',
              kafka_topic_list = 'topic',
              kafka_group_name = 'group1',
              kafka_format = 'JSONEachRow',
              kafka_num_consumers = 4;
  • Create a table with the desired structure.

Example:

          CREATE TABLE messages (
              key String,
              timestamp UInt64,
              level String,
              message String
          )
          ENGINE = MergeTree
          ORDER BY (key, timestamp);
  • Create a materialized view that converts data from the engine and puts it into a previously created table.
          CREATE MATERIALIZED VIEW mv_messages TO messages AS
          SELECT
              _key AS key,
              timestamp,
              level,
              message
          FROM queue;

The Message Key and partition ID are available as virtual (read only) columns _key and _partition.

5.1.3. Message Formats

ClickHouse supports numerous Message formats natively. The example above uses the JSON Lines format but other (binary) formats, such as Apache Avro or Protobuf, are supported as well.

5.1.4. Schema Registry

ClickHouse has built in Schema Registry support. For Apache Avro, the Schema Registry and authentication are encoded in additional parameters to the Apache Kafka consumer.

For formats such as Confluent JSON_SR, use the "kafka_schema_registry_skip_bytes" parameter to skip reading the Schema Registry preamble. The Schema can then be encoded explicitly.

6. IANA Considerations

This document includes no request to IANA.

7. Security Considerations

This document should not affect the security of the Internet.

8. Operational Considerations

The YANG Message Broker Producer of a YANG-Push receiver should have three config knobs facilitate the features described in this document as optional:

Subject distribution enables message ordering for a set of YANG Message Keys on each partition. Where in topic distribution messages are randomly being distributed among partitions.

To accommodate for potential date loss throughout the data processing pipeline, periodic update of the current State for State metrics is RECOMMENDED. This can be accommodated with YANG-Push as defined in [RFC8641] by complementing "on-change sync on start" subscriptions with "periodic" subscriptions. Alternatively, in YANG-Push Lite defined in Section 7.6 of [I-D.wilton-netconf-yang-push-lite] this simplified in one subscription.

9. Implementation status

This section provides pointers to existing open source implementations of this draft. Note to the RFC-editor: Please remove this before publishing.

9.1. yang-push-key

A prove of concept implementing the three-phase algorithm described in Section 3.1.2.

The open source code can be accessed here: [yang-push-key].

10. References

10.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/info/rfc2119>.
[RFC6241]
Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed., and A. Bierman, Ed., "Network Configuration Protocol (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, , <https://www.rfc-editor.org/info/rfc6241>.
[RFC7950]
Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language", RFC 7950, DOI 10.17487/RFC7950, , <https://www.rfc-editor.org/info/rfc7950>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/info/rfc8174>.
[RFC8342]
Bjorklund, M., Schoenwaelder, J., Shafer, P., Watsen, K., and R. Wilton, "Network Management Datastore Architecture (NMDA)", RFC 8342, DOI 10.17487/RFC8342, , <https://www.rfc-editor.org/info/rfc8342>.
[RFC8641]
Clemm, A. and E. Voit, "Subscription to YANG Notifications for Datastore Updates", RFC 8641, DOI 10.17487/RFC8641, , <https://www.rfc-editor.org/info/rfc8641>.
[RFC8792]
Watsen, K., Auerswald, E., Farrel, A., and Q. Wu, "Handling Long Lines in Content of Internet-Drafts and RFCs", RFC 8792, DOI 10.17487/RFC8792, , <https://www.rfc-editor.org/info/rfc8792>.
[I-D.ietf-nmop-message-broker-telemetry-message]
Elhassany, A., Graf, T., and P. Lucente, "Extensible YANG Model for Network Telemetry Messages", Work in Progress, Internet-Draft, draft-ietf-nmop-message-broker-telemetry-message-04, , <https://datatracker.ietf.org/doc/html/draft-ietf-nmop-message-broker-telemetry-message-04>.
[I-D.ietf-netconf-notif-envelope]
Feng, A. H., Francois, P., Graf, T., and B. Claise, "Extensible YANG Model for YANG-Push Notifications", Work in Progress, Internet-Draft, draft-ietf-netconf-notif-envelope-05, , <https://datatracker.ietf.org/doc/html/draft-ietf-netconf-notif-envelope-05>.

10.2. Informative References

[RFC8340]
Bjorklund, M. and L. Berger, Ed., "YANG Tree Diagrams", BCP 215, RFC 8340, DOI 10.17487/RFC8340, , <https://www.rfc-editor.org/info/rfc8340>.
[RFC8343]
Bjorklund, M., "A YANG Data Model for Interface Management", RFC 8343, DOI 10.17487/RFC8343, , <https://www.rfc-editor.org/info/rfc8343>.
[RFC8348]
Bierman, A., Bjorklund, M., Dong, J., and D. Romascanu, "A YANG Data Model for Hardware Management", RFC 8348, DOI 10.17487/RFC8348, , <https://www.rfc-editor.org/info/rfc8348>.
[I-D.ietf-nmop-terminology]
Davis, N., Farrel, A., Graf, T., Wu, Q., and C. Yu, "Some Key Terms for Network Fault and Problem Management", Work in Progress, Internet-Draft, draft-ietf-nmop-terminology-23, , <https://datatracker.ietf.org/doc/html/draft-ietf-nmop-terminology-23>.
[I-D.ietf-nmop-yang-message-broker-integration]
Graf, T. and A. Elhassany, "An Architecture for YANG-Push to Message Broker Integration", Work in Progress, Internet-Draft, draft-ietf-nmop-yang-message-broker-integration-12, , <https://datatracker.ietf.org/doc/html/draft-ietf-nmop-yang-message-broker-integration-12>.
[I-D.wilton-netconf-yang-push-lite]
Wilton, R., Keller, H., Claise, B., Aries, E., Cumming, J., and T. Graf, "YANG Datastore Telemetry (YANG Push Lite)", Work in Progress, Internet-Draft, draft-wilton-netconf-yang-push-lite-02, , <https://datatracker.ietf.org/doc/html/draft-wilton-netconf-yang-push-lite-02>.
[Mar24]
Martinez-Casanueva, I. D., Gonzalez-Sanchez, D., Bellido, L., Fernandez, D., and D. R. Lopez, "Toward Building a Semantic Network Inventory for Model-Driven Telemetry", IEEE, DOI 10.1109/MCOM.001.2200222, , <https://arxiv.org/html/2402.06511v1>.
[Bod24]
Bode, J., Kühl, N., Kreuzberger, D., and C. Holtmann, "Toward Avoiding the Data Mess: Industry Insights From Data Mesh Implementations", IEEE, DOI 10.1109/ACCESS.2024.3417291, , <https://arxiv.org/html/2302.01713v4>.
[Deh22]
Dehghani, Z., "Data Mesh", O'Reilly Media, ISBN 9781492092391, , <https://www.oreilly.com/library/view/data-mesh/9781492092384/>.
[Kim96]
Kimball, R. and M. Ross, "The Data Warehouse Toolkit", Wiley, DOI 10.1007/s002360050048, , <https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/books/data-warehouse-dw-toolkit/>.
[One96]
O'Neil, P., Cheng, E., Gawlick, D., and E. O'Neil, "The Log-Structured Merge-Tree", Acta Informatica, ISBN 9781118530801, , <https://www.cs.umb.edu/~poneil/lsmtree.pdf>.
[Kaf11]
Narkhede, N., "Apache Kafka", Apache Software Foundation, , <https://kafka.apache.org/>.
[Pul16]
Guo, S. and M. Merli, "Apache Pulsar", Apache Software Foundation, , <https://pulsar.apache.org/>.
[ConDoc18]
Yokota, R., "Confluent Schema Registry Documentation", Confluent Community and Apache Software Foundation, , <https://docs.confluent.io/platform/current/schema-registry/>.
[yang-push-key]
Elhassany, A., "yang-push-key", <https://github.com/ahassany/yang-push-key>.

Acknowledgements

Thanks to Camilo Cardona, Rob Wilton, Holger Keller, Reshad Rahman, Nigel Davis, Olga Havel and Michael Mackey for their comments and reviews.

We also like to thank Victor Lopez for the initial idea on the network controller use case. Ashley Woods, Sivakumar Sundaravadivel and Rafael Julio for the idea of grouping topics by YANG-Push subscription type and insisting that Topic Compaction is a key enabler for inventory metrics and YANG data consumer integration and should be supported day 1. Nigel Davis for confirming that Topic Compaction simplifies indeed data processing system architecture and Loic Monney for the operational configuration and monitoring details on Apache Kafka.

Contributors

Many thanks goes to Hellmar Becker who contributed Section 3.1.4 and Section 5 on how YANG Message Keys can be obtained from Message Broker, how time series databases can use it for indexing YANG data and example implementation in ClickHouse.

Hellmar Becker
ClickHouse
601 Marshall Street
Redwood City, CA 94063
United States of America

Authors' Addresses

Thomas Graf
Swisscom
Binzring 17
CH-8045 Zurich
Switzerland
Ahmed Elhassany
Swisscom
Binzring 17
CH-8045 Zurich
Switzerland
Alex Huang Feng
Deutsche Telekom
Barcelona
Spain
Benoît Claise
Everything OPS
Liege
Belgium
Paolo Lucente
NTT
Veemweg 23
3771 Barneveld
Netherlands