Internet-Draft HPC/AI scheduler job metadata June 2026
Xiong, et al. Expires 31 December 2026
Workgroup:
teas
Internet-Draft:
draft-xkk-teas-hpc-scheduler-job-metadata-00
Published:
Intended Status:
Standards Track
Expires:
31 December 2026
Authors:
Q. Xiong
ZTE Corporation
K. Kompella
HPE
D. King
Lancaster University

HPC/AI Scheduler Job Metadata Model

Abstract

This document defines a scheduler-facing metadata model for High Performance Computing (HPC) and AI workloads. The model captures common job, workload, scheduler, tenant, timing, and task metadata that can be mapped from heterogeneous workload managers and orchestration platforms and used as context for network service intent.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 31 December 2026.

1. Introduction

HPC and AI workloads are commonly managed by workload managers and orchestration systems such as batch schedulers, Kubernetes-based training systems, workflow engines, and higher-level AI platforms. These systems maintain metadata about jobs, tasks, users, tenants, timing, resource requests, and workload structure.

Examples of such systems include HPC workload managers such as Slurm, PBS Pro/OpenPBS, IBM Spectrum LSF, and Grid Engine-style schedulers, as well as AI and machine learning orchestration platforms based on Kubernetes, Kubeflow, Ray, Volcano, Kueue, Red Hat OpenShift AI, NVIDIA Base Command Manager, and NVIDIA Run:ai. These examples are illustrative; the model is intended to be independent of any specific scheduler or orchestration platform.

The requirements reflected in this model are derived from the types of information commonly exposed by such workload schedulers and AI orchestration platforms, including workload identity, job structure, task or role information, timing, placement context, tenant or project context, and correlation identifiers. The intent is to carry the network-relevant subset of this information without requiring the network domain to adopt the native data model of any one scheduler.

The representation of this metadata is platform-specific. For example, an HPC scheduler may identify jobs using scheduler-local job identifiers and queues, while a Kubernetes-based AI platform may use namespaces, custom resources, pod sets, and workload admission objects. A common metadata model allows the network-relevant portions of these platform-specific job descriptions to be represented in a consistent form.
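As a non-normative illustration of this mapping, the following Python sketch converts two platform-specific job descriptions into the same common structure. The input key names for the Slurm-style record ("job_id", "partition", "account", and so on) are hypothetical and chosen only for this example; the Kubernetes-style input assumes the usual object metadata fields (name, namespace, uid). The output keys follow the leaf names defined in this document.

```python
def from_slurm(record):
    """Map a hypothetical Slurm-style job record to the common model."""
    return {
        "scheduler": {"scheduler-type": "slurm"},
        "submitter": {
            "user-id": record["user"],
            "account-id": record["account"],
        },
        # A batch queue or partition maps to the "queue" leaf.
        "workload": {"queue": record["partition"]},
        "job": {
            "job-id": str(record["job_id"]),
            "job-name": record["name"],
        },
    }


def from_kubernetes(obj):
    """Map a hypothetical Kubernetes-style object to the common model."""
    meta = obj["metadata"]
    return {
        "scheduler": {"scheduler-type": "kubernetes"},
        # A Kubernetes namespace maps to the "namespace" leaf.
        "submitter": {"namespace": meta["namespace"]},
        "workload": {"workload-name": meta["name"]},
        "job": {"job-id": meta["uid"], "job-name": meta["name"]},
    }


slurm_view = from_slurm(
    {"job_id": 4242, "name": "cfd-run", "user": "alice",
     "account": "phys", "partition": "batch"}
)
k8s_view = from_kubernetes(
    {"metadata": {"uid": "0f3a", "name": "llm-train",
                  "namespace": "ml-training"}}
)
```

Both results use the same top-level containers, so a consumer in the network domain can process them without knowledge of the originating platform's native data model.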

The broader HP-WAN context and current deployment considerations are described in [I-D.kcrh-hpwan-state-of-art] and [I-D.xhy-hpwan-framework]. This document focuses on the scheduler and job metadata needed to relate workload context to that network environment.

Related work on machine learning cluster scheduling, including [I-D.kompella-rtgwg-mlnwsched], illustrates that job timing, placement, and resource context can be relevant beyond the compute scheduler itself. This document provides a platform-neutral way to carry scheduler and job metadata that can be used for correlation with network service intent.

This document defines a YANG model for scheduler and job metadata. It does not define the requested network service itself and does not define how that service is realized in the network. The metadata defined here is intended to be used by a service intent model that expresses the desired connectivity outcome for the workload.

2. Conventions Used in This Document

2.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Terminology

This document defines common terminology used by the HPC/AI scheduler job metadata model, the HPC/AI service intent model, and the HPC/AI tunnel realization model.

Workload:
A unit of work submitted to, or managed by, a workload manager or orchestration platform. A workload can be an HPC batch workload, an AI training workload, an inference workload, a data movement workflow, or another scheduled application-level activity.
Job:
A scheduler-visible execution object associated with a workload. A job is identified by the originating scheduler or orchestration platform and can contain one or more tasks, roles, replicas, or execution units.
Task:
A component of a job that represents a schedulable or executable part of the workload. Examples include an HPC task, an MPI rank group, a training worker, a parameter-server role, or a workflow stage.
Scheduler:
A workload manager or orchestration system that creates, admits, places, or manages workloads and jobs. Examples include HPC batch schedulers and Kubernetes-based AI orchestration systems.
Scheduler Job Metadata:
Platform-neutral context describing the originating scheduler, submitter, workload, job, task structure, and timing information. Scheduler job metadata identifies and describes the workload but does not request network connectivity.
Service Intent:
A request for a network service associated with a workload or job. Service intent describes the desired connectivity outcome, including endpoints, communication pattern, timing, data movement, performance objectives, policy preferences, and admission state. It does not prescribe the network mechanism used to realize the service.
Tunnel Realization:
The network-side realization of an admitted service intent. A tunnel realization can reference tunnels, paths, policy, protection, resource allocation, lifecycle state, and performance monitoring associated with the service intent.
Correlation Identifier:
An identifier used to associate scheduler job metadata, service intent, and tunnel realization state across systems that may use different native identifiers.

4. Model Scope

The scheduler job metadata model provides workload context that can be consumed by a network service intent system. It includes identifiers and descriptive attributes that allow a network controller, orchestrator, or broker to correlate a network service request with the originating workload manager and job.

The model is intended to be independent of a specific workload manager. Platform-specific identifiers are carried as metadata and do not imply that the network controller understands the internal scheduling behavior of the originating platform.

This model is intended to provide a stable boundary between workload scheduling systems and IETF-defined interfaces used by data center and inter-data-center network orchestration systems.

5. Model Structure

      module: ietf-hpc-scheduler-job-metadata
        +--rw hpc-scheduler-job-metadata
           +--rw scheduler
           |  +--rw scheduler-id?            string
           |  +--rw scheduler-name?          string
           |  +--rw scheduler-type?          identityref
           |  +--rw platform-instance?       string
           +--rw submitter
           |  +--rw tenant-id?               string
           |  +--rw project-id?              string
           |  +--rw namespace?               string
           |  +--rw user-id?                 string
           |  +--rw account-id?              string
           +--rw workload
           |  +--rw workload-id?             string
           |  +--rw workload-name?           string
           |  +--rw workload-type?           identityref
           |  +--rw framework?               identityref
           |  +--rw priority?                priority-type
           |  +--rw queue?                   string
           |  +--rw correlation-id?          string
           +--rw job
           |  +--rw job-id?                  string
           |  +--rw job-name?                string
           |  +--rw job-array-id?            string
           |  +--rw job-size?                uint32
           |  +--rw task* [task-id]
           |     +--rw task-id               string
           |     +--rw task-name?            string
           |     +--rw task-role?            identityref
           |     +--rw task-index?           uint32
           +--rw timing
              +--rw submit-time?             yang:date-and-time
              +--rw earliest-start-time?     yang:date-and-time
              +--rw requested-start-time?    yang:date-and-time
              +--rw deadline?                yang:date-and-time
              +--rw requested-duration?      uint32
              +--rw duration-unit?           identityref

              Figure 1: Scheduler job metadata model structure

6. Relationship to Other Models

The scheduler job metadata, service intent, and tunnel realization models are related hierarchically:

* Scheduler job metadata in this document identifies and describes the workload.

* A service intent, as defined in draft-xkk-teas-hpc-service-intent, identifies the network service requested for that workload.

* A tunnel realization, as defined in draft-xkk-teas-hpc-tunnel-realization, identifies the network resources used to realize an admitted service intent.

      .----------------------------.
      | Scheduler/Job Metadata     |
      | workload-id, job-id,       |
      | task-id, correlation-id    |
      '-------------+--------------'
                    |
                    | referenced by
                    v
      .-------------+--------------.
      | Service Intent             |
      | intent-id, workload-ref,   |
      | endpoints, objectives      |
      '-------------+--------------'
                    |
                    | admitted and realized by
                    v
      .-------------+--------------.
      | Tunnel Realization         |
      | realization-id, intent-ref,|
      | tunnel/path references     |
      '----------------------------'

              Figure 2: Relationship between the models

A workload or job can have zero or more service intent instances. A service intent instance can have zero or more tunnel realization instances. A tunnel realization instance is associated with one service intent instance, although the underlying network service may use one or more tunnels, paths, or technology-specific constructs.

The scheduler job metadata model provides context for a separate service intent request. A service intent instance can refer to the metadata instance using a workload identifier, job identifier, or correlation identifier. This separation allows multiple service intent requests to be associated with a single workload, and allows one service intent request to be updated or replaced without changing the scheduler metadata.
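The reference resolution described above can be sketched as follows. This is a non-normative Python illustration: the "metadata-ref" structure with "kind" and "value" members is a hypothetical stand-in for whatever reference form the service intent model defines, and the identifier values are taken from the example in Appendix A.

```python
def build_index(instances):
    """Index metadata instances by each identifier an intent may use."""
    index = {}
    for inst in instances:
        workload = inst.get("workload", {})
        job = inst.get("job", {})
        keys = (
            ("workload-id", workload.get("workload-id")),
            ("correlation-id", workload.get("correlation-id")),
            ("job-id", job.get("job-id")),
        )
        for kind, value in keys:
            if value is not None:
                index[(kind, value)] = inst
    return index


def resolve(index, intent):
    """Return the metadata instance an intent refers to, or None."""
    ref = intent["metadata-ref"]  # hypothetical reference structure
    return index.get((ref["kind"], ref["value"]))


metadata = {
    "workload": {"workload-id": "distributed-training-001",
                 "correlation-id": "corr-ai-training-001"},
    "job": {"job-id": "job-2026-04-23-001"},
}
index = build_index([metadata])

# Two intents for the same workload, using different identifiers.
intent_a = {"intent-id": "intent-1",
            "metadata-ref": {"kind": "job-id",
                             "value": "job-2026-04-23-001"}}
intent_b = {"intent-id": "intent-2",
            "metadata-ref": {"kind": "correlation-id",
                             "value": "corr-ai-training-001"}}
```

Both intents resolve to the same metadata instance, and either intent can be replaced without touching that instance. Because job identifiers are scheduler-local, an implementation that serves several schedulers would probably need to qualify "job-id" with "scheduler-id"; the sketch omits this for brevity.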

7. YANG Data Model

The YANG data model is as follows:


module ietf-hpc-scheduler-job-metadata {
  yang-version 1.1;
  namespace "urn:ietf:params:xml:ns:yang:ietf-hpc-scheduler-job-metadata";
  prefix hpc-sched;

  import ietf-yang-types {
    prefix yang;
    reference
      "RFC 6991: Common YANG Data Types";
  }

  organization
    "IETF Traffic Engineering Architecture and Signaling (TEAS)
     Working Group";
  contact
    "WG Web:   <https://datatracker.ietf.org/wg/teas/>
     WG List:  <mailto:teas@ietf.org>

     Editor:   Quan Xiong
               <mailto:xiong.quan@zte.com.cn>

     Editor:   Kireeti Kompella
               <mailto:kireeti.ietf@gmail.com>

     Editor:   Daniel King
               <mailto:d.king@lancaster.ac.uk>";

  description
    "This module defines a scheduler-facing metadata model for
     High Performance Computing (HPC) and AI workloads. The model
     captures common job, workload, scheduler, tenant, timing, and
     task metadata that can be mapped from heterogeneous workload
     managers and orchestration platforms.

     Copyright (c) 2026 IETF Trust and the persons identified as
     authors of the code. All rights reserved.

     Redistribution and use in source and binary forms, with or
     without modification, is permitted pursuant to, and subject
     to the license terms contained in, the Revised BSD License
     set forth in Section 4.c of the IETF Trust's Legal Provisions
     Relating to IETF Documents
     (https://trustee.ietf.org/license-info).

     This version of this YANG module is part of RFC XXXX; see
     the RFC itself for full legal notices.";

  revision 2026-04-23 {
    description
      "Initial version of the HPC/AI scheduler job metadata model.";
    reference
      "RFC XXXX: HPC/AI Scheduler Job Metadata Model";
  }

  /*
   * Identity definitions
   */
  identity scheduler-type {
    description
      "Base identity for scheduler types.";
  }

  identity slurm {
    base scheduler-type;
    description
      "Slurm workload manager.";
  }

  identity pbs {
    base scheduler-type;
    description
      "PBS Pro/OpenPBS workload manager.";
  }

  identity lsf {
    base scheduler-type;
    description
      "IBM Spectrum LSF workload manager.";
  }

  identity kubernetes {
    base scheduler-type;
    description
      "Kubernetes-based orchestration platform.";
  }

  identity kubeflow {
    base scheduler-type;
    description
      "Kubeflow AI orchestration platform.";
  }

  identity workload-type {
    description
      "Base identity for workload types.";
  }

  identity hpc-batch {
    base workload-type;
    description
      "HPC batch workload.";
  }

  identity ai-training {
    base workload-type;
    description
      "AI training workload.";
  }

  identity ai-inference {
    base workload-type;
    description
      "AI inference workload.";
  }

  identity data-movement {
    base workload-type;
    description
      "Data movement workload.";
  }

  identity framework {
    description
      "Base identity for workload frameworks.";
  }

  identity mpi {
    base framework;
    description
      "Message Passing Interface (MPI) framework.";
  }

  identity tensorflow {
    base framework;
    description
      "TensorFlow machine learning framework.";
  }

  identity pytorch {
    base framework;
    description
      "PyTorch machine learning framework.";
  }

  identity task-role {
    description
      "Base identity for task roles.";
  }

  identity worker {
    base task-role;
    description
      "Worker role in distributed computation.";
  }

  identity parameter-server {
    base task-role;
    description
      "Parameter server role in distributed training.";
  }

  identity master {
    base task-role;
    description
      "Master/coordinator role.";
  }

  identity duration-unit {
    description
      "Base identity for duration units.";
  }

  identity seconds {
    base duration-unit;
    description
      "Duration in seconds.";
  }

  identity minutes {
    base duration-unit;
    description
      "Duration in minutes.";
  }

  identity hours {
    base duration-unit;
    description
      "Duration in hours.";
  }

  /*
   * Typedefs
   */
  typedef priority-type {
    type uint32 {
      range "0..1000";
    }
    description
      "Priority value type, with higher values indicating higher
       priority.";
  }

  /*
   * Groupings
   */
  grouping scheduler-grouping {
    description
      "Scheduler identification and metadata.";
    leaf scheduler-id {
      type string;
      description
        "Unique identifier for the scheduler instance.";
    }
    leaf scheduler-name {
      type string;
      description
        "Human-readable name of the scheduler.";
    }
    leaf scheduler-type {
      type identityref {
        base scheduler-type;
      }
      description
        "Type of scheduler or orchestration platform.";
    }
    leaf platform-instance {
      type string;
      description
        "Platform-specific instance identifier or version.";
    }
  }

  grouping submitter-grouping {
    description
      "Submitter and tenant context.";
    leaf tenant-id {
      type string;
      description
        "Tenant identifier for multi-tenant environments.";
    }
    leaf project-id {
      type string;
      description
        "Project identifier within the tenant.";
    }
    leaf namespace {
      type string;
      description
        "Namespace identifier (e.g., Kubernetes namespace).";
    }
    leaf user-id {
      type string;
      description
        "Identifier of the user who submitted the workload.";
    }
    leaf account-id {
      type string;
      description
        "Accounting or billing account identifier.";
    }
  }

  grouping workload-grouping {
    description
      "Workload identification and metadata.";
    leaf workload-id {
      type string;
      description
        "Unique identifier for the workload.";
    }
    leaf workload-name {
      type string;
      description
        "Human-readable name of the workload.";
    }
    leaf workload-type {
      type identityref {
        base workload-type;
      }
      description
        "Type of workload.";
    }
    leaf framework {
      type identityref {
        base framework;
      }
      description
        "Computational framework used by the workload.";
    }
    leaf priority {
      type priority-type;
      description
        "Priority of the workload.";
    }
    leaf queue {
      type string;
      description
        "Queue or partition to which the workload is submitted.";
    }
    leaf correlation-id {
      type string;
      description
        "Correlation identifier for cross-system tracing.";
    }
  }

  grouping task-grouping {
    description
      "Task-level metadata.";
    leaf task-id {
      type string;
      mandatory true;
      description
        "Unique identifier for the task within the job.";
    }
    leaf task-name {
      type string;
      description
        "Human-readable name of the task.";
    }
    leaf task-role {
      type identityref {
        base task-role;
      }
      description
        "Functional role of the task in the workload.";
    }
    leaf task-index {
      type uint32;
      description
        "Index or sequence number of the task.";
    }
  }

  grouping job-grouping {
    description
      "Job structure and task information.";
    leaf job-id {
      type string;
      description
        "Scheduler-specific job identifier.";
    }
    leaf job-name {
      type string;
      description
        "Human-readable job name.";
    }
    leaf job-array-id {
      type string;
      description
        "Job array identifier for array jobs.";
    }
    leaf job-size {
      type uint32;
      description
        "Total number of tasks or execution units in the job.";
    }
    list task {
      key "task-id";
      description
        "List of tasks comprising the job.";
      uses task-grouping;
    }
  }

  grouping timing-grouping {
    description
      "Timing and scheduling information.";
    leaf submit-time {
      type yang:date-and-time;
      description
        "Time when the workload was submitted to the scheduler.";
    }
    leaf earliest-start-time {
      type yang:date-and-time;
      description
        "Earliest time when the workload can start.";
    }
    leaf requested-start-time {
      type yang:date-and-time;
      description
        "Requested start time for the workload.";
    }
    leaf deadline {
      type yang:date-and-time;
      description
        "Deadline by which the workload should complete.";
    }
    leaf requested-duration {
      type uint32;
      description
        "Requested duration of the workload execution, expressed in
         the unit given by the 'duration-unit' leaf.";
    }
    leaf duration-unit {
      type identityref {
        base duration-unit;
      }
      description
        "Unit for the requested duration.";
    }
  }

  /*
   * Top-level container
   */
  container hpc-scheduler-job-metadata {
    description
      "Top-level container for HPC/AI scheduler job metadata.";

    container scheduler {
      description
        "Scheduler identification and metadata.";
      uses scheduler-grouping;
    }

    container submitter {
      description
        "Submitter and tenant context.";
      uses submitter-grouping;
    }

    container workload {
      description
        "Workload identification and metadata.";
      uses workload-grouping;
    }

    container job {
      description
        "Job structure and task information.";
      uses job-grouping;
    }

    container timing {
      description
        "Timing and scheduling information.";
      uses timing-grouping;
    }
  }
}
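
As a non-normative aid, the following Python sketch checks a JSON-encoded instance against a few of the constraints stated in the module: the identity values, the 0..1000 range of "priority-type", and the uniqueness of "task-id" within a job. It is not a YANG validator. In particular, the closed sets of identity names are an assumption of this sketch, since other modules could define further identities derived from the same bases.

```python
MODULE = "ietf-hpc-scheduler-job-metadata"

# Identities defined in this module, grouped by base identity.
IDENTITIES = {
    "scheduler-type": {"slurm", "pbs", "lsf", "kubernetes", "kubeflow"},
    "workload-type": {"hpc-batch", "ai-training", "ai-inference",
                      "data-movement"},
    "framework": {"mpi", "tensorflow", "pytorch"},
    "task-role": {"worker", "parameter-server", "master"},
    "duration-unit": {"seconds", "minutes", "hours"},
}


def _identity_ok(base, value):
    # Accept both "name" and "module-name:name" forms.
    prefix = MODULE + ":"
    name = value[len(prefix):] if value.startswith(prefix) else value
    return name in IDENTITIES[base]


def check_instance(data):
    """Return a list of problems found; an empty list means none."""
    problems = []
    root = data.get(MODULE + ":hpc-scheduler-job-metadata", {})
    workload = root.get("workload", {})
    tasks = root.get("job", {}).get("task", [])

    leaves = [
        ("scheduler-type",
         root.get("scheduler", {}).get("scheduler-type")),
        ("workload-type", workload.get("workload-type")),
        ("framework", workload.get("framework")),
        ("duration-unit", root.get("timing", {}).get("duration-unit")),
    ]
    leaves += [("task-role", t.get("task-role")) for t in tasks]
    for base, value in leaves:
        if value is not None and not _identity_ok(base, value):
            problems.append(f"unknown {base} identity: {value}")

    priority = workload.get("priority")
    if priority is not None and not 0 <= priority <= 1000:
        problems.append(f"priority out of range 0..1000: {priority}")

    ids = [t.get("task-id") for t in tasks]
    if len(ids) != len(set(ids)):
        problems.append("duplicate task-id in job")
    return problems
```

Calling check_instance() on the instance shown in Appendix A is expected to return an empty list, because every identity used there is defined in this module, the priority is within range, and the three task identifiers are distinct.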

8. Security Considerations

Scheduler and job metadata can reveal user, tenant, project, workload, timing, and operational information. Implementations need to protect the confidentiality and integrity of this information and restrict access to authorized workload managers, controllers, orchestrators, and network management systems.
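
One possible way to limit exposure when metadata is exported to a system that does not need submitter identities is to replace selected leaves with keyed hashes before export. The Python sketch below is non-normative and is not required by this model; the choice of "user-id" and "account-id" as the sensitive leaves, the 16-character truncation, and the function name are assumptions made for illustration.

```python
import hashlib
import hmac

# Assumption: which submitter leaves a deployment treats as sensitive.
SENSITIVE = ("user-id", "account-id")


def pseudonymize_submitter(metadata, key):
    """Return a copy with sensitive submitter leaves replaced by
    keyed hashes; the input dictionary is left unchanged."""
    out = dict(metadata)
    submitter = dict(metadata.get("submitter", {}))
    for leaf in SENSITIVE:
        if leaf in submitter:
            digest = hmac.new(key, submitter[leaf].encode("utf-8"),
                              digestmod=hashlib.sha256)
            # Truncated hex digest keeps the value short but stable.
            submitter[leaf] = digest.hexdigest()[:16]
    out["submitter"] = submitter
    return out


exported = pseudonymize_submitter(
    {"submitter": {"tenant-id": "ai-research-lab",
                   "user-id": "researcher-bob"}},
    key=b"example-key-not-for-production",
)
```

The same input and key always yield the same pseudonym, so correlation across records remains possible for the receiving system while the original identifier is not disclosed to it.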

9. IANA Considerations

IANA is requested to register one URI in the "IETF XML Registry" [RFC3688]. Following the format in [RFC3688], the following registration is requested:


   URI: urn:ietf:params:xml:ns:yang:ietf-hpc-scheduler-job-metadata

   Registrant Contact: The IESG.

   XML: N/A; the requested URI is an XML namespace.

IANA is requested to register the following YANG module in the "YANG Module Names" registry [RFC6020].


    name: ietf-hpc-scheduler-job-metadata

    namespace: urn:ietf:params:xml:ns:yang:ietf-hpc-scheduler-job-metadata

    prefix: hpc-sched

    reference: RFC XXXX

10. Acknowledgements

The authors acknowledge the related HP-WAN framework and problem statement work that provides the broader context for this scheduler job metadata model.

11. References

11.1. Normative References

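[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC3688]
Mealling, M., "The IETF XML Registry", BCP 81, RFC 3688, DOI 10.17487/RFC3688, January 2004, <https://www.rfc-editor.org/info/rfc3688>.
[RFC6020]
Bjorklund, M., Ed., "YANG - A Data Modeling Language for the Network Configuration Protocol (NETCONF)", RFC 6020, DOI 10.17487/RFC6020, October 2010, <https://www.rfc-editor.org/info/rfc6020>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.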

11.2. Informative References

[I-D.kcrh-hpwan-state-of-art]
King, D., Chown, T., Rapier, C., Huang, D., and K. Yao, "Current State of the Art for High Performance Wide Area Networks", Work in Progress, Internet-Draft, draft-kcrh-hpwan-state-of-art-03, <https://datatracker.ietf.org/doc/html/draft-kcrh-hpwan-state-of-art-03>.
[I-D.kompella-rtgwg-mlnwsched]
Kompella, K., Beeram, V. P., Mahale, A., Bhargava, R., and N. Geyer, "Scheduling Network Resources for Machine Learning Clusters", Work in Progress, Internet-Draft, draft-kompella-rtgwg-mlnwsched-02, <https://datatracker.ietf.org/doc/html/draft-kompella-rtgwg-mlnwsched-02>.
[I-D.xhy-hpwan-framework]
Xiong, Q., Huang, G., Yao, K., and C. Lin, "Framework for High Performance Wide Area Network (HP-WAN)", Work in Progress, Internet-Draft, draft-xhy-hpwan-framework-03, <https://datatracker.ietf.org/doc/html/draft-xhy-hpwan-framework-03>.

Appendix A. Example

This appendix provides an example of scheduler job metadata for a distributed AI training workload. The example shows how platform-specific job information from a Kubernetes-based AI orchestration system can be mapped to the common metadata model.

Consider a scenario where a user submits a distributed PyTorch training job to a Kubernetes-based AI orchestration platform. The job consists of three worker tasks; no parameter-server tasks are used in this example.


   {
     "ietf-hpc-scheduler-job-metadata:hpc-scheduler-job-metadata": {
       "scheduler": {
         "scheduler-id": "ai-orchestrator-1",
         "scheduler-name": "AI-Training-Orchestrator",
         "scheduler-type": "kubernetes",
         "platform-instance": "nvidia-base-command-2.0"
       },
       "submitter": {
         "tenant-id": "ai-research-lab",
         "project-id": "distributed-ml-project",
         "namespace": "ml-training",
         "user-id": "researcher-bob",
         "account-id": "project-alpha"
       },
       "workload": {
         "workload-id": "distributed-training-001",
         "workload-name": "large-scale-llm-training",
         "workload-type": "ai-training",
         "framework": "pytorch",
         "priority": 100,
         "queue": "gpu-high-priority",
         "correlation-id": "corr-ai-training-001"
       },
       "job": {
         "job-id": "job-2026-04-23-001",
         "job-name": "llm-13b-distributed",
         "job-size": 3,
         "task": [
           {
             "task-id": "worker-1",
             "task-name": "gpu-worker-west-1",
             "task-role": "worker",
             "task-index": 0
           },
           {
             "task-id": "worker-2",
             "task-name": "gpu-worker-west-2",
             "task-role": "worker",
             "task-index": 1
           },
           {
             "task-id": "worker-3",
             "task-name": "gpu-worker-east-1",
             "task-role": "worker",
             "task-index": 2
           }
         ]
       },
       "timing": {
         "submit-time": "2026-04-23T09:00:00Z",
         "earliest-start-time": "2026-04-23T09:45:00Z",
         "requested-start-time": "2026-04-23T10:00:00Z",
         "deadline": "2026-04-23T12:00:00Z",
         "requested-duration": 120,
         "duration-unit": "minutes"
       }
     }
   }
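
A consumer of this metadata may want to check that the timing leaves are mutually consistent before using them. The following non-normative Python sketch normalizes "requested-duration" to seconds using "duration-unit" and tests whether a job started at "requested-start-time" would finish by "deadline". The function names are hypothetical, and the check itself is an illustration rather than a rule defined by this model.

```python
from datetime import datetime, timedelta

# Seconds per unit, for the duration-unit identities in this module.
UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600}


def parse_time(text):
    # Replace a trailing "Z" so that older Python versions, whose
    # fromisoformat() rejects it, can parse the timestamp.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def duration_seconds(timing):
    """Return requested-duration converted to seconds."""
    unit = UNIT_SECONDS[timing["duration-unit"]]
    return timing["requested-duration"] * unit


def fits_before_deadline(timing):
    """True if start time plus duration does not exceed the deadline."""
    start = parse_time(timing["requested-start-time"])
    end = start + timedelta(seconds=duration_seconds(timing))
    return end <= parse_time(timing["deadline"])


# Timing values from the example above.
timing = {
    "requested-start-time": "2026-04-23T10:00:00Z",
    "deadline": "2026-04-23T12:00:00Z",
    "requested-duration": 120,
    "duration-unit": "minutes",
}
```

For the values in this example, the requested duration of 120 minutes corresponds to 7200 seconds, and a start at 10:00 ends exactly at the 12:00 deadline, so fits_before_deadline(timing) holds.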

Authors' Addresses

Quan Xiong
ZTE Corporation
Kireeti Kompella
HPE
Daniel King
Lancaster University