CAICT                                                              Z. Li
Internet-Draft                                                     Y. Su
Intended status: Standards Track                                  J. Dou
Expires: 30 April 2025                                           R. Chen
                                                                   CAICT
                                                         16 October 2024


    A Method for Evaluating the Capabilities of Large Language Models
                        Deployed on Hybrid Cloud
            draft-lizihan-hybrid-cloudlargelanguagemodel-00

Abstract

   This document establishes a group standard for evaluating the
   capabilities of large language models deployed on hybrid cloud.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 30 April 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.

Table of Contents

   1.  Introduction
   2.  Capability Overview
   3.  Cloud Infrastructure Capabilities
   4.  Model Capabilities
   5.  Application Capabilities
   6.  Security Considerations
   7.  Operation and Maintenance Capabilities
   8.  Completeness of Service Considerations
   9.  IANA Considerations
   10. References
   Acknowledgments
   Authors' Addresses

1. Introduction

   This document specifies maturity levels for large model hybrid cloud
   capabilities, covering cloud infrastructure capabilities, model
   capabilities, application capabilities, model security
   considerations, and the completeness and standardization of service
   indicators. It is intended to standardize products and solutions in
   the field of large model hybrid cloud and to help enterprises build
   and improve their own large model hybrid cloud.

   Large model hybrid cloud is now a focus for many service providers
   and enterprise customers. Since its emergence, the concept of hybrid
   cloud has been integrated with multiple technologies, and hybrid
   cloud is now a common form of enterprise technology architecture:
   82% of global enterprises have adopted a hybrid cloud architecture.

   With continuing breakthroughs in artificial intelligence technology,
   the large model hybrid cloud architecture has become a key means for
   enterprises to promote technological innovation and optimize data
   processing. Companies around the world are adopting this
   architecture to address the growing need for computing and data
   management. To meet these needs, the large model hybrid cloud offers
   core capabilities including model scalability, adaptive learning,
   and cross-platform collaboration.

2. Capability Overview

2.1 Technical architecture of large model hybrid cloud capability
    maturity

   The large model hybrid cloud technical architecture describes its
   components and their relationships:

   - Cloud infrastructure layer: provides and manages computing
     resources for the system;
   - Model layer: supports the full development process of large
     models;
   - Application layer: develops or adjusts large models for specific
     industry or scenario needs;
   - Security capability: consists of hybrid cloud security and model
     security;
   - Operation and maintenance capability: includes disaster recovery,
     monitoring, and log management.

3. Cloud Infrastructure Capabilities

3.1 Overview

   Cloud infrastructure capability in the large model hybrid cloud
   refers to the ability to provide and manage hybrid cloud computing
   resources. The following indicators are defined:

   - Multiple computing power compatibility;
   - Storage capability;
   - Network capability;
   - Multi-cloud access capability;
   - Multi-cloud resource management capability.

3.2 Multiple computing power compatibility

   Level 1: Supports at least one CPU architecture (x86, ARM, etc.) and
            at least one GPU architecture;
   Level 2: Supports various CPU (x86, ARM, etc.) and GPU
            architectures;
   Level 3: Supports various CPU (x86, ARM, etc.), GPU, and NPU or DPU
            architectures.

3.3 Storage capability

   Level 1: Supports distributed storage in the cloud and local storage
            systems;
   Level 2: Supports object storage, parallel file access, and
            lifecycle management;
   Level 3: Supports high-performance object storage and distributed
            parallel file storage for AI.

3.4 Network capability

   Level 1: Supports stable network connectivity through dedicated
            lines, VPN, or SD-WAN, and supports basic QoS;
   Level 2: Supports network performance testing and visual network
            monitoring;
   Level 3: Supports RDMA and high single-port network bandwidth.
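
   As an illustration of how the levels of Section 3.2 might be checked
   mechanically, the following Python sketch maps a declared inventory
   of architectures to a level. It is not part of the evaluation method
   itself. The ComputeInventory structure, the function name, and the
   reading of "various" as "at least two" are assumptions made for the
   example; Level 3 is read here as Level 2 plus at least one NPU or
   DPU architecture.

```python
from dataclasses import dataclass

# Assumption: "various" in Levels 2 and 3 is read as "at least two".
VARIOUS = 2


@dataclass(frozen=True)
class ComputeInventory:
    """Architectures a platform declares support for (hypothetical)."""
    cpu_archs: frozenset
    gpu_archs: frozenset
    npu_dpu_archs: frozenset = frozenset()


def compute_compatibility_level(inv: ComputeInventory) -> int:
    """Return the highest Section 3.2 level met, or 0 if none is met."""
    # Level 1: at least one CPU architecture and one GPU architecture.
    if not inv.cpu_archs or not inv.gpu_archs:
        return 0
    # Level 2: various CPU architectures and various GPU architectures.
    if len(inv.cpu_archs) < VARIOUS or len(inv.gpu_archs) < VARIOUS:
        return 1
    # Level 3 (assumed reading): Level 2 plus an NPU or DPU architecture.
    if not inv.npu_dpu_archs:
        return 2
    return 3


if __name__ == "__main__":
    inventory = ComputeInventory(
        cpu_archs=frozenset({"x86", "ARM"}),
        gpu_archs=frozenset({"gpu-vendor-a", "gpu-vendor-b"}),
    )
    print(compute_compatibility_level(inventory))  # prints 2
```

   Because each level builds on the previous one, the function returns
   at the first requirement that is not met. An evaluator with a
   different reading of "various" only needs to change the VARIOUS
   threshold.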

3.5 Multi-cloud access capability

   Level 1: Supports access to at least one public cloud, private
            cloud, or local environment;
   Level 2: Supports access to multiple public clouds, private clouds,
            and local environments;
   Level 3: Supports access to multiple public clouds, private clouds,
            local environments, proprietary clouds, and edge nodes.

3.6 Multi-cloud resource management capability

   Level 1: Supports management of computing resources, including
            allocation and lifecycle management;
   Level 2: Supports load balancing and unified resource pooling;
   Level 3: Supports visual orchestration and partitioning (slicing) of
            GPU or NPU computing power.

4. Model Capabilities

4.1 Overview

   Model layer capability refers to the ability to support the full
   development process of large models on the hybrid cloud, with the
   following indicators defined:

   - Data engineering capability;
   - Model development or training capability;
   - Model deployment or inference capability.

4.2 Data engineering capability

4.2.1 Overview

   Data engineering capability refers to the ability to process the
   data required for model training or inference on the hybrid cloud.
   The following indicators are defined:

   - Data access capability;
   - Data processing capability;
   - Data management capability.

4.2.2 Data access capability

   Level 1: Supports access to structured, semi-structured,
            unstructured, and multi-modal data;
   Level 2: Supports identification and access of data from different
            sources and in different formats;
   Level 3: Supports incremental data synchronization and custom data
            access filtering strategies.

4.2.3 Data processing capability

   Level 1: Supports data cleaning, completion, and annotation;
   Level 2: Supports data standardization, augmentation, and both
            offline and online processing;
   Level 3: Supports intelligent cleaning, multi-modal data annotation,
            and various annotation methods.
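
   To make the Level 1 terms "cleaning" and "completion" in Section
   4.2.3 concrete, the following Python sketch shows one possible
   minimal pipeline. It is an illustration only: the function name,
   the record layout (dictionaries of scalar fields), and the specific
   rules (strip whitespace, drop records missing a required field,
   fill defaults, remove exact duplicates) are assumptions, and
   annotation is not covered.

```python
def clean_and_complete(records, required, defaults):
    """Clean records and fill in missing optional fields (sketch).

    records:  iterable of dicts with scalar values (assumed layout)
    required: field names that must be present and non-empty
    defaults: mapping of optional field name -> value used when missing
    """
    seen = set()
    cleaned = []
    for raw in records:
        # Cleaning: strip surrounding whitespace from string values.
        rec = {k: v.strip() if isinstance(v, str) else v
               for k, v in raw.items()}
        # Cleaning: discard records that lack a required field.
        if any(rec.get(name) in (None, "") for name in required):
            continue
        # Completion: fill absent or empty optional fields.
        for name, value in defaults.items():
            if rec.get(name) in (None, ""):
                rec[name] = value
        # Cleaning: skip exact duplicates of records already kept.
        fingerprint = tuple(sorted((k, repr(v)) for k, v in rec.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(rec)
    return cleaned


if __name__ == "__main__":
    sample = [
        {"text": " hello ", "lang": "en"},
        {"text": "hello", "lang": "en"},   # duplicate after stripping
        {"text": "", "lang": "en"},        # missing required field
        {"text": "bonjour"},               # "lang" is completed
    ]
    for row in clean_and_complete(sample, ("text",), {"lang": "und"}):
        print(row)
```

   With the sample input, two records remain: the first "hello" record
   and the "bonjour" record with its "lang" field filled in from the
   defaults.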

4.2.4 Data management capability

   Level 1: Supports metadata management, data set construction, and
            version management;
   Level 2: Supports data lifecycle management, classification, and
            migration between multi-cloud environments;
   Level 3: Supports multi-dimensional analysis and construction of new
            data sets.

4.3 Model development or training capability

4.3.1 Overview

   Model development or training capability refers to the ability to
   conduct model development or training on the hybrid cloud, with the
   following indicators defined:

   - Model development or training environment capability;
   - Training task orchestration capability;
   - Model evaluation capability.

4.3.2 Model development or training environment capability

   Level 1: Supports deep learning frameworks, heterogeneous computing
            frameworks, and distributed training frameworks;
   Level 2: Supports preset model libraries and algorithm libraries.

4.3.3 Training task orchestration capability

   Level 1: Supports setting common model hyperparameters and visual
            display of training tasks;
   Level 2: Supports various distributed training methods and visual
            orchestration of model training tasks;
   Level 3: Supports various model tuning strategies and targeted
            training on abnormal data.

4.3.4 Model evaluation capability

   Level 1: Supports multi-dimensional model evaluation and custom
            business-related indicators;
   Level 2: Supports visual comparison of different model evaluation
            results and generation of evaluation reports.

4.4 Model deployment or inference capability

4.4.1 Overview

   Model deployment or inference capability refers to the ability to
   deploy models or run inference on the hybrid cloud, with the
   following indicators defined:

   - Diversified deployment or inference capability;
   - Model management capability.

4.4.2 Diversified deployment or inference capability

   Level 1: Supports image-based model deployment and service
            interfaces;
   Level 2: Supports diversified model deployment strategies and
            deployment to multiple hybrid cloud environments;
   Level 3: Supports cloud-edge collaboration strategies and model
            compression for edge deployment.

4.4.3 Model management capability

   Level 1: Supports deployment of multiple models or versions and
            viewing of deployed model information;
   Level 2: Supports lifecycle management of deployed models, version
            management, and multiple model file storage formats.

5. Application Capabilities

5.1 Overview

   Application layer capability refers to the ability of large models
   to meet the personalized needs of different industries or scenarios,
   with the following indicator defined:

   - Model application scenario support capability.

5.2 Model application scenario support capability

   Level 1: Supports at least one foundation model, one industry model,
            and three scenario-specific models;
   Level 2: Supports at least two foundation models, two industry
            models, and five scenario-specific models;
   Level 3: Supports at least three foundation models, three industry
            models, and ten scenario-specific models.

6. Security Considerations

6.1 Overview

   Security capability refers to the comprehensive security capability
   of the large model hybrid cloud, with the following indicators
   defined:

   - Hybrid cloud security capability;
   - Model security capability.

6.2 Hybrid cloud security capability

6.2.1 Overview

   Hybrid cloud security capability refers to the comprehensive
   security capability of the hybrid cloud, with the following
   indicators defined:

   - Access control capability;
   - Data security capability;
   - Network security capability.

6.2.2 Access control capability

   Level 1: Supports user account management, password modification,
            and identity authentication;
   Level 2: Supports role-based access control and user login settings;
   Level 3: Supports multi-factor authentication and integration with
            independent user authentication systems.

6.2.3 Data security capability

   Level 1: Protects data storage and transmission through encryption,
            key management, and database firewalls;
   Level 2: Supports data grading and classification and various access
            strategies;
   Level 3: Supports data domain management, encryption, and
            desensitization (masking) for data sharing.

6.2.4 Network security capability

   Level 1: Ensures network boundary security with security groups,
            firewalls, IDS, and secure transmission protocols such as
            HTTPS and TLS;
   Level 2: Supports network isolation through software-defined
            networks and defends against traffic-layer and
            application-layer attacks;
   Level 3: Supports simulated model attacks, identification of model
            hallucinations, and inspection of model generation results
            for high-risk content.

6.3 Model security capability

6.3.1 Overview

   Model security capability refers to the comprehensive security
   capability of large models, with the following indicators defined:

   - Access control capability;
   - Model service security capability.

6.3.2 Access control capability

   Level 1: Supports identity authorization management and
            authentication for large model access;
   Level 2: Supports access control strategies and diversified identity
            authentication functions.

6.3.3 Model service security capability

   Level 1: Supports data review for model training, model file
            integrity verification, and encryption;
   Level 2: Adopts security algorithms and protocols for model training
            and has an emergency response mechanism for model services;
   Level 3: Supports simulated model attacks and identification and
            analysis of model hallucinations.
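
   As an illustration of the "model file integrity verification" item
   in Section 6.3.3 Level 1, the following Python sketch compares a
   model file against a previously published digest. This document does
   not prescribe a mechanism; the use of SHA-256 and the function names
   are assumptions made for the example, and the sketch does not cover
   how the expected digest is distributed or protected.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large model files need not fit in RAM.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_file(path, expected_hex):
    """Return True if the file matches the expected SHA-256 digest."""
    actual = sha256_of_file(path)
    # compare_digest performs a constant-time comparison of the strings.
    return hmac.compare_digest(actual, expected_hex.lower())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        model_path = Path(workdir) / "model.bin"  # hypothetical file name
        model_path.write_bytes(b"example weights")
        published = hashlib.sha256(b"example weights").hexdigest()
        print(verify_model_file(model_path, published))
        model_path.write_bytes(b"tampered weights")
        print(verify_model_file(model_path, published))
```

   The demonstration prints True for the unmodified file and False
   after the file contents change. A digest only detects modification;
   protecting the published digest itself, for example with a
   signature, is a separate concern.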

7. Operation and Maintenance Capabilities

7.1 Overview

   Operation and maintenance capability refers to the comprehensive
   operation and maintenance capability of the large model hybrid cloud
   system, with the following indicators defined:

   - Disaster recovery and backup capability;
   - Metering and billing capability;
   - Monitoring and alarm capability;
   - Log management capability.

7.2 Disaster recovery and backup capability

   Level 1: Supports disaster recovery task management, data backup,
            and consistency verification;
   Level 2: Supports model state saving and offline model availability
            in case of network failure;
   Level 3: Supports system disaster recovery and disaster recovery
            drills.

7.3 Metering and billing capability

   Level 1: Provides metering and billing rules for large model
            services covering various scenarios;
   Level 2: Supports centralized display of billing records and
            multi-dimensional bill queries;
   Level 3: Supports cost estimation, cost analysis, and custom billing
            models.

7.4 Monitoring and alarm capability

   Level 1: Supports monitoring of computing resources, data access,
            model training, and model service stability;
   Level 2: Supports fault simulation, customized alarm thresholds, log
            monitoring, and heartbeat detection;
   Level 3: Supports unified alarm information collection, automated
            noise reduction, and fast fault location across hardware
            and software.

7.5 Log management capability

   Level 1: Supports logging of hybrid cloud access operations, large
            model access audits, and resource logging;
   Level 2: Supports log viewing, management, download,
            synchronization, and cleaning;
   Level 3: Supports recording of network attacks and querying of log
            analysis system data.

8. Completeness of Service Considerations

8.1 Product cycle

   Large model hybrid cloud solutions or products must commit to
   delivery times and provide methods, based on material reviews, for
   evaluating upgrade services, including product delivery cycles and
   update disclosures.

8.2 Operation and maintenance services

   Large model hybrid cloud solutions or products must describe their
   assisted or managed operation services, including service hours,
   training services, and operation and maintenance assistance.

8.3 Protection of rights and interests

   Large model hybrid cloud solutions or products must commit, in their
   service agreements, to user rights protection and risk control
   methods, including continuous service durations, service fees, and
   privacy protection rules.

9. IANA Considerations

   To be completed.

10. References

10.1 Normative References

   The following documents contain provisions that are essential to
   this document. For dated references, only the version corresponding
   to that date applies to this document. For undated references, the
   latest version (including all amendments) applies to this document.

   GB/T 32400-2015  Information technology - Cloud computing -
                    Overview and vocabulary

   GB/T 41867-2022  Information technology - Artificial intelligence -
                    Terminology

10.2 Terms, definitions, and abbreviations

   The terms, definitions, and abbreviations defined in GB/T 32400-2015
   and GB/T 41867-2022, together with the following, apply to this
   document.

10.2.1 Terms and definitions

10.2.1.1 Public cloud

   A cloud deployment model in which cloud services can be used by any
   cloud service customer and resources are controlled by a cloud
   service provider.

   [Source: GB/T 32400-2015, 3.2.33]

10.2.1.2 Private cloud

   A cloud deployment model in which cloud services are used
   exclusively by a single cloud service customer and resources are
   controlled by that cloud service customer.

   [Source: GB/T 32400-2015, 3.2.32]

10.2.1.3 Hybrid cloud

   A cloud deployment model that uses at least two different cloud
   deployment models.

   [Source: GB/T 32400-2015, 3.2.23]

10.2.1.4 Artificial intelligence

   Research and development of mechanisms and applications of
   artificial intelligence systems.

   [Source: GB/T 41867-2022, 3.1.2]

10.2.1.5 Deep learning

   A method of creating rich hierarchical representations by training
   neural networks with many hidden layers.

   [Source: GB/T 41867-2022, 3.2.27]

10.2.1.6 Model evaluation

   Evaluation of the quality of a trained model using the established
   evaluation metrics for various AI tasks.

10.2.1.7 Sensitive data

   Data recorded in a computer information system, such as personal,
   enterprise, or government department information, that is not
   suitable for public release.

Acknowledgments

   We are grateful to the authors of the referenced documents for their
   time and effort.

   Zihan Li
   China Academy of Information and Communications Technology

   Yue Su
   Ruihao Chen
   Jiali Dou

Authors' Addresses

   Zihan Li (editor)
   China Academy of Information and Communications Technology
   Zhichunlu Road
   Beijing
   China
   Email: lizihan1@caict.ac.cn

   Yue Su
   China Academy of Information and Communications Technology
   Email: suyue1@caict.ac.cn

   Ruihao Chen
   China Academy of Information and Communications Technology
   Email: chenruihao@caict.ac.cn

   Jiali Dou
   China Academy of Information and Communications Technology
   Email: doujiali@caict.ac.cn