Workgroup:
NeoTec
Internet-Draft:
draft-dunbar-neotec-net-adjust-cloud-scaling-02
Updates:
8342 (if approved)
Published:
Intended Status:
Standards Track
Expires:
28 July 2025
Authors:
L. Dunbar, Ed.
Futurewei
C. Xie
China Telecom
K. Majumdar
Oracle
B. Wu
Huawei
Q. Sun
China Telecom

Dynamic Network Adjustments for Cloud Service Scaling

Abstract

This document defines a framework for dynamically adjusting network-wide load balancing policies in response to cloud service scaling events, addressing key challenges faced by Telecom Cloud Service Providers (TCSPs). As cloud services scale, experience traffic growth, or relocate workloads across distributed edge and core clouds, network policies must adapt in real time to maintain optimal performance and ensure compliance with strict service level objectives (SLOs). Current manual network adjustments are often slow, error-prone, and insufficient for the dynamic nature of cloud environments.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 28 July 2025.

Table of Contents

1.  Introduction
2.  Requirements Language
3.  Problem Statement
4.  Framework for Dynamic Network Adjustments
5.  Dynamic-Load-Balancer
  5.1.  Example Scenarios
  5.2.  YANG Model of the Dynamic Load Balancer
  5.3.  Utilizing the YANG Module
    5.3.1.  JSON Examples
6.  Security Considerations
7.  IANA Considerations
8.  Normative References
Acknowledgements
Contributors
Authors' Addresses

1. Introduction

Cloud services are increasingly dynamic, requiring real-time network adjustments to accommodate fluctuating workloads and user demand. As services scale, whether due to increased traffic, new resource allocations, or expanded service delivery, traditional Equal-Cost Multi-Path (ECMP) and Unequal-Cost Multi-Path (UCMP) load balancing mechanisms fall short in adapting to these rapid changes. Static load balancing approaches lack the flexibility to account for real-time shifts in service placement, resource availability, and traffic patterns, leading to suboptimal performance and inefficient resource utilization.

In Telecom Cloud environments, where multi-vendor, cloud-aware orchestration systems and network controllers coexist, a standardized and interoperable approach is essential for dynamically adjusting network-wide load balancing policies. This document proposes a framework that enables automated load balancing adaptations in response to cloud service scaling by integrating cloud orchestration with network management through standardized YANG models. By ensuring seamless interoperability across controllers from different vendors, this approach allows Telecom Cloud providers to optimize network-wide load distribution dynamically, aligning network operations with evolving cloud service demands.

2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Problem Statement

As cloud services continue to scale dynamically, network infrastructure must adjust in real time to support changing service demands. However, in many Telecom Cloud operations, the coordination between cloud service scaling and network-wide load balancing adjustments remains largely manual or relies on proprietary solutions. This lack of automation and standardization introduces several key challenges:

- Limited Cloud Awareness in Network Control: Network controllers lack visibility into cloud resources and the status of hosted services. Without timely awareness of service function scaling, relocation, or shifting traffic patterns, the network cannot dynamically adjust load balancing policies, leading to suboptimal performance, SLA violations, and inefficient resource utilization.

- Delayed Load Balancing Adjustments: Traditional approaches are reactive and slow, resulting in traffic congestion, performance degradation, and service disruptions when cloud services scale or migrate.

- Fragmented and Proprietary Solutions: Existing solutions are often vendor-specific, creating inconsistencies in managing network-wide load balancing across different platforms and multi-cloud environments, reducing operational flexibility.

- Operational Complexity in Multi-Vendor Environments: The absence of standardized, vendor-agnostic interfaces makes it difficult to coordinate cloud-aware network operations, increasing the complexity of managing distributed infrastructures.

- Lack of a Standardized Framework: There is no industry-wide standard for automating real-time adjustments to network-wide load balancing policies in response to cloud service scaling, hindering the seamless, dynamic adaptation of network traffic distribution.

Addressing these challenges requires a unified framework that enables cloud-aware, automated load balancing adjustments, ensuring efficient, real-time coordination between cloud orchestration and network management.

4. Framework for Dynamic Network Adjustments

The framework for dynamic network adjustments in response to cloud service scaling consists of the following key elements:

- Cloud-Aware Network Load Balancing: Enables dynamic adjustments to network-wide load balancing policies in response to real-time cloud service scaling events. By incorporating cloud resource status, such as compute capacity, storage availability, and service functions, into network decision-making, it optimizes traffic distribution across the infrastructure. This ensures seamless traffic steering across distributed edge and core cloud environments, enhancing performance and resource utilization.

- Standardized Northbound Interface for Cloud-Aware Orchestrator to Network Controller Communication: Establishes standardized APIs and YANG models that enable cloud-aware service orchestrators to seamlessly communicate network adjustment requests to network controllers. It facilitates real-time updates to network policies, ensuring automated and adaptive responses to dynamic cloud workloads.

- Real-Time Telemetry and Policy Enforcement: Leverages streaming telemetry (e.g., gRPC, NETCONF, YANG-Push) for continuous monitoring of cloud and network states. By implementing closed-loop automation, it dynamically refines load balancing policies to optimize resource utilization and adapt to changing service demands. Additionally, it ensures SLA compliance by proactively adjusting network paths and traffic flows based on live performance metrics and cloud service scaling events (a minimal subscription sketch is shown below).
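
For illustration, the following non-normative sketch shows how a subscriber might establish a periodic YANG-Push subscription to the load-balancer subtree using the RESTCONF binding. The controller host name, the period of 3000 centiseconds (30 seconds), and the choice of RESTCONF rather than NETCONF or gRPC are assumptions made for this example only:

POST /restconf/operations/ietf-subscribed-notifications:establish-subscription HTTP/1.1
Host: network-controller.example.com
Content-Type: application/yang-data+json

{
  "ietf-subscribed-notifications:input": {
    "ietf-yang-push:datastore": "ietf-datastores:operational",
    "ietf-yang-push:datastore-xpath-filter":
      "/dynamic-load-balancer:load-balancer",
    "ietf-yang-push:periodic": {
      "period": 3000
    }
  }
}

The resulting push-update notifications give the subscriber a continuously refreshed view of the applied load balancing policies, which can then be correlated with cloud-side telemetry in the closed loop described above.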

These core elements provide a structured approach to integrating cloud service awareness into network operations, enabling adaptive and efficient load balancing across Telecom Cloud environments.

5. Dynamic-Load-Balancer

As cloud services dynamically scale, network traffic distribution must be continuously optimized to maintain performance, resource efficiency, and service-level objectives (SLOs). Achieving this requires fine-grained control over routing policies across ingress, intermediate, and egress routers to accommodate shifting workloads, fluctuating network conditions, and evolving service requirements. The Dynamic Load Balancer YANG module provides a structured and standardized framework for dynamically modifying traffic distribution strategies, seamlessly integrating with automated orchestration systems and network controllers.

By leveraging real-time triggers, such as service instance scaling, network congestion, and latency-sensitive application demands, the module ensures that traffic is intelligently steered over the most suitable paths. Unlike traditional load balancing, this model supports multi-path UCMP policies, allowing traffic to be distributed across multiple available links instead of being confined to a single path. It enables adaptive weight allocation, ensuring that high-bandwidth AI/ML workloads can utilize a defined percentage of total available paths, while latency-sensitive services are dynamically mapped to low-latency circuits. Additionally, time-based UCMP policies allow traffic allocation to shift dynamically, ensuring that business-critical applications receive higher priority during peak hours, while non-critical flows are offloaded to secondary SD-WAN paths during off-peak times.

This adaptive, policy-driven approach enhances network resilience, prevents performance bottlenecks, and optimizes resource utilization across both network and cloud infrastructure. By introducing predictive AI-driven optimization, QoS-aware path selection, and real-time dynamic reallocation, the Dynamic Load Balancer YANG module empowers networks to intelligently adjust to demand fluctuations, delivering scalable, high-performance connectivity for modern cloud-based services.

5.1. Example Scenarios

- Scenario 1: AI/ML Training Traffic Using 40% of All Available Paths

In a network connecting AI/ML compute clusters between Site A and Site B, large-scale distributed training workloads require high-bandwidth paths with minimal packet loss. Instead of allocating AI/ML traffic to a single 100G path, a dynamic UCMP policy ensures that 40% of all available paths (including multiple 100G links) are used to distribute the AI training flows. The policy ensures that at least two paths are always available for AI/ML traffic, but the system can scale up to a maximum of five paths if required. This allocation helps in balancing the load across multiple paths, preventing congestion and avoiding bottlenecks on any single path.

- Scenario 2: VoIP Traffic Always Uses at Least Two Low-Latency Paths

For real-time VoIP services, ensuring low latency and jitter is critical. A UCMP policy is configured to allocate 20% of the total available paths to VoIP traffic, ensuring that calls always have dedicated, high-priority routing. The policy enforces a minimum of two paths, even if VoIP traffic volume is low, to provide redundancy in case of failure. To further improve performance, preferred paths are selected from MPLS private links instead of standard internet-based paths. This ensures consistent quality of service (QoS) for VoIP calls, preventing drops and degradation.

- Scenario 3: Dynamic Path Allocations for Peak and Off-Peak Traffic

A time-sensitive UCMP policy dynamically adjusts path allocations based on the network's daily traffic patterns. During business hours (09:00 to 17:00), the system allocates 50% of the total available paths to enterprise applications such as video conferencing and cloud-based business tools to maintain service-level agreements (SLAs). However, during evening hours (17:00 to 23:00), streaming services like video-on-demand and gaming take priority, increasing their path allocation to 70%. Finally, during late-night hours (23:00 to 06:00), when AI/ML clusters perform large-scale training jobs, the system shifts its allocation so that 80% of all available paths are reserved for data-intensive AI workloads. This ensures optimal utilization of network capacity without overloading any single set of paths.

5.2. YANG Model of the Dynamic Load Balancer

The following is the TREE format representation of the Dynamic Load Balancer YANG model, which provides a structured framework for dynamically adjusting routing policies across network elements. It enables the network controller to optimize path selection for ingress, intermediate, and egress routers based on real-time service demands and network conditions. Supporting ECMP, UCMP, latency-aware and bandwidth-aware routing, and dynamic traffic steering, the model ensures precise traffic distribution for high-priority services, AI/ML workloads, real-time applications, and enterprise-critical traffic. The TREE representation below illustrates its hierarchical structure, detailing how policies are defined, traffic classes categorized, and network adjustments orchestrated dynamically.

module: dynamic-load-balancer
  +--rw load-balancer
     +--rw policy* [policy-id]
        +--rw policy-id          string
        +--rw service-group      string
        +--rw traffic-class      enumeration
        |     +-- real-time-services   "Prioritize low-latency paths for VoIP, streaming, and real-time applications."
        |     +-- ai-ml-flows          "Optimize for high-bandwidth AI/ML training workloads."
        |     +-- business-critical    "Ensure SLA compliance for enterprise applications."
        |     +-- best-effort          "Use available capacity without strict SLAs."
        +--rw path-adjustment
        |     +--rw adjustment-mode    enumeration
        |     |     +-- ecmp              "Use Equal-Cost Multi-Path (ECMP) routing."
        |     |     +-- ucmp              "Use Unequal-Cost Multi-Path (UCMP) routing."
        |     |     +-- latency-aware     "Prioritize low-latency circuits (e.g., private lines) for selected services."
        |     |     +-- bandwidth-aware   "Prefer high-bandwidth paths for high-throughput applications."
        |     |     +-- traffic-steering  "Redirect specific services based on congestion and resource availability."
        |     +--rw preferred-paths*      string
        |     +--rw ucmp-policy
        |        +--rw flow-ucmp-allocation* [flow-id]
        |        |     +--rw flow-id          string
        |        |     +--rw match-type       enumeration
        |        |         +-- ip-prefix      "Match by IP prefix."
        |        |         +-- dscp-value     "Match by DSCP marking."
        |        |         +-- application    "Match by application type (e.g., VoIP, AI, Video)."
        |        |     +--rw assigned-paths* [path-id]
        |        |         +--rw path-id       string
        |        |         +--rw percentage    uint8
        |        |             units "percent"
        |        |             description "Percentage of traffic assigned to this path under UCMP for this flow."
        |        |         +--rw min-bandwidth uint32
        |        |             units "Mbps"
        |        |             description "Minimum bandwidth required for this path to be used."
        |        |         +--rw max-utilization uint8
        |        |             units "percent"
        |        |             description "Avoid using this path if utilization exceeds this threshold."
        |        +--rw total-path-allocation* [allocation-id]
        |        |     +--rw allocation-id     string
        |        |     +--rw allocated-flows*  string
        |        |         description "Set of flow IDs assigned to this total allocation policy."
        |        |     +--rw total-percentage  uint8
        |        |         units "percent"
        |        |         description "Total percentage of available paths that should be used for these flows."
        |        |     +--rw min-paths         uint8
        |        |         description "Minimum number of paths to allocate for the flows."
        |        |     +--rw max-paths         uint8
        |        |         description "Maximum number of paths to allocate for the flows."
        |        |     +--rw preferred-links*  string
        |        |         description "Preferred paths to be used, if available."
        |        |     +--rw time-based-allocation
        |        |         +--rw start-time    string
        |        |             units "HH:MM"
        |        |             description "Start time for this allocation policy."
        |        |         +--rw end-time      string
        |        |             units "HH:MM"
        |        |             description "End time for this allocation policy."
        |        |         +--rw active-days*  enumeration
        |        |             +-- monday      "Apply policy on Monday."
        |        |             +-- tuesday     "Apply policy on Tuesday."
        |        |             +-- wednesday   "Apply policy on Wednesday."
        |        |             +-- thursday    "Apply policy on Thursday."
        |        |             +-- friday      "Apply policy on Friday."
        |        |             +-- saturday    "Apply policy on Saturday."
        |        |             +-- sunday      "Apply policy on Sunday."
        +--rw policy-duration        uint32
        |     units "seconds"
        |     description "Time duration (in seconds) for which this policy should be enforced."
        +--rw affected-routers* [router-id]
              +--rw router-id        string
              +--rw role             enumeration
              |     +-- ingress       "Ingress routers where traffic enters the network."
              |     +-- intermediate  "Intermediate routers managing transit paths."
              |     +-- egress        "Egress routers where traffic exits towards destinations."


5.3. Utilizing the YANG Module

A cloud-aware orchestrator plays a critical role in dynamically adjusting network-wide load balancing policies in response to real-time changes in cloud services, service demands, and network conditions. By leveraging the Dynamic Load Balancer YANG module, the orchestrator can systematically monitor service instances, cloud resource availability, and traffic patterns, triggering network adjustments to maintain optimal performance, efficiency, and SLA compliance. The module's structured data model enables seamless integration with network controllers, allowing policies to be dynamically updated using standardized JSON-based configurations.

The cloud-aware orchestrator continuously collects and analyzes telemetry data from both cloud resources (e.g., compute utilization, service instance scaling, inter-cloud bandwidth) and network state (e.g., latency, congestion, path availability). When a significant event occurs, such as an AI/ML workload scaling across multiple data centers, an enterprise application requiring higher reliability, or congestion affecting real-time services, the orchestrator automatically generates a JSON-based policy update using the Dynamic Load Balancer YANG module. This update specifies UCMP weight adjustments, dynamic path allocations, and time-sensitive routing policies, ensuring that traffic is efficiently distributed across the most suitable network paths.

By using JSON-based interactions, the orchestrator enables real-time policy enforcement, ensuring high adaptability to cloud and network dynamics. This approach enhances network resilience, prevents performance degradation, and maximizes resource utilization across both cloud and network infrastructure.
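
As a non-normative illustration, the orchestrator could deliver one of the policy documents shown in Section 5.3.1 to a network controller with a single RESTCONF request. The controller host name is a placeholder, and RESTCONF is only one possible binding (NETCONF could equally be used):

POST /restconf/data/dynamic-load-balancer:load-balancer HTTP/1.1
Host: network-controller.example.com
Content-Type: application/yang-data+json

{
  "dynamic-load-balancer:policy": [
    {
      "policy-id": "AI-ML-Training",
      "service-group": "AI-ML-Workloads",
      "traffic-class": "ai-ml-flows"
    }
  ]
}

The path-adjustment subtree and policy-duration leaf are omitted above for brevity; in practice they are carried exactly as shown in the Scenario 1 example. A later PUT or PATCH to /restconf/data/dynamic-load-balancer:load-balancer/policy=AI-ML-Training modifies the entry in place when conditions change.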

5.3.1. JSON Examples

Here is a JSON representation of Scenario 1: AI/ML Training Traffic Using 40% of All Available Paths.

{
  "dynamic-load-balancer": {
    "policy": [
      {
        "policy-id": "AI-ML-Training",
        "service-group": "AI-ML-Workloads",
        "traffic-class": "ai-ml-flows",
        "path-adjustment": {
          "adjustment-mode": "ucmp",
          "ucmp-policy": {
            "total-path-allocation": [
              {
                "allocation-id": "AI-Training-Paths",
                "allocated-flows": ["AI-ML-Cluster-1", "AI-ML-Cluster-2"],
                "total-percentage": 40,
                "min-paths": 2,
                "max-paths": 5,
                "preferred-links": ["Path-A", "Path-B", "Path-C"]
              }
            ]
          }
        },
        "policy-duration": 86400
      }
    ]
  }
}

The AI/ML training UCMP policy optimizes network resource allocation by reserving 40% of all available paths for high-bandwidth AI/ML workloads, ensuring efficient data transfer for distributed computing tasks. To maintain scalability and resilience, the policy guarantees a minimum of two paths at all times, with the flexibility to expand up to five paths dynamically as traffic demands increase. It prioritizes high-bandwidth routes such as Path-A, Path-B, and Path-C, optimizing throughput for large-scale data processing across data centers. This policy remains active for 24 hours (86400 seconds) before being refreshed, allowing for continuous adaptation to evolving AI/ML workload requirements while maximizing network efficiency and performance.

Here is a JSON representation of Scenario 2: VoIP Traffic Always Uses at Least Two Low-Latency Paths:

{
  "dynamic-load-balancer": {
    "policy": [
      {
        "policy-id": "VoIP-Low-Latency",
        "service-group": "Real-Time-Services",
        "traffic-class": "real-time-services",
        "path-adjustment": {
          "adjustment-mode": "ucmp",
          "ucmp-policy": {
            "total-path-allocation": [
              {
                "allocation-id": "VoIP-Priority-Paths",
                "allocated-flows": ["VoIP-Traffic"],
                "total-percentage": 20,
                "min-paths": 2,
                "max-paths": 3,
                "preferred-links": ["MPLS-Link-1", "MPLS-Link-2"]
              }
            ]
          }
        },
        "policy-duration": 86400
      }
    ]
  }
}


The VoIP traffic UCMP policy ensures consistent low-latency routing by allocating 20% of the total available paths exclusively for VoIP services, maintaining high call quality and reliability. To enhance redundancy, the policy guarantees that at least two paths are always available for VoIP traffic, with the flexibility to scale up to three paths if network conditions require additional capacity. By prioritizing MPLS links, the policy effectively minimizes latency and jitter, providing a stable and high-quality experience for real-time voice communication. This policy remains active for 24 hours (86400 seconds) before re-evaluation, ensuring continuous performance optimization based on real-time network conditions.

Here is a JSON representation of Scenario 3: Dynamic Path Allocations for Peak and Off-Peak Traffic, expressed as three total-path-allocation entries (the allocation-id values are illustrative), each carrying its own time-based-allocation window:

{
  "dynamic-load-balancer:load-balancer": {
    "policy": [
      {
        "policy-id": "Time-Based-Allocations",
        "service-group": "Time-Based-Priorities",
        "traffic-class": "business-critical",
        "path-adjustment": {
          "adjustment-mode": "ucmp",
          "ucmp-policy": {
            "total-path-allocation": [
              {
                "allocation-id": "Business-Hours-Enterprise",
                "allocated-flows": ["Enterprise-Apps"],
                "total-percentage": 50,
                "time-based-allocation": {
                  "start-time": "09:00",
                  "end-time": "17:00",
                  "active-days": ["monday", "tuesday", "wednesday", "thursday", "friday"]
                }
              },
              {
                "allocation-id": "Evening-Streaming",
                "allocated-flows": ["Streaming-Traffic"],
                "total-percentage": 70,
                "time-based-allocation": {
                  "start-time": "17:00",
                  "end-time": "23:00",
                  "active-days": ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]
                }
              },
              {
                "allocation-id": "Late-Night-AI-Training",
                "allocated-flows": ["AI-ML-Training"],
                "total-percentage": 80,
                "time-based-allocation": {
                  "start-time": "23:00",
                  "end-time": "06:00",
                  "active-days": ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]
                }
              }
            ]
          }
        },
        "policy-duration": 604800
      }
    ]
  }
}

The time-based UCMP policy dynamically adjusts network capacity allocation throughout the day to optimize traffic distribution based on real-time demand. During business hours (09:00-17:00), 50% of the total available paths are allocated to enterprise applications, ensuring reliable connectivity for cloud business tools, video conferencing, and mission-critical workloads. As network usage patterns shift in the evening (17:00-23:00), the policy reallocates 70% of paths to support streaming traffic, prioritizing video-on-demand services and online gaming that experience peak demand during this period. Finally, in the late-night hours (23:00-06:00), when network demand from interactive applications decreases, 80% of total paths are dedicated to AI/ML training workloads, enabling efficient processing of large-scale datasets across data centers. This policy remains valid for one week (604800 seconds) before refreshing, ensuring continuous adaptation to changing traffic patterns while maximizing network performance and resource efficiency.
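
The scenarios above exercise only the total-path-allocation branch of the module. For completeness, the following non-normative instance illustrates the flow-ucmp-allocation branch, in which explicit per-path weights are assigned to a single matched flow; all identifiers (policy-id, flow-id, path names) are placeholders:

{
  "dynamic-load-balancer:load-balancer": {
    "policy": [
      {
        "policy-id": "Video-Conferencing-Weights",
        "service-group": "Enterprise-Collaboration",
        "traffic-class": "business-critical",
        "path-adjustment": {
          "adjustment-mode": "ucmp",
          "ucmp-policy": {
            "flow-ucmp-allocation": [
              {
                "flow-id": "Video-Conf",
                "match-type": "application",
                "assigned-paths": [
                  {
                    "path-id": "Path-A",
                    "percentage": 60,
                    "min-bandwidth": 1000,
                    "max-utilization": 80
                  },
                  {
                    "path-id": "Path-B",
                    "percentage": 40,
                    "min-bandwidth": 500,
                    "max-utilization": 80
                  }
                ]
              }
            ]
          }
        },
        "policy-duration": 3600
      }
    ]
  }
}

In this example, 60% of the matched video-conferencing traffic is placed on Path-A and 40% on Path-B; a path is removed from the split if its utilization exceeds 80% or if it cannot provide the stated minimum bandwidth (in Mbps).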

6. Security Considerations

Security is a critical aspect when automating network adjustments in response to cloud service scaling. Several key areas should be addressed:

- Authentication and Authorization:

Mutual TLS Authentication: Use mutual authentication methods such as TLS certificates to verify the identities of both the cloud orchestrator and the network controller before any configuration commands are accepted.

OAuth or API Key-Based Access: For REST API-based communications, secure token-based authentication (e.g., OAuth 2.0) or unique API keys can be employed to validate requests from legitimate sources.

- Data Integrity:

Use TLS to encrypt communication channels, protecting the integrity of the transmitted data.

Employ checksums or hash functions on critical configuration messages to detect any tampering or unintended modifications during transit.

- Monitoring and Auditing:

Maintain detailed logs of all configuration changes initiated by cloud scaling events, including timestamps, source entities, and specific parameters modified.

Conduct periodic audits of the authorization policies, access logs, and configuration adjustments to ensure compliance with security policies and to detect any anomalies; a hypothetical audit record is sketched below.
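
For illustration only, a hypothetical audit record for a single policy change might capture the fields below. The record layout is not defined by any YANG module in this document and is shown purely as an example of the information worth retaining:

{
  "audit-record": {
    "timestamp": "2025-01-28T09:00:05Z",
    "initiator": "cloud-orchestrator-01",
    "target": "network-controller-03",
    "operation": "create",
    "resource": "/load-balancer/policy=AI-ML-Training",
    "changed-parameters": {
      "total-percentage": { "old": null, "new": 40 },
      "max-paths": { "old": null, "new": 5 }
    },
    "result": "success"
  }
}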

7. IANA Considerations

IANA is requested to register the YANG module namespaces for dynamic-bandwidth, dynamic-load-balancer, and dynamic-acl under the "YANG Module Names" registry at https://www.iana.org/assignments/yang-parameters. These namespaces should be registered as follows:

YANG Module            Namespace URI                                       Prefix  Reference
---------------------  --------------------------------------------------  ------  -------------
dynamic-bandwidth      urn:ietf:params:xml:ns:yang:dynamic-bandwidth       dbw     this document
dynamic-load-balancer  urn:ietf:params:xml:ns:yang:dynamic-load-balancer   dlb     this document
dynamic-acl            urn:ietf:params:xml:ns:yang:dynamic-acl             dacl    this document

8. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8345]
Clemm, A., Medved, J., Varga, R., Bahadur, N., Ananthakrishnan, H., and X. Liu, "A YANG Data Model for Network Topologies", RFC 8345, DOI 10.17487/RFC8345, March 2018, <https://www.rfc-editor.org/info/rfc8345>.

Acknowledgements

The authors would like to thank the following for discussions and providing input to this document: xxx.

Contributors

Authors' Addresses

Linda Dunbar (editor)
Futurewei
United States of America
ChongFeng Xie
China Telecom
China
Kausik Majumdar
Oracle
United States of America
Wu Bo
Huawei
China
Qiang Sun
China Telecom
China