Deterministic Networking (DetNet) Group                          N. Tang
Internet-Draft                                                  W. Zhang
Intended status: Informational               Beijing Jiaotong University
Expires: 9 January 2025                                      8 July 2024


   Resource orchestration and scheduling of Industrial Deterministic
                         Network and Computing
            draft-tang-detnet-network-resource-scheduling-00

Abstract

   Massive data processing and complex algorithms in the industrial
   Internet require large amounts of computing resources.  At the same
   time, real-time control and production safety require a reliable and
   deterministic network.  This draft proposes a service-oriented task
   processing framework that divides the execution of a service into two
   stages: resource orchestration of task flows and transmission
   scheduling of packets.  To obtain the optimal scheduling strategy, a
   constrained optimization problem is formulated that maximizes the
   success rate of transmission scheduling while trading off load
   balancing and resource utilization.  To improve network reliability,
   a TSN-5G converged network architecture is used to transmit data
   packets.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 9 January 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.


Tang & Zhang             Expires 9 January 2025                 [Page 1]

Internet-Draft          Deterministic Networking               July 2024


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Network Architecture  . . . . . . . . . . . . . . . . . . . .   2
   2.  Resource Orchestration Mechanism  . . . . . . . . . . . . . .   2
     2.1.  Traffic Model . . . . . . . . . . . . . . . . . . . . . .   3
     2.2.  Resource Orchestration Model  . . . . . . . . . . . . . .   3
   3.  Deterministic Transmission Scheduling Mechanism . . . . . . .   3
     3.1.  Transmission Network Model  . . . . . . . . . . . . . . .   3
     3.2.  Time Slot Resources Scheduling Mechanism  . . . . . . . .   3
     3.3.  Design of Constraints . . . . . . . . . . . . . . . . . .   4
   4.  Resource Orchestration and Scheduling Algorithm . . . . . . .   5
     4.1.  Cross-domain Computing Resource Orchestration
           Algorithm . . . . . . . . . . . . . . . . . . . . . . . .   5
     4.2.  Deterministic Transmission Scheduling Algorithm . . . . .   6
     4.3.  Relationship Between Orchestration and Scheduling
           Algorithm . . . . . . . . . . . . . . . . . . . . . . . .   6
   5.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   7
   6.  Security Considerations . . . . . . . . . . . . . . . . . . .   7
   7.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .   7
   8.  Informative References  . . . . . . . . . . . . . . . . . . .   7
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .   8

1.  Network Architecture

   Centralized control and distributed management are adopted to connect
   geographically dispersed computing resources, including CPU, GPU, and
   storage resources, and to coordinate their scheduling.  Computing
   power domains are connected in one of two ways: through a
   Time-Sensitive Networking (TSN) system, or through the TSN-5G
   converged network.  A deep reinforcement learning (DRL) algorithm is
   deployed on the central controller to make resource orchestration and
   transmission scheduling decisions.

2.  Resource Orchestration Mechanism


2.1.  Traffic Model

   In this draft, computing tasks are described at three levels:
   service, task flow, and data packet.  The task flows generated by
   different services are classified by their main resource
   requirement, so that each type of task flow can be matched with
   appropriate resources.  This on-demand adaptation of resources
   supports efficient management and utilization of multi-dimensional
   resources.  The decision variable of the resource orchestration
   stage is the address of the destination computing domain of each
   task flow.
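
   The classification step can be illustrated with a minimal Python
   sketch.  It assumes that each task flow carries normalized demands
   for CPU, GPU, and storage and that its class is the resource with
   the largest demand.  The names TaskFlow, dominant_resource, and
   classify are hypothetical and are not defined by this draft.

```python
from dataclasses import dataclass


@dataclass
class TaskFlow:
    flow_id: str
    cpu: float      # demand as a fraction of a reference capacity (assumed)
    gpu: float
    storage: float


def dominant_resource(flow: TaskFlow) -> str:
    """Return the name of the resource with the largest demand."""
    demands = {"cpu": flow.cpu, "gpu": flow.gpu, "storage": flow.storage}
    return max(demands, key=demands.get)


def classify(flows):
    """Group task flow identifiers by their main resource requirement."""
    classes = {"cpu": [], "gpu": [], "storage": []}
    for flow in flows:
        classes[dominant_resource(flow)].append(flow.flow_id)
    return classes
```

   Each class can then be offered to the computing domains that are
   strongest in the corresponding resource.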

2.2.  Resource Orchestration Model

   To cope with unexpected tasks and to ensure system stability and
   user experience, the model reserves computing and storage resources
   statistically.  The remaining computing and storage resources are
   available to task flows.

   This stage is subject to the following constraints: the reserved
   resources plus the consumed resources cannot exceed the resource
   capacity of the corresponding computing domain, and each task flow
   in a window is assigned to only one computing domain for processing.
   The objective function combines the load balancing and the resource
   utilization of each computing domain.
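
   As an illustration only, the two constraints and one possible form of
   the objective can be sketched in Python.  The sketch assumes a single
   resource dimension, reads "one computing domain" as exactly one, and
   combines mean utilization with the standard deviation of utilization
   through a hypothetical weight alpha.  None of these choices is fixed
   by this draft.

```python
from statistics import mean, pstdev


def domain_usage(assign, demand, n_domains):
    """Resource consumed by task flows in each computing domain.

    assign[i][j] is 1 if task flow i is placed in domain j, else 0.
    """
    usage = [0.0] * n_domains
    for row, d in zip(assign, demand):
        for j, placed in enumerate(row):
            usage[j] += d * placed
    return usage


def is_feasible(assign, demand, capacity, reserved):
    """Check the assignment and capacity constraints."""
    # Assumption: every task flow is placed in exactly one domain.
    if any(sum(row) != 1 for row in assign):
        return False
    usage = domain_usage(assign, demand, len(capacity))
    # Reserved plus consumed resources must fit in each domain.
    return all(r + u <= c for r, u, c in zip(reserved, usage, capacity))


def objective(assign, demand, capacity, alpha=0.5):
    """Higher is better: high mean utilization, low imbalance (assumed form)."""
    usage = domain_usage(assign, demand, len(capacity))
    util = [u / c for u, c in zip(usage, capacity)]
    return alpha * mean(util) - (1 - alpha) * pstdev(util)
```

   With two domains of capacity 10, a reserve of 2 in each, and two task
   flows of demand 2 placed in different domains, the assignment is
   feasible and perfectly balanced.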

3.  Deterministic Transmission Scheduling Mechanism

3.1.  Transmission Network Model

   For the transmission scheduling phase, a converged wireless and
   wired network framework that supports deterministic transmission is
   designed.  The computing domain servers are connected through the 5G
   system and the TSN system, and deterministic transmission between
   computing domains is achieved through the deterministic mechanisms
   and constraints described below.  The TSN system connects to the 5G
   system through two translator interfaces, the device-side TSN
   translator (DS-TT) and the network-side TSN translator (NW-TT).

3.2.  Time Slot Resources Scheduling Mechanism

   The transmission scheduling part determines the transmission path and
   time slot of the packets of each task flow.  According to
   [IEEE802.1Qbv], bounded and predictable delays can be provided by
   precisely controlling the forwarding queues of packets.  To realize
   joint scheduling of time slot resources in the TSN-5G converged
   network, the following three measures are adopted:


   *  Adopt mini time slots: The time slot of TSN is on the order of
      hundreds of microseconds, while that of 5G is on the order of
      milliseconds.  To synchronize the TSN system with the 5G system,
      the mini time slot is used as the scheduling time unit of the 5G
      network.  A mini time slot is very short (for example, 100-500
      microseconds), and its boundaries are aligned with the time
      synchronization period of the TSN network, which gives a seamless
      connection between the two networks.

   *  Uniform time slot length: The key to applying the cyclic queuing
      and forwarding (CQF) mechanism to the TSN-5G converged network is
      to choose the time slot length properly.  Packets sent by the
      upstream switch in one time slot must be received by the
      downstream switch in time to be forwarded in the next time slot.
      In the network architecture of this draft, a hop between two
      nodes falls into one of three cases: TSN to TSN, TSN to 5G, or 5G
      to TSN.  The time slot must therefore be longer than the
      transmission time of any hop in these three cases and, at the
      same time, should be the greatest common divisor of all packet
      cycles.

   *  Treat the 5GS as a logical TSN bridge: For packet transmission
      scheduling, the whole 5G system (5GS) is regarded as one logical
      bridge.  The forwarding delay inside the 5GS is less than or
      equal to the slot length T, and the DS-TT and NW-TT interfaces
      are regarded as the receiving and sending queues of CQF.  The
      end-to-end delay that a packet may experience can therefore be
      obtained from the CQF delay calculation formula.

   Together, these three measures realize unified, joint scheduling of
   time slot resources.
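
   The slot length rules above can be checked with a short Python
   sketch.  The cycle and per-hop transmission times in the usage note
   are made-up values in microseconds.  The delay bounds (h - 1) * T and
   (h + 1) * T are the commonly cited CQF bounds for h hops and are an
   assumption here, because this draft does not spell out the formula.
   All function names are hypothetical.

```python
from functools import reduce
from math import gcd


def candidate_slot_us(packet_cycles_us):
    """Greatest common divisor of all packet cycles, in microseconds."""
    return reduce(gcd, packet_cycles_us)


def slot_is_valid(slot_us, hop_tx_times_us):
    """The slot must be longer than the transmission time of every hop.

    hop_tx_times_us maps a hop type (TSN-TSN, TSN-5G, 5G-TSN) to its
    worst-case transmission time in microseconds.
    """
    return slot_us > max(hop_tx_times_us.values())


def cqf_delay_bounds_us(hops, slot_us):
    """Assumed CQF end-to-end delay bounds: ((h - 1) * T, (h + 1) * T)."""
    return (hops - 1) * slot_us, (hops + 1) * slot_us
```

   For packet cycles of 500, 1000, and 2000 microseconds the candidate
   slot is 500 microseconds.  It is valid if no hop, including a hop
   through the 5GS treated as a logical bridge, takes 500 microseconds
   or longer.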

3.3.  Design of Constraints

   To realize reliable transmission scheduling, the following
   constraints are designed:

   *  Data packet transmission: To ensure that data packets are
      transmitted smoothly, the total number of data packets in each
      queue in a time slot cannot exceed the maximum capacity of the
      queue.


   *  Two-scale interaction: The transmission scheduling phase must
      take into account the output of the resource orchestration phase,
      so that the packets of a task flow in the second stage are
      delivered to the server selected by the resource orchestration
      decision of the first stage.  A binary indicator variable is added
      to the problem model to enforce this relationship.  For a task
      flow, the variable is 1 only when the second-stage packets are
      transmitted to the server chosen in the first stage; otherwise it
      is 0.

   A task flow is scheduled successfully if its delay meets the
   requirement; otherwise, scheduling fails.  The objective of this
   stage is to maximize the scheduling success rate.

   The overall objective function is obtained by combining the
   objective functions of the two stages.
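
   A minimal Python sketch of the queue capacity check and of the
   success rate follows.  The schedule layout, one (flow, queue, slot)
   entry per packet, and the function names are assumptions made for
   this example.

```python
from collections import Counter


def queue_capacity_ok(schedule, queue_capacity):
    """Check that no queue holds more packets in a slot than it can store.

    schedule is a list of (flow_id, queue_id, slot) tuples, one per
    packet; queue_capacity maps queue_id to its maximum packet count.
    """
    load = Counter((queue_id, slot) for _, queue_id, slot in schedule)
    return all(n <= queue_capacity[q] for (q, _), n in load.items())


def success_rate(delays, deadlines):
    """Fraction of task flows whose delay meets the requirement."""
    met = sum(1 for f, limit in deadlines.items() if delays[f] <= limit)
    return met / len(deadlines)
```

   A scheduler would reject any candidate schedule for which
   queue_capacity_ok returns False and would compare the remaining
   candidates by success_rate.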

4.  Resource Orchestration and Scheduling Algorithm

   The global problem is a multi-objective optimization problem.  It is
   decoupled into a resource orchestration problem on a large time
   scale and a transmission scheduling problem on a small time scale,
   and a two-layer constrained reinforcement learning algorithm is
   proposed to solve it.

4.1.  Cross-domain Computing Resource Orchestration Algorithm

   The cross-domain resource orchestration subproblem is to maximize the
   overall resource utilization by optimizing resource orchestration
   decisions.

   This draft gives a resource orchestration procedure based on a greedy
   algorithm.  A greedy algorithm is simple and efficient, and it often
   gives a relatively good approximate solution, especially when the
   number of tasks is large.  The procedure is as follows:

   *  Sort and initialize: The task flows are sorted by their resource
      requirements and tolerance times.  Task flows with high demand
      and low tolerance time are given priority.

   *  Iterate over all tasks: Starting with the highest-priority task
      flow, traverse all computing domains to find the best domain that
      meets the resource requirements of that task flow.

   *  Improve resource utilization: Among the domains that meet the
      requirements, select the domain with the lowest resource
      utilization and allocate its resources to the task flow, so that
      load is balanced while overall resource utilization improves.


   The input of the subroutine is the available resources of all
   computing domains and the resource requirements of all computing
   tasks.  The output is a two-dimensional 0-1 resource scheduling
   decision matrix DR.  This subroutine improves resource utilization
   while achieving load balancing.
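
   The greedy procedure can be sketched in Python as follows.  This is
   an illustration under stated assumptions, not the authors'
   implementation: a single resource dimension is used, the priority is
   a weighted sum of normalized demand and urgency with a hypothetical
   parameter named weight, and a task flow that fits nowhere keeps an
   all-zero row in DR.

```python
def greedy_orchestrate(flows, capacity, weight=0.5):
    """Return the 0-1 decision matrix DR for the given task flows.

    flows is a list of (demand, tolerance_time) pairs and capacity is
    the available resource of each computing domain.  DR[i][j] is 1 if
    task flow i is placed in domain j.
    """
    if not flows:
        return []
    n, m = len(flows), len(capacity)
    used = [0.0] * m
    DR = [[0] * m for _ in range(n)]
    max_d = max(d for d, _ in flows) or 1.0
    max_t = max(t for _, t in flows) or 1.0

    def priority(i):
        # High demand and low tolerance time give a high priority.
        d, t = flows[i]
        return weight * d / max_d + (1 - weight) * (1 - t / max_t)

    for i in sorted(range(n), key=priority, reverse=True):
        d = flows[i][0]
        feasible = [j for j in range(m) if used[j] + d <= capacity[j]]
        if not feasible:
            continue  # no domain can host this task flow
        # Pick the feasible domain with the lowest utilization.
        j = min(feasible, key=lambda k: used[k] / capacity[k])
        used[j] += d
        DR[i][j] = 1
    return DR
```

   For three task flows with (demand, tolerance) of (4, 10), (6, 5), and
   (3, 20) and two domains of capacity 10, the second flow is placed
   first because it has the highest demand and the lowest tolerance.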

4.2.  Deterministic Transmission Scheduling Algorithm

   The deterministic transmission scheduling subproblem is to allocate
   time slot resources with the goal of maximizing the scheduling
   success rate.  It is modeled as a Markov decision process (MDP),
   because the future packet transmission of a task flow depends only
   on the current remaining packet volume and remaining delay, and not
   on the transmission history of previous packets.  The three elements
   of the MDP are as follows:

   *  State: The central controller collects service information and
      remaining slot capacity information from the computing domain
      servers and TSN switches.

   *  Action: Based on the observed state, the agent makes real-time
      transmission scheduling decisions that determine in which time
      slot each packet of a task flow is transmitted, so that the
      overall service delay requirements are met.

   *  Reward: After the agent takes action a in state s, it receives a
      reward that evaluates how good that action was.

   In the MDP, the goal of the agent is to find the time slot resource
   allocation policy that maximizes the cumulative discounted reward.
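
   The following Python sketch shows one way to write down such an MDP
   for a single task flow.  The state fields, the reward values of +1,
   0, and -1, and the discount factor gamma are assumptions chosen for
   illustration and are not specified by this draft.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class State:
    remaining_packets: int      # packets of the flow not yet scheduled
    deadline_slots: int         # slots available before the delay bound
    free: Tuple[int, ...]       # free queue places in each upcoming slot


def step(state: State, slot: int):
    """Place the next packet in `slot`; return (next_state, reward, done).

    The next state depends only on the current state and the action,
    which is the Markov property the text relies on.
    """
    invalid = slot < 0 or slot >= len(state.free)
    if invalid or slot >= state.deadline_slots or state.free[slot] == 0:
        return state, -1.0, True        # deadline missed or slot full
    free = list(state.free)
    free[slot] -= 1
    nxt = State(state.remaining_packets - 1, state.deadline_slots,
                tuple(free))
    if nxt.remaining_packets == 0:
        return nxt, 1.0, True           # whole flow scheduled in time
    return nxt, 0.0, False


def discounted_return(rewards, gamma=0.9):
    """Cumulative discounted reward of one episode."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total
```

   A learning agent would call step repeatedly, collect the rewards, and
   adjust its policy to increase discounted_return.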

4.3.  Relationship Between Orchestration and Scheduling Algorithm

   The two phases are coupled by constraints.  The resource
   orchestration phase must consider the subsequent transmission
   scheduling phase, so that tasks can be transmitted successfully
   under their delay constraints.  Conversely, the transmission
   scheduling phase must consider the output of the resource
   orchestration phase, so that each task is delivered to the correct
   server for computation.  To realize two-stage closed-loop control,
   the following measures are designed:

   *  Greedy sorting based on task flow delay requirements: The
      optimization goal of the first stage is to achieve load balancing
      and improve resource utilization.  To also account for the impact
      of resource orchestration on the transmission delay of the second
      stage, both the tolerance delay and the resource demand of each
      task flow are used as sorting features, and weight factors are
      added to set the relative importance of the two.

   *  Introduction of constraint variables: To ensure that the packets
      of a task flow in the second stage are transmitted to the server
      selected by the resource orchestration decision of the first
      stage, a binary indicator variable is added to the problem model.
      For a task flow, the variable is 1 only when the second-stage
      packets are transmitted to the server chosen in the first stage;
      otherwise it is 0.

   *  Feedback design of the reward function: The resource utilization
      achieved in the resource orchestration stage is included in the
      reward of the transmission scheduling algorithm, which captures
      the interaction between the two stages and realizes closed-loop
      control of task processing.
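
   The coupling variable and the reward feedback can be sketched in
   Python as below.  The additive form of the reward and the weight beta
   are hypothetical; this draft states only that the resource
   utilization of the first stage is included in the reward of the
   second stage.

```python
def coupling_indicator(delivered_domain, orchestrated_domain):
    """1 if packets reach the domain chosen in the first stage, else 0."""
    return 1 if delivered_domain == orchestrated_domain else 0


def scheduling_reward(delay, deadline, delivered_domain,
                      orchestrated_domain, utilization, beta=0.2):
    """Reward for one task flow in the transmission scheduling stage.

    A flow counts as successful only if it meets its delay requirement
    and is delivered to the orchestrated domain.  The utilization term
    feeds the first-stage result back into the second stage (assumed
    additive form).
    """
    x = coupling_indicator(delivered_domain, orchestrated_domain)
    success = 1.0 if x == 1 and delay <= deadline else 0.0
    return success + beta * utilization
```

   With beta set to 0, the reward reduces to the plain success
   indicator, which makes the effect of the feedback term easy to
   isolate in experiments.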

5.  IANA Considerations

   This section will be described later.

6.  Security Considerations

   This document should not affect the security of the Internet.

7.  Acknowledgements

   TBA

8.  Informative References

   [IEEE802.1Qbv]
              IEEE, "IEEE Standard for Local and metropolitan area
              networks -- Bridges and Bridged Networks - Amendment 25:
              Enhancements for Scheduled Traffic", IEEE 802.1Qbv-2015,
              DOI 10.1109/IEEESTD.2016.8613095, 18 March 2016,
              <https://doi.org/10.1109/IEEESTD.2016.8613095>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.


Authors' Addresses

   Nian Tang
   Beijing Jiaotong University
   Beijing
   100044
   China
   Email: 22120127@bjtu.edu.cn


   Weiting Zhang
   Beijing Jiaotong University
   Beijing
   100044
   China
   Email: wtzhang@bjtu.edu.cn

