Workgroup:
Internet Research Task Force
Internet-Draft:
draft-oh-nmrg-ai-adp-00
Published:
July 2023
Intended Status:
Informational
Expires:
11 January 2024
Authors:
S-B. Oh
KSA
Y-G. Hong
Daejeon University
J-S. Youn
DONG-EUI University
H-K. Kahng
Korea University

Network management by automating distributed processing based on artificial intelligence

Abstract

This document discusses the use of AI technology to automate the management of computer network resources distributed across different locations. AI-based network management through automated distributed processing uses deep learning algorithms to analyze network traffic, identify potential issues, and take proactive measures to prevent or mitigate them. Network administrators can thereby manage and optimize their networks efficiently, improving network performance and reliability. AI-based network management also helps optimize network performance by identifying bottlenecks in the network and automatically adjusting network settings to enhance throughput and reduce latency. By implementing AI-based network management through automated distributed processing, organizations can improve network performance and reduce the need for manual network management tasks.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 11 January 2024.

Table of Contents

1. Introduction
2. Conventional Task Distributed Processing Techniques and Problems
  2.1. Challenges and Alternatives in Task Distributed Processing
  2.2. Considerations for Resource Allocation in Task Distributed Processing
3. Requirements of Conventional Task Distributed Processing
4. Automating Distributed Processing using Artificial Intelligence
5. IANA Considerations
6. Security Considerations
7. Acknowledgements
8. Informative References
Authors' Addresses

1. Introduction

Due to industrial digitalization, the number of devices connected to the network is increasing rapidly. As these devices become interconnected, the amount of data that must be processed in the network also grows.

Network management has traditionally been performed manually by administrators and operators. As networks grow, however, management becomes more complicated and the possibility of network malfunction increases, which can cause serious damage.

Therefore, this document considers the configuration of systems that use artificial intelligence (AI) technology for network management and operation, in order to adapt to dynamically changing network environments. In this regard, AI technologies maximize the utilization of network resources by providing resource access control and optimal task distribution based on the characteristics of the nodes that provide network functions for network management automation and operation [I-D.irtf-nmrg-ai-challenges].

2. Conventional Task Distributed Processing Techniques and Problems

2.1. Challenges and Alternatives in Task Distributed Processing

Conventional Task Distributed Processing Techniques refer to methods and approaches used to distribute computational tasks among multiple nodes in a network. These techniques are typically used in distributed computing environments to improve the efficiency and speed of processing large volumes of data.

Some common conventional techniques used in task distributed processing include load balancing, parallel processing, and pipelining. Load balancing distributes tasks across multiple nodes so that no single node carries a disproportionate share of the workload; parallel processing divides a single task into multiple sub-tasks that can be processed simultaneously; and pipelining breaks a task into smaller stages, each processed by a different node.
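
As a concrete illustration, the following Python sketch combines a simple least-loaded load-balancing rule with parallel processing of sub-tasks. It is only a minimal example; the node names, workloads, and sub-task contents are assumptions made for this document, not part of any existing system.

   # Minimal sketch: a least-loaded load-balancing rule combined with
   # parallel processing of sub-tasks.  Node names, workloads, and
   # sub-task contents are assumptions made for this example.
   from concurrent.futures import ThreadPoolExecutor

   nodes = {"node-a": 0, "node-b": 0, "node-c": 0}  # current workload per node

   def least_loaded_node():
       # Load balancing: pick the node with the smallest assigned workload.
       return min(nodes, key=nodes.get)

   def process_subtask(node, subtask):
       # Placeholder for the real computation carried out on the chosen node.
       return "%s processed on %s" % (subtask, node)

   subtasks = ["sub-task-%d" % i for i in range(6)]  # one task, split up

   with ThreadPoolExecutor(max_workers=3) as pool:   # parallel processing
       futures = []
       for subtask in subtasks:
           node = least_loaded_node()
           nodes[node] += 1                          # account for assigned work
           futures.append(pool.submit(process_subtask, node, subtask))
       for f in futures:
           print(f.result())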

However, conventional task distributed processing techniques also face several challenges and problems. One of the main challenges is ensuring that tasks are distributed evenly among nodes, so that no single node is overburdened while others remain idle. Another challenge is managing the communication between nodes, as this can often be a bottleneck that slows down overall processing speed. Additionally, fault tolerance and reliability can be problematic, as a single node failure can disrupt the entire processing workflow.

To address these challenges, new techniques such as edge computing and distributed deep learning are being developed and used in modern distributed computing environments. The optimal resource must be allocated according to the characteristics of the node that provides the network function. Cloud servers generally offer more powerful performance; however, transferring data from the local machine to the cloud requires traversing multiple access networks, which introduces high latency and energy consumption because a large number of packets must be processed and delivered. The MEC server is less powerful than the cloud server, but it can be more efficient in terms of overall delay and energy consumption because it is placed closer to the local machine [MEC.IEG006]. These architectures flexibly combine computing, communication, storage, and energy resources, so service requests must be handled in consideration of various performance trade-offs.
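
The following sketch illustrates this trade-off numerically: it estimates end-to-end delay and device-side energy for executing the same task locally, on a MEC server, or on a cloud server. All figures (data size, link rates, compute speeds, and power values) are assumptions made solely for illustration; they are not measurements from this document or from [MEC.IEG006].

   # Illustrative estimate of end-to-end delay and device energy for running
   # one task locally, on a MEC server, or on a cloud server.  All numbers
   # (data size, link rates, compute speeds, power values) are assumptions.

   DATA_BITS = 8e6   # task input size to transfer when offloading (1 MB)
   CYCLES = 2e9      # CPU cycles the task requires

   TARGETS = {
       # uplink rate (bit/s), CPU speed (cycles/s), radio power (W), CPU power (W)
       "local": {"rate": None, "cpu": 1e9,  "p_tx": 0.0, "p_cpu": 2.0},
       "mec":   {"rate": 50e6, "cpu": 5e9,  "p_tx": 1.0, "p_cpu": 0.0},
       "cloud": {"rate": 10e6, "cpu": 20e9, "p_tx": 1.0, "p_cpu": 0.0},
   }

   def estimate(name):
       t = TARGETS[name]
       tx_delay = 0.0 if t["rate"] is None else DATA_BITS / t["rate"]
       cpu_delay = CYCLES / t["cpu"]
       # The device spends energy transmitting (offload) or computing (local).
       energy = t["p_tx"] * tx_delay + t["p_cpu"] * cpu_delay
       return tx_delay + cpu_delay, energy

   for name in TARGETS:
       delay, energy = estimate(name)
       print("%-5s delay = %.2f s, device energy = %.2f J" % (name, delay, energy))

With these assumed numbers the cloud computes fastest but pays the largest transfer delay, while the MEC server gives the lowest overall delay and device energy, matching the qualitative argument above.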

Existing distributed processing techniques can be divided into the following cases, according to where the requested service is executed (the seven cases are also summarized in the sketch after Figure 7).

(1) All tasks are performed on the local machine.


      Local Machine
  +-------------------+
  | Perform all tasks |
  | on local machine  |
  |                   |
  |    +---------+    |
  |    |         |    |
  |    |         |    |
  |    |         |    |
  |    |         |    |
  |    +---------+    |
  |       Local       |
  +-------------------+

Figure 1: All tasks on local machine

(2) Some of the tasks are performed on the local machine and some are performed on the MEC server.


      Local Machine              MEC Server
  +-------------------+    +-------------------+
  |   Perform tasks   |    |   Perform tasks   |
  | on local machine  |    |   on MEC server   |
  |                   |    |                   |
  |    +---------+    |    |  +-------------+  |
  |    |         |    |    |  |             |  |
  |    |         |    |    |  |             |  |
  |    |         |    |    |  |             |  |
  |    |         |    |    |  |             |  |
  |    +---------+    |    |  +-------------+  |
  |       Local       |    |        MEC        |
  +-------------------+    +-------------------+

Figure 2: Some tasks on local machine and MEC server

(3) Some of the tasks are performed on the local machine and some are performed on the cloud server.


      Local Machine            Cloud Server
  +-------------------+    +-------------------+
  |   Perform tasks   |    |   Perform tasks   |
  | on local machine  |    |  on cloud server  |
  |                   |    |                   |
  |    +---------+    |    |  +-------------+  |
  |    |         |    |    |  |             |  |
  |    |         |    |    |  |             |  |
  |    |         |    |    |  |             |  |
  |    |         |    |    |  |             |  |
  |    +---------+    |    |  +-------------+  |
  |       Local       |    |       Cloud       |
  +-------------------+    +-------------------+

Figure 3: Some tasks on local machine and cloud server

(4) Some of the tasks are performed on the local machine, some on the MEC server, and some on the cloud server.


      Local Machine              MEC Server             Cloud Server
  +-------------------+    +-------------------+    +-------------------+
  |   Perform tasks   |    |   Perform tasks   |    |   Perform tasks   |
  | on local machine  |    |   on MEC server   |    |  on cloud server  |
  |                   |    |                   |    |                   |
  |    +---------+    |    |  +-------------+  |    |  +-------------+  |
  |    |         |    |    |  |             |  |    |  |             |  |
  |    |         |    |    |  |             |  |    |  |             |  |
  |    |         |    |    |  |             |  |    |  |             |  |
  |    |         |    |    |  |             |  |    |  |             |  |
  |    +---------+    |    |  +-------------+  |    |  +-------------+  |
  |       Local       |    |        MEC        |    |       Cloud       |
  +-------------------+    +-------------------+    +-------------------+

Figure 4: Some tasks on local machine, MEC server, and cloud server

(5) Some of the tasks are performed on the MEC server and some are performed on the cloud server.


        MEC Server              Cloud Server
  +-------------------+    +-------------------+
  |   Perform tasks   |    |   Perform tasks   |
  |   on MEC server   |    |  on cloud server  |
  |                   |    |                   |
  |    +---------+    |    |  +-------------+  |
  |    |         |    |    |  |             |  |
  |    |         |    |    |  |             |  |
  |    |         |    |    |  |             |  |
  |    |         |    |    |  |             |  |
  |    +---------+    |    |  +-------------+  |
  |        MEC        |    |       Cloud       |
  +-------------------+    +-------------------+

Figure 5: Some tasks on MEC server and cloud server

(6) All tasks are performed on the MEC server.


        MEC Server
  +-------------------+
  | Perform all tasks |
  |   on MEC server   |
  |                   |
  |    +---------+    |
  |    |         |    |
  |    |         |    |
  |    |         |    |
  |    |         |    |
  |    +---------+    |
  |        MEC        |
  +-------------------+

Figure 6: All tasks on MEC server

(7) All tasks are performed on the cloud server.


      Cloud Server
  +-------------------+
  | Perform all tasks |
  |  on cloud server  |
  |                   |
  |    +---------+    |
  |    |         |    |
  |    |         |    |
  |    |         |    |
  |    |         |    |
  |    +---------+    |
  |       Cloud       |
  +-------------------+

Figure 7: All tasks on cloud server
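
For reference, the seven placement cases of Figures 1 through 7 can be written down as sets of execution tiers. The sketch below is only a compact restatement of the figures; the tier names are chosen for this example.

   # Compact restatement of Figures 1-7: each case is a set of execution
   # tiers.  Tier names are chosen for this example only.
   PLACEMENTS = {
       1: {"local"},                  # Figure 1: all tasks on the local machine
       2: {"local", "mec"},           # Figure 2: local machine and MEC server
       3: {"local", "cloud"},         # Figure 3: local machine and cloud server
       4: {"local", "mec", "cloud"},  # Figure 4: all three tiers
       5: {"mec", "cloud"},           # Figure 5: MEC server and cloud server
       6: {"mec"},                    # Figure 6: all tasks on the MEC server
       7: {"cloud"},                  # Figure 7: all tasks on the cloud server
   }

   for case, tiers in sorted(PLACEMENTS.items()):
       print("Case (%d): tasks run on %s" % (case, ", ".join(sorted(tiers))))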

2.2. Considerations for Resource Allocation in Task Distributed Processing

In addition, the operating environment must be considered, in particular how much weight delay time and energy consumption carry, when deciding which resource is appropriate to handle a request for resource use. The relative importance of delay time and energy consumption depends on the service requirements for resource use, and the traffic flow needs to be adjusted according to those requirements, as illustrated in the sketch below.
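
The following sketch shows one simple way to express this weighting: each candidate resource is scored with a weighted sum of estimated delay and energy, and the weights change with the service profile. The candidate values and the weights are assumptions for illustration only; this document does not prescribe a specific cost function.

   # Weighted delay/energy cost per candidate resource.  The per-candidate
   # (delay, energy) values and the weights are assumptions for illustration;
   # this document does not prescribe a specific cost function.
   CANDIDATES = {
       # name: (estimated delay in seconds, estimated device energy in joules)
       "local": (2.00, 4.00),
       "mec":   (0.60, 0.50),
       "cloud": (0.90, 0.30),
   }

   def best_target(w_delay, w_energy):
       # Lower weighted cost is better; the weights reflect the service profile.
       cost = {name: w_delay * d + w_energy * e
               for name, (d, e) in CANDIDATES.items()}
       return min(cost, key=cost.get)

   # A latency-critical service weights delay heavily ...
   print("latency-critical ->", best_target(w_delay=0.9, w_energy=0.1))
   # ... while a battery-constrained device weights energy heavily.
   print("energy-constrained ->", best_target(w_delay=0.1, w_energy=0.9))

With these assumed values the latency-critical profile selects the MEC server, while the energy-constrained profile selects the cloud server, showing how the same candidates can be ranked differently by different service requirements.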

3. Requirements of Conventional Task Distributed Processing

The requirements of task distributed processing refer to the key elements that must be considered and met to effectively distribute computing tasks across multiple nodes in a network. These requirements include the even distribution of tasks among nodes, efficient communication between nodes, fault tolerance and reliability, and scalability.

Meeting these requirements is essential to the successful implementation and operation of task distributed processing systems. The effective distribution of tasks across multiple nodes in a network can improve overall system performance and efficiency, while also increasing fault tolerance and scalability.

4. Automating Distributed Processing using Artificial Intelligence

Automating distributed processing using AI refers to the use of AI technologies, such as machine learning and deep learning, to automate the distribution and processing of tasks across a network.

In traditional distributed processing systems, tasks are distributed manually or based on predetermined rules, which can lead to inefficiencies and suboptimal performance. However, by leveraging AI technologies, distributed processing can be automated in a way that maximizes performance and minimizes delays or bottlenecks.

AI algorithms can analyze network conditions and user demand in real time, allowing for dynamic task distribution and processing based on current network conditions. For example, an AI-based distributed processing system might use machine learning algorithms to analyze network traffic patterns and identify areas of congestion or bottlenecks. The system could then automatically reroute tasks to less congested areas of the network, reducing delays and improving overall performance.
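
As a toy illustration of such congestion-aware rerouting, the sketch below fits a small classifier on synthetic utilization samples and steers a new task onto the path with the lowest predicted congestion. The feature values, labels, and path names are fabricated for this example, and scikit-learn is used only as a convenient stand-in for whatever learning framework an operator might choose.

   # Toy congestion-aware rerouting: a small classifier is fitted on
   # synthetic utilization samples and new tasks are sent over the path
   # with the lowest predicted congestion probability.  Feature values,
   # labels, and path names are fabricated for this example.
   from sklearn.linear_model import LogisticRegression

   # Training samples: [mean link utilization, queueing delay in ms] -> 1 = congested
   X = [[0.20, 5], [0.35, 8], [0.50, 15], [0.70, 40], [0.85, 90], [0.95, 150]]
   y = [0, 0, 0, 1, 1, 1]
   model = LogisticRegression(max_iter=1000).fit(X, y)

   # Current measurements for two candidate paths toward the same MEC site.
   paths = {"path-a": [0.90, 120], "path-b": [0.40, 10]}

   def route_task(task):
       # Choose the path the model considers least likely to be congested.
       prob = {p: model.predict_proba([f])[0][1] for p, f in paths.items()}
       chosen = min(prob, key=prob.get)
       return "%s routed via %s (predicted congestion %.2f)" % (task, chosen, prob[chosen])

   print(route_task("inference-job-1"))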

In addition to optimizing task distribution, AI can also be used to optimize task processing. For example, AI algorithms can analyze the characteristics of individual tasks and distribute them to nodes in the network that are best suited to handle them, based on factors such as processing power or available memory. This can improve processing efficiency and reduce processing times.
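
A minimal sketch of this kind of capability-based matching is shown below. The node capacities, task requirements, and the "smallest node that fits" rule are assumptions made for the example; this document does not define a particular placement algorithm.

   # Capability-based matching: each task goes to the smallest node that
   # satisfies its processing-power and memory requirements.  Node capacities
   # and task requirements are assumptions made for this example.
   NODES = {
       # name: (compute capacity in GFLOPS, free memory in GB)
       "local":   (10,   4),
       "mec-1":   (100,  32),
       "cloud-1": (1000, 256),
   }

   TASKS = [
       # (name, required GFLOPS, required memory in GB)
       ("sensor-filtering",   5,   1),
       ("video-analytics",   80,  16),
       ("model-training",   600, 128),
   ]

   def assign(name, cpu_req, mem_req):
       # Feasible nodes meet both requirements; preferring the smallest
       # feasible node keeps larger nodes free for heavier tasks.
       feasible = [(cap, node) for node, (cap, mem) in NODES.items()
                   if cap >= cpu_req and mem >= mem_req]
       if not feasible:
           return "%s: no suitable node" % name
       _, node = min(feasible)
       return "%s -> %s" % (name, node)

   for task in TASKS:
       print(assign(*task))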

Overall, automating distributed processing using AI can improve network performance, reduce delays, and increase efficiency, making it a valuable tool for network management and operations.

To automate distributed processing using AI technology, various types of data can be used as training data, for example network traffic patterns, measurements of delay and energy consumption, indicators of congestion or bottlenecks, and the resource characteristics (such as processing power and available memory) of candidate nodes.

This data can be collected from real network environments and used for training AI through appropriate data collection and preprocessing processes.
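
As an example of the preprocessing step, the sketch below turns a few raw traffic samples into normalized feature vectors and labels suitable for training. The field names and values are assumptions made for this example; this document does not define a data schema.

   # Turning raw traffic samples into normalized training examples.  The
   # field names and values are assumptions; no schema is defined here.
   raw_samples = [
       {"bytes": 1200000, "packets": 900,  "latency_ms": 12, "dropped": 0},
       {"bytes": 8400000, "packets": 6100, "latency_ms": 95, "dropped": 1},
       {"bytes": 300000,  "packets": 220,  "latency_ms": 8,  "dropped": 0},
   ]

   def normalize(samples, keys):
       # Min-max scale each numeric field to [0, 1] so that fields with large
       # absolute values do not dominate training.
       lo = {k: min(s[k] for s in samples) for k in keys}
       hi = {k: max(s[k] for s in samples) for k in keys}
       return [[(s[k] - lo[k]) / ((hi[k] - lo[k]) or 1) for k in keys]
               for s in samples]

   features = normalize(raw_samples, ["bytes", "packets", "latency_ms"])
   labels = [s["dropped"] for s in raw_samples]   # 1 = the sample saw packet loss
   print(features)
   print(labels)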

5. IANA Considerations

There are no IANA considerations related to this document.

6. Security Considerations

When providing AI services, it is essential to consider security measures to protect sensitive data such as network configurations, user information, and traffic patterns. Robust privacy measures must be in place to prevent unauthorized access and data breaches.

Implementing effective access control mechanisms is essential to ensure that only authorized personnel or systems can access and modify the network management infrastructure. This involves managing user privileges, using authentication mechanisms, and enforcing strong password policies.

Maintaining the security and integrity of the training data used for AI models is vital. It is important to ensure that the training data is unbiased, representative, and free from malicious content or data poisoning. This is crucial for the accuracy and reliability of the AI models.

7. Acknowledgements

TBA

8. Informative References

[I-D.irtf-nmrg-ai-challenges]
François, J., Clemm, A., Papadimitriou, D., Fernandes, S., and S. Schneider, "Research Challenges in Coupling Artificial Intelligence and Network Management", Work in Progress, Internet-Draft, draft-irtf-nmrg-ai-challenges-00, <https://datatracker.ietf.org/doc/html/draft-irtf-nmrg-ai-challenges-00>.
[MEC.IEG006]
ETSI, "Mobile Edge Computing; Market Acceleration; MEC Metrics Best Practice and Guidelines", Group Specification ETSI GS MEC-IEG 006 V1.1.1 (2017-01), .

Authors' Addresses

SeokBeom Oh
KSA
Digital Transformation Center, 5
Teheran-ro 69-gil, Gangnamgu
Seoul
06160
South Korea

Yong-Geun Hong
Daejeon University
62 Daehak-ro, Dong-gu
Daejeon
34520
South Korea

Joo-Sang Youn
DONG-EUI University
176 Eomgwangno Busan_jin_gu
Busan
614-714
South Korea

Hyun-Kook Kahng
Korea University
2511 Sejong-ro
Sejong City