How Telemetry Enables Real‑Time Precise Compute Resource Subscription in Compute Networks
As compute networks expand, real‑time, precise awareness of compute resources becomes critical; this article analyzes the key technologies, overall workflow, and core modules of a Telemetry‑based subscription solution that delivers accurate, low‑latency resource monitoring and scheduling across heterogeneous edge and cloud environments.
As compute networks grow, services increase in scale and complexity, and schedulers need a timely, accurate view of compute resources to place workloads on the most suitable nodes. This article examines a Telemetry‑based approach to precise, real‑time subscription of compute resources, covering its key technologies, overall workflow, and essential modules.
1.1 Telemetry Technology
Telemetry is a network monitoring technology in which devices push data to collectors instead of waiting to be polled. Compared with traditional pull‑based collection, such as RESTful interfaces, it delivers data sooner and at finer time granularity. It organizes data using YANG models, encodes it with Google Protocol Buffers (GPB), and transports it over gRPC, enabling efficient, high‑throughput data collection. Its main advantages are:
Push mode reduces device load, because the device no longer has to answer a separate request for every sample.
Sub‑second sampling intervals keep collected values from going stale between samples.
It scales to monitoring very large numbers of devices, where pull mode becomes a bottleneck.
Figure 1 illustrates the interaction between pull and push modes.
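The difference between the two modes can be sketched in a few lines of Python. This is a toy simulation under stated assumptions, not a model of any real device: the `Device` class, the tick-based clock, and the load values are all hypothetical. It illustrates how a slow poll can miss a short load spike, while a pushed stream captures it after a single subscription request.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Toy device whose CPU load (percent) changes on every tick."""
    loads: list
    requests_handled: int = 0  # request/response exchanges the device served

    def handle_poll(self, tick: int) -> int:
        # Pull mode: each sample costs the device one request.
        self.requests_handled += 1
        return self.loads[tick]

    def stream(self, interval: int) -> dict:
        # Push mode: one subscription request, then samples flow unprompted.
        self.requests_handled += 1
        return {t: self.loads[t] for t in range(0, len(self.loads), interval)}


def collect_by_pull(device: Device, poll_every: int) -> dict:
    """Collector polls the device every `poll_every` ticks."""
    ticks = range(0, len(device.loads), poll_every)
    return {t: device.handle_poll(t) for t in ticks}


# Hypothetical load trace with a one-tick spike at tick 2.
TRACE = [10, 20, 95, 30, 15, 10]

pulled = collect_by_pull(Device(TRACE), poll_every=3)  # samples ticks 0 and 3
pushed = Device(TRACE).stream(interval=1)              # samples every tick
```

In this trace the poll at ticks 0 and 3 never sees the spike at tick 2. Polling on every tick would catch it, but at the cost of six requests rather than one subscription.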
1.2 Compute Scheduling
The China Communications Standards Association defines a functional architecture for compute networks, separating service provision, control, resource, and management layers. Compute nodes—whether cloud, edge, or MEC—expose resources such as CPU, GPU, memory, storage, and network capacity, each tagged with capability attributes (type, architecture, location, capacity, etc.) to enable fine‑grained scheduling.
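To make the role of capability attributes concrete, the following is a minimal sketch of attribute-based node selection. The `ComputeNode` fields mirror the attributes named above (type, architecture, location, capacity), but the class, the `select_node` function, and the "most free capacity wins" rule are assumptions made for illustration; they are not taken from the standard's architecture.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ComputeNode:
    node_id: str
    resource_type: str  # e.g. "CPU" or "GPU"
    architecture: str   # e.g. "x86" or "ARM"
    location: str
    capacity: float     # total units of the resource
    used: float         # units currently in use

    @property
    def free(self) -> float:
        return self.capacity - self.used


def select_node(
    nodes: Iterable[ComputeNode],
    resource_type: str,
    architecture: str,
    needed: float,
) -> Optional[ComputeNode]:
    """Return the matching node with the most free capacity, or None."""
    candidates = [
        n for n in nodes
        if n.resource_type == resource_type
        and n.architecture == architecture
        and n.free >= needed
    ]
    # Hypothetical policy: prefer the node with the largest headroom.
    return max(candidates, key=lambda n: n.free, default=None)
```

A real scheduler would weigh more attributes, location and network capacity among them, but the filtering step depends on the same thing: accurate, current values for each node.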
2.1 Overall Solution Overview
The Telemetry‑based subscription solution consists of two parts:
Device side: raw data, data model, encoding format, transport protocol.
Network‑management side: collection module, storage module, analysis module.
Both sides use dynamic subscription: the management side dynamically subscribes to device metrics, and devices push the selected data in real time. The overall architecture is shown in Figure 3.
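Dynamic subscription can be illustrated with a small in-process sketch. It assumes a hypothetical `DeviceAgent` class and invented metric names, and it replaces the gRPC channel with a plain callback, so it shows only the control flow: the management side chooses the metrics at runtime, and the device pushes just that selection.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Sample = Dict[str, float]


class DeviceAgent:
    """Toy device that pushes only the metrics a collector subscribed to."""

    def __init__(self, metrics: Sample) -> None:
        self.metrics = metrics
        self._subs: Dict[str, Tuple[Set[str], Callable[[Sample], None]]] = {}

    def subscribe(
        self, sub_id: str, paths: Iterable[str], on_data: Callable[[Sample], None]
    ) -> None:
        """Create a subscription, or replace an existing one with the same id."""
        wanted = set(paths)
        unknown = wanted - set(self.metrics)
        if unknown:
            raise KeyError(f"unknown metric paths: {sorted(unknown)}")
        self._subs[sub_id] = (wanted, on_data)

    def unsubscribe(self, sub_id: str) -> None:
        self._subs.pop(sub_id, None)

    def push_once(self) -> None:
        """Send the current value of each subscribed metric to its subscriber."""
        for wanted, on_data in self._subs.values():
            on_data({path: self.metrics[path] for path in sorted(wanted)})
```

Because a subscription can be replaced or removed while the device is running, the management side can widen or narrow what it collects without reconfiguring the device.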
2.2 Device‑Side Details
The device side must provide compute resource data such as node identifier, resource type (CPU, GPU, NPU), architecture (x86, ARM), total and used CPU, memory, storage, throughput, and energy consumption. Data is modeled with YANG, encoded with GPB, and transmitted via gRPC.
Raw Data
Raw data includes metrics from central cloud, edge cloud, government cloud, IT cloud, and wireless devices.
Data Model
YANG defines the schema for configuration, state, RPC, and notifications, allowing both device and management sides to serialize and parse data consistently.
Encoding Format
GPB provides a language‑ and platform‑independent binary format with high performance and low bandwidth consumption.
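The bandwidth saving comes largely from the wire format, in which integers are written as base-128 varints behind a small numeric field key instead of as text with field names. The sketch below hand-encodes three integer fields this way and compares the result with JSON. The message layout (field numbers 1 to 3 and their meanings) is hypothetical, and a real deployment would use generated Protocol Buffers classes rather than hand-written encoding.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_uint_field(field_number: int, value: int) -> bytes:
    """Encode one varint field: key = (field_number << 3) | wire type 0."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_sample(node_id: int, cpu_used_pct: int, mem_used_mb: int) -> bytes:
    # Field numbers 1, 2, 3 are assumptions made for this example.
    return (
        encode_uint_field(1, node_id)
        + encode_uint_field(2, cpu_used_pct)
        + encode_uint_field(3, mem_used_mb)
    )


binary = encode_sample(node_id=42, cpu_used_pct=73, mem_used_mb=16384)
text = json.dumps({"node_id": 42, "cpu_used_pct": 73, "mem_used_mb": 16384})
```

Here `binary` is 8 bytes long, while the JSON text carries the field names in every message and is several times larger.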
Transport Protocol
gRPC, built on HTTP/2, carries the GPB‑encoded telemetry data, supporting both static and dynamic subscription modes.
2.3 Metric Extension
Existing Telemetry sampling files are extended to cover compute resources. Sample paths (e.g., Ddevm/attribute/Cloados/load) capture load ratio, location, type, precision, quality, and other attributes. Tables 1 and 2 list example paths and their sampling precision (ms).
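If each path is sampled at its own interval, the resulting push timeline can be sketched as follows. This assumes that the per-path precision in milliseconds acts as a sampling interval. The interval values and the second path name are hypothetical and are not taken from Tables 1 and 2.

```python
from typing import Dict, List, Tuple


def push_schedule(intervals_ms: Dict[str, int], window_ms: int) -> List[Tuple[int, str]]:
    """List (time_ms, path) push events in [0, window_ms), ordered by time."""
    events: List[Tuple[int, str]] = []
    for path, interval in intervals_ms.items():
        if interval <= 0:
            raise ValueError(f"interval for {path!r} must be positive")
        events.extend((t, path) for t in range(0, window_ms, interval))
    return sorted(events)


# Hypothetical intervals; only the first path appears in the text above.
INTERVALS_MS = {
    "Ddevm/attribute/Cloados/load": 100,
    "example/hypothetical/location": 250,
}

timeline = push_schedule(INTERVALS_MS, window_ms=300)
```

Fast-changing attributes such as load can then be sampled often, while slow-changing ones such as location are pushed rarely, which keeps the data volume in check.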
2.4 Network‑Side Details
The network‑management side comprises three modules:
Collection Module: dynamically subscribes to device metrics and parses incoming data.
Storage Module: stores parsed metrics in relational, key‑value, file, or NoSQL stores.
Analysis Module: processes the full set of compute resource data to support orchestration and visualization, and provides services to upper layers.
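The hand-off between the three modules can be sketched as a small pipeline. Everything here is a simplified stand-in under stated assumptions: messages arrive as already-decoded dictionaries rather than GPB over gRPC, storage is an in-memory dictionary, and the names `MetricStore`, `collect`, and `utilization_report` and the message fields are hypothetical.

```python
from collections import defaultdict


class MetricStore:
    """Storage module stand-in: an in-memory, per-node list of samples."""

    def __init__(self) -> None:
        self._samples = defaultdict(list)

    def save(self, node_id: str, sample: dict) -> None:
        self._samples[node_id].append(sample)

    def latest(self) -> dict:
        """Most recent sample for each node."""
        return {node: rows[-1] for node, rows in self._samples.items()}


def collect(message: dict, store: MetricStore) -> None:
    """Collection module stand-in: validate one pushed message and store it."""
    total = float(message["cpu_total"])
    used = float(message["cpu_used"])
    if total <= 0 or not 0 <= used <= total:
        raise ValueError(f"implausible sample: {message}")
    store.save(message["node_id"], {"cpu_total": total, "cpu_used": used})


def utilization_report(store: MetricStore) -> dict:
    """Analysis module stand-in: latest CPU utilization ratio per node."""
    return {
        node: sample["cpu_used"] / sample["cpu_total"]
        for node, sample in store.latest().items()
    }
```

An orchestrator or dashboard in the upper layers would consume the output of the analysis step, so its accuracy depends directly on how fresh the collected samples are.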
3 Typical Application Scenarios
Real‑time compute resource awareness is essential for edge and wireless compute, where traditional pull‑based data collection cannot meet latency requirements. Telemetry enables proactive push from edge nodes, supports digital twin simulations for visualizing and orchestrating compute resources, and powers large‑screen dashboards for holistic network operation monitoring.
4 Conclusion
Telemetry, already widespread in network device monitoring, has significant potential for compute resource collection. Implementing a Telemetry‑based subscription simplifies the compute awareness logic in compute networks, enhances precise scheduling, and improves network‑topology visualization, thereby advancing the maturity of real‑time compute resource management.