QoS
Related Articles from SNS
Policy-Guided ML for Energy Savings: Cell On/Off Switching under Operator QoS Constraints in Real 5G Networks
arXiv:2606.05755v1 Announce Type: new Abstract: Energy efficiency is a critical concern in the deployment and operation of 5G networks, particularly due to the low utilization of 4G and 5G carriers during off-peak hours. While considerable research has focused on designing energy-efficient cell on/off switching strategies that avoid disrupting user connectivity, the integration of operator-specific policies to guarantee particular Quality of Service (QoS) levels has received limited...
A Practical AI-Driven Strategy for Cell On/Off Switching under Adaptable QoS Constraints
arXiv:2606.05019v1 Announce Type: new Abstract: The rapid expansion of 5G networks has intensified concerns over their sustainability, as denser Radio Access Network (RAN) deployments have increased overall power consumption. Although numerous studies have examined energy-efficient cell on/off switching, few have focused on approaches capable of dynamically adapting to operator-defined Quality of Service (QoS) requirements. In this paper, we propose a Long Short-Term Memory (LSTM)-based...
DriftSched: Adaptive QoS-Aware Scheduling under Runtime Token Drift for Multi-Tenant GPU Inference
arXiv:2606.02982v1 Announce Type: new Abstract: The rapid growth of large language model (LLM) inference services has increased the demand for efficient multi-tenant GPU scheduling. While modern inference runtimes such as vLLM improve throughput through continuous batching and optimized memory management, accurately estimating the runtime cost of heterogeneous inference requests remains a significant challenge.
Quantifying the Energy-Saving and QoS Trade-Off in Traffic Offloading for Real 4G/5G Scenarios
arXiv:2606.05752v1 Announce Type: new Abstract: Despite the potential for higher energy efficiency in 5G networks, current 5G Non-Standalone (NSA) deployments often operate suboptimally due to low utilization of 4G and 5G carriers during extended periods. Since base stations are the primary contributors to network energy consumption, implementing cell on/off switching and traffic offloading strategies is crucial for enhancing energy efficiency in current deployments. This paper investigates...
CHIMERA: A Flexible and Scalable 3.1 TOPS/W AI-MCU with Transformer Accelerator and 563 Gb/s Shared-L2 Memory Subsystem with QoS Guarantees
arXiv:2606.02358v1 Announce Type: new Abstract: We present Chimera, a flexible and scalable Microcontroller Unit (MCU) designed to accelerate real-time inference of rapidly evolving transformer-based models at the ultra-low-power edge (hundreds of mW). The chip, implemented in 22 nm FDX technology, integrates a transformer accelerator tightly coupled within a compute cluster featuring nine general-purpose RV32IMA cores. Scalability extends to the memory hierarchy through a novel L2 memory...
Beyond Greedy Chunking: SLO-Aware Sliding-Window Scheduling for LLM Inference
arXiv:2606.05933v1 Announce Type: new Abstract: With the rapid growth of interactive applications in large language model (LLM) online services, maintaining high system throughput while ensuring user-perceived latency has become a key issue in inference scheduling. Existing LLM service systems rely on coarse-grained output constraints, making it difficult to effectively handle resource contention among multiple requests, resulting in low resource utilization efficiency and limited support...
Robust Restless Multi-Armed Bandit for Data Center Flexibility Services Through Virtual Machine Scheduling
arXiv:2605.19116v2 Announce Type: replace Abstract: Energy demands from data centers have surged and stressed the grid in recent years. Electric grids require balancing supply and demand every second, motivating demand response (reduction) from large loads, including data centers. This can be achieved by rescheduling jobs on a physical machine.
BRAIN: Bayesian Reasoning via Active Inference for Agentic and Embodied Intelligence in Mobile Networks
arXiv:2602.14033v1 Announce Type: cross Abstract: Future sixth-generation (6G) mobile networks will demand artificial intelligence (AI) agents that are not only autonomous and efficient, but also capable of real-time adaptation in dynamic environments and transparent in their decision-making. However, prevailing agentic AI approaches in networking exhibit significant shortcomings in this regard.