QoS

Related Articles from SNS

Policy-Guided ML for Energy Savings: Cell On/Off Switching under Operator QoS Constraints in Real 5G Networks

arXiv:2606.05755v1 Announce Type: new Abstract: Energy efficiency is a critical concern in the deployment and operation of 5G networks, particularly due to the low utilization of 4G and 5G carriers during off-peak hours. While considerable research has focused on designing energy-efficient cell on/off switching strategies that avoid disrupting user connectivity, the integration of operator-specific policies to guarantee particular Quality of Service (QoS) levels has received limited...

arXiv CS 5d ago

A Practical AI-Driven Strategy for Cell On/Off Switching under Adaptable QoS Constraints

arXiv:2606.05019v1 Announce Type: new Abstract: The rapid expansion of 5G networks has intensified concerns over their sustainability, as denser Radio Access Network (RAN) deployments have increased overall power consumption. Although numerous studies have examined energy-efficient cell on/off switching, few have focused on approaches capable of dynamically adapting to operator-defined Quality of Service (QoS) requirements. In this paper, we propose a Long Short-Term Memory (LSTM)-based...

arXiv CS 6d ago

DriftSched: Adaptive QoS-Aware Scheduling under Runtime Token Drift for Multi-Tenant GPU Inference

arXiv:2606.02982v1 Announce Type: new Abstract: The rapid growth of large language model (LLM) inference services has increased the demand for efficient multi-tenant GPU scheduling. While modern inference runtimes such as vLLM improve throughput through continuous batching and optimized memory management, accurately estimating the runtime cost of heterogeneous inference requests remains a significant challenge.

arXiv CS 7d ago

Quantifying the Energy-Saving and QoS Trade-Off in Traffic Offloading for Real 4G/5G Scenarios

arXiv:2606.05752v1 Announce Type: new Abstract: Despite the potential for higher energy efficiency in 5G networks, current 5G Non-Standalone (NSA) deployments often operate suboptimally due to low utilization of 4G and 5G carriers during extended periods. Since base stations are the primary contributors to network energy consumption, implementing cell on/off switching and traffic offloading strategies is crucial for enhancing energy efficiency in current deployments. This paper investigates...

arXiv CS 5d ago

CHIMERA: A Flexible and Scalable 3.1 TOPS/W AI-MCU with Transformer Accelerator and 563 Gb/s Shared-L2 Memory Subsystem with QoS Guarantees

arXiv:2606.02358v1 Announce Type: new Abstract: We present Chimera, a flexible and scalable Microcontroller Unit (MCU) designed to accelerate real-time inference of rapidly evolving transformer-based models at the ultra-low-power edge (hundreds of mW). The chip, implemented in 22 nm FDX technology, integrates a transformer accelerator tightly coupled within a compute cluster featuring nine general-purpose RV32IMA cores. Scalability extends to the memory hierarchy through a novel L2 memory...

arXiv CS 8d ago

Beyond Greedy Chunking: SLO-Aware Sliding-Window Scheduling for LLM Inference

arXiv:2606.05933v1 Announce Type: new Abstract: With the rapid growth of interactive applications in large language model (LLM) online services, maintaining high system throughput while ensuring user-perceived latency has become a key issue in inference scheduling. Existing LLM service systems rely on coarse-grained output constraints, making it difficult to effectively handle resource contention among multiple requests, resulting in low resource utilization efficiency and limited support...

arXiv CS 5d ago

Robust Restless Multi-Armed Bandit for Data Center Flexibility Services Through Virtual Machine Scheduling

arXiv:2605.19116v2 Announce Type: replace Abstract: Energy demands from data centers have surged and stressed the grid in recent years. Electric grids require balancing supply and demand every second, motivating demand response (reduction) from large loads, including data centers. This can be achieved by rescheduling jobs on a physical machine.

arXiv CS 2d ago

BRAIN: Bayesian Reasoning via Active Inference for Agentic and Embodied Intelligence in Mobile Networks

arXiv:2602.14033v1 Announce Type: cross Abstract: Future sixth-generation (6G) mobile networks will demand artificial intelligence (AI) agents that are not only autonomous and efficient, but also capable of real-time adaptation in dynamic environments and transparent in their decision-making. However, prevailing agentic AI approaches in networking exhibit significant shortcomings in this regard.

arXiv CS 1d ago