Q-Network

Related Articles from SNS

DNQ: Deep Nash Q-Network for Partially Observable n-Player Games

arXiv:2606.06480v1 Announce Type: new Abstract: Many real-world competitive systems require multiple decision-makers to act simultaneously under shared constraints, limited information, and repeated interaction, as in auctions, resource allocation, and security competition. We study multi-turn simultaneous bidding as a controlled testbed for such problems and propose DNQ, a solver-in-the-loop equilibrium supervision framework for training bidding agents. DNQ alternates between trajectory...

arXiv CS 5d ago

A Human-Sensitive Controller: Adapting to Human Musculoskeletal Disorder-Related Constraints via Reinforcement Learning

Announce Type: replace Abstract: Work-Related Musculoskeletal Disorders continue to be a major challenge in industrial environments, leading to reduced workforce participation, increased healthcare costs, and long-term disability. This study introduces a human-sensitive robotic system aimed at reintegrating individuals with a history of musculoskeletal disorders into standard job roles, while simultaneously optimizing ergonomic conditions for the broader workforce. This research leverages...

arXiv CS 2d ago

AISC deployment in dynamic UAV-assisted MEC network: a reinforcement learning method based on heterogeneous graph attention neural network

Announce Type: new Abstract: Unmanned aerial vehicles-assisted mobile edge computing (UMEC) can execute compute-intensive and latency-critical artificial intelligence (AI) services, which can be provided by multiple UAVs collaborating in the air to perform inference tasks. Completing an AI service requires multiple inferences, each of which is implemented by an AI service chain consisting of multiple virtual network functions (VNFs). The application of AISC relies on an efficient AISC...

arXiv CS 5d ago

Temporally Encoded Double DQN for Proactive PRB Allocation in O-RAN Enabled Industrial Networks

arXiv:2605.30630v1 Announce Type: new Abstract: Fifth-generation (5G) wireless systems are increasingly adopted in smart manufacturing to support heterogeneous industrial workloads through services such as enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low-Latency Communication (URLLC). However, industrial traffic is inherently process-driven and temporally correlated. So, static or reactive schedulers in the Open Radio Access Network (O-RAN) are inadequate for such non-stationary...

arXiv CS 9d ago

Scalable Reinforcement Learning via Adaptive Batch Scaling

arXiv:2605.21557v2 Announce Type: replace-cross Abstract: Conventional wisdom holds that large-batch training is fundamentally incompatible with Reinforcement Learning (RL) - beyond a modest threshold, increasing batch sizes typically yields diminishing returns or performance degradation due to the inherent non-stationarity of the data distribution. We challenge this view by observing that non-stationarity is not a fixed property of RL, but evolves throughout training: early stages exhibit...

arXiv CS 5d ago

A Reliable Self-Organized Distributed Complex Network for Communication of Smart Agents

arXiv:2503.07702v3 Announce Type: replace Abstract: Collaboration among distributed agents is fundamental to many complex systems, particularly in communication networks where connectivity must be maintained under energy constraints. In this study, we utilize intelligent agents (nodes) trained through reinforcement learning techniques to establish connections with their neighbors, ultimately leading to the emergence of a large-scale communication cluster. Notably, there is no centralized...

arXiv CS 5d ago

Neetyabhas: A Framework for Uncertainty-Aware Public Policy Optimization in Rational Agent-Based Models

arXiv:2606.04562v1 Announce Type: new Abstract: Purpose: The WHO's COVID-19 non-pharmaceutical interventions (e.g., lockdowns, vaccinations) effectively curb transmission but impose heavy economic strains. Existing research often neglects individual behaviors and falsely assumes perfect infection tracking and flawless policy execution, failing to account for real-world uncertainties and errors.

arXiv CS 6d ago

Quantum-Inspired Reinforcement Learning for Low-Latency Intrusion Detection in V2X and Internet-of-Vehicles Networks

Announce Type: new Abstract: Smart cities increasingly depend on dense edge, IoT, and vehicular networks to deliver critical urban services, including traffic control, connected mobility, infrastructure monitoring, and energy management. In this ecosystem, the Internet of Vehicles (IoV) is central to intelligent transportation, enabling continuous communication among vehicles, roadside infrastructure, and cloud-edge platforms. This connectivity, however, also enlarges the attack surface and...

arXiv CS 1d ago

Jamming-Resilient PRB Reservation for Latency-Critical O-RAN Network Slicing

Announce Type: new Abstract: Open radio access network (O-RAN) architectures enable near real-time, software-driven control of network slicing through programmable xApps deployed on the near-real-time RAN Intelligent Controller (near-RT RIC). In industrial 5G downlink systems, adversarial jamming can abruptly reduce the effective physical resource block (PRB) capacity, triggering queue buildup and persistent latency violations, particularly in the presence of low spectral efficiency cell...

arXiv CS 9d ago

Learning to Bet for Horizon-Aware Anytime-Valid Testing

Announce Type: replace-cross Abstract: We develop horizon-aware anytime-valid tests and confidence sequences for bounded means under a strict deadline $N$. Using the betting/e-process framework, we cast horizon-aware betting as a finite-horizon optimal control problem with state space $(t, \log W_t)$, where $t$ is the time and $W_t$ is the test martingale value. We first show that in certain interior regions of the state space, policies that deviate significantly from Kelly betting are...

arXiv CS 7d ago