State-Conditional Adversarial Learning
Related Articles from SNS
State-Conditional Adversarial Learning: An Off-Policy Visual Domain Transfer Method for End-to-End Imitation Learning
Announce Type: replace Abstract: We study visual domain transfer for end-to-end imitation learning in a realistic and challenging setting where target-domain data are strictly off-policy, expert-free, and scarce. We first provide a theoretical analysis showing that the target-domain imitation loss can be upper bounded by the source-domain loss plus a state-conditional latent KL divergence between source and target observation models. Guided by this result, we propose State-Conditional...
Online Learning in MDPs with Partially Adversarial Transitions and Losses
arXiv:2602.09474v2 Announce Type: replace Abstract: We study reinforcement learning in MDPs whose transition function is stochastic at most steps but may behave adversarially at a fixed subset of $\Lambda$ steps per episode. This model captures environments that are stable except at a few vulnerable points. We introduce conditioned occupancy measures, which remain stable across episodes even with adversarial transitions, and use them to design two algorithms.
SHIELD: Secure Hypernetworks for Incremental Expansion Learning Defense
Announce Type: replace Abstract: Continual learning under adversarial conditions remains an open problem, as existing methods often compromise either robustness, scalability, or both. We propose a novel framework that integrates Interval Bound Propagation (IBP) with a hypernetwork-based architecture to enable certifiably robust continual learning across sequential tasks. Our method, SHIELD, generates task-specific model parameters via a shared hypernetwork conditioned solely on compact task...
Scalable GANs with Transformers
Announce Type: replace Abstract: Scalability has driven recent advances in generative modeling, yet its principles remain underexplored for adversarial learning. We investigate the scalability of Generative Adversarial Networks (GANs) through two design choices that have proven to be effective in other types of generative models: training in a compact Variational Autoencoder latent space and adopting purely transformer-based generators and discriminators. Training in latent space enables...
T-GMP: Terrain-conditioned Generative Motion Priors for Versatile and Natural Humanoid Locomotion
arXiv:2606.06944v1 Announce Type: new Abstract: Achieving both anthropomorphic naturalness and robust terrain traversal remains a fundamental challenge in humanoid locomotion. Existing Reinforcement Learning (RL) approaches typically rely on fixed motion priors, limiting their adaptability to varying environments. We propose Terrain-conditioned Generative Motion Priors (T-GMP), a module that captures a terrain-conditioned latent motion manifold from a few expert state-terrain demonstrations...
AutoPilot: Learning to Steer High Speed Robust BFT
arXiv:2606.09120v1 Announce Type: new Abstract: Recent Byzantine Fault Tolerant (BFT) protocols achieve strong performance by combining the low-latency advantages of leader-based BFT protocols with the high-throughput benefits of DAG-based data dissemination. Despite exposing a wide spectrum of internal tunable parameters, these protocols typically rely on static and heuristic configurations, which leads to performance degradation under dynamic workloads, heterogeneous network conditions,...
Consistency Training Along the Transformer Stack
Announce Type: new Abstract: Consistency training encourages models to behave similarly across different contexts, and has shown promise for reducing misalignment. We broaden the scope of consistency training in two ways. First, we introduce two new internal consistency targets: MLP Consistency Training (MLPCT), which matches post-activation MLP states, and Attention Consistency Training (AttCT), which matches per-head attention distributions.
A Lecture Note on Offline RL and IRL, Part II: Foundations of Inverse Reinforcement Learning and Dynamic Discrete Choice Models
arXiv:2605.30843v1 Announce Type: new Abstract: In the forward reinforcement-learning problem, the reward is fixed and known; the learner is asked to find a good policy or value function. Here we turn the question around. Given offline data generated by an expert, can we recover the reward the expert was optimizing?
The back-channel bid to go soft on Maduro
When Marco Rubio was named secretary of State, many in both South Florida Republican circles and the American energy industry exulted. But one man who bridged both worlds knew he had a problem. A longtime investor in Venezuela, the main source of crude oil needed to produce the asphalt that had made his family rich, Harry Sargeant III kept relations with top officials in Caracas even as they seized most foreign oil holdings.