Home Knowledge Base Decision Transformers

Decision Transformers

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Generalizable Multi-Task Learning for Wireless Networks Using Prompt Decision Transformers

arXiv:2606.04328v1 Announce Type: new Abstract: Future wireless networks demand rapid adaptation to highly heterogeneous environments and dynamic task configurations, necessitating a shift from conventional rule-based and optimization-driven radio resource management (RRM) toward artificial intelligence (AI)-driven RRM. AI-driven approaches can learn complex nonlinear relationships, generalize across diverse network conditions and enable real-time, scalable and autonomous decision-making....

arXiv CS 6d ago

Neuro-Symbolic Injection of LTLf Constraints in Autoregressive Reinforcement Learning Policies

Announce Type: new Abstract: In this work we study offline reinforcement learning (RL) under temporally extended task constraints expressed in Linear Temporal Logic over finite traces (LTLf). Recently, transformer-based approaches such as Trajectory Transformers and Decision Transformers have been adopted to address RL as a sequence modeling problem. However, these methods optimize purely for reward and do not account for high-level temporal requirements.

arXiv CS 1d ago

Success Conditioning as Policy Improvement: The Optimization Problem Solved by Imitating Success

arXiv:2601.18175v2 Announce Type: replace Abstract: A widely used technique for improving policies is success conditioning, in which one collects trajectories, identifies those that achieve a desired outcome, and updates the policy to imitate the actions taken along successful trajectories. This principle appears under many names -- rejection sampling with SFT, goal-conditioned RL, Decision Transformers -- yet what optimization problem it solves, if any, has remained unclear. We prove that...

arXiv CS 6d ago

Robust In-Context Reinforcement Learning Under Reward Poisoning Attacks

arXiv:2506.06891v3 Announce Type: replace Abstract: We study the corruption-robustness of in-context reinforcement learning (ICRL), focusing on the Decision-Pretrained Transformer (DPT, Lee et al., 2023). To address the challenge of reward poisoning attacks targeting the DPT, we propose a novel adversarial training framework, called Adversarially Trained DPT (AT-DPT). Our method simultaneously trains a population of attackers to minimize the true reward of the DPT by poisoning environment...

arXiv CS 1d ago

Beyond English benchmarks: clinical llm evaluation in Brazilian Portuguese

arXiv:2606.07853v1 Announce Type: new Abstract: Large Language Models are transforming the support for clinical decision and their application in real scenarios. Yet, most benchmarks are conducted in English, and cross-lingual evaluation is needed to tackle the language gaps in global access. We introduce ClinicalBr, the first bilingual benchmark for clinical decision built from real Brazilian case reports.

arXiv CS 1d ago

The people who actually want AI to replace humanity

“I want AI to be a tool that allows human flourishing!” exclaimed Brad Carson, a former member of Congress. “There is an option out there where AI is just a tool for us.” The people who actually want AI to replace humanity We need to create a new humanism before the “AI successionists” win.

Hacker News 10d ago

CritLens: Visual Analytics for Criteria Discovery in Review-Based Decision Making

Announce Type: new Abstract: We present CritLens, a visual analytics system that helps users build personalized multi-criteria decision models from review text. In everyday decisions -- choosing equipment, hotels, or restaurants -- evaluation criteria are either preset by platforms or generated by LLMs, leaving users unable to discover, adjust, or verify them against the underlying evidence. This is problematic because many preferences are latent: they surface only upon encountering specific...

arXiv CS 1d ago

ForensicConcept: Transferable Forensic Concepts for AIGI Detection

Announce Type: new Abstract: AI-generated image detectors achieve high accuracy on in-distribution data but often fail on unseen generators. A key obstacle to understanding this failure is the black-box nature of current detectors: they do not reveal which evidence drives their decisions. We propose ForensicConcept, a framework that extracts explicit forensic concepts from detectors and enables their transfer across backbones.

arXiv CS 2d ago

Self-Supervised Vision Transformers for CBCT-Based Detection of Temporomandibular Joint Osteoarthritis

new Abstract: Temporomandibular joint osteoarthritis (TMJ OA) is a prevalent degenerative condition whose osseous changes are often subtle on cone-beam CT (CBCT), making automated detection challenging. We study how well the DINO family of self-supervised vision transformers -- DINOv1, DINOv2, DINOv2+reg, and RAD-DINO (a radiology-pretrained variant) -- transfers to CBCT, asking how much backbone adaptation is needed and of what kind. We propose a simple slice-based pipeline using Vision...

arXiv CS 1d ago

Greener Than Humans? Environmental Attitudes in Large Language Models

arXiv:2606.02741v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used in sustainability-related decision support, reporting, and public communication, yet little systematic evidence exists on the environmental attitudes embedded in their outputs. This paper develops a benchmark for evaluating environmental cognition, affect, and behavioural recommendations in LLMs and applies it to 31 widely used proprietary and open-weight models. Drawing on questions from...

arXiv CS 7d ago