Trajectory Transformers and Decision Transformers
Related Articles from SNS
Generalizable Multi-Task Learning for Wireless Networks Using Prompt Decision Transformers
arXiv:2606.04328v1. Abstract: Future wireless networks demand rapid adaptation to highly heterogeneous environments and dynamic task configurations, necessitating a shift from conventional rule-based and optimization-driven radio resource management (RRM) toward artificial intelligence (AI)-driven RRM. AI-driven approaches can learn complex nonlinear relationships, generalize across diverse network conditions and enable real-time, scalable and autonomous decision-making....
Neuro-Symbolic Injection of LTLf Constraints in Autoregressive Reinforcement Learning Policies
Abstract: In this work we study offline reinforcement learning (RL) under temporally extended task constraints expressed in Linear Temporal Logic over finite traces (LTLf). Recently, transformer-based approaches such as Trajectory Transformers and Decision Transformers have been adopted to address RL as a sequence modeling problem. However, these methods optimize purely for reward and do not account for high-level temporal requirements.
Success Conditioning as Policy Improvement: The Optimization Problem Solved by Imitating Success
arXiv:2601.18175v2. Abstract: A widely used technique for improving policies is success conditioning, in which one collects trajectories, identifies those that achieve a desired outcome, and updates the policy to imitate the actions taken along successful trajectories. This principle appears under many names -- rejection sampling with SFT, goal-conditioned RL, Decision Transformers -- yet what optimization problem it solves, if any, has remained unclear. We prove that...
India–Nepal ties: Jaishankar calls for ‘decisive shift’; Kathmandu signals reset
NEW DELHI: India and Nepal share a “very special relationship” and there is now an opportunity to “decisively shift the trajectory” of bilateral engagement to realise its full potential, external affairs minister S Jaishankar said on Saturday during talks with Nepal foreign minister Shisir Khanal in New Delhi. In his opening remarks, Jaishankar underlined the depth of the partnership, saying, “India and Nepal share a very special relationship, one which is built on a strong foundation of...
The people who actually want AI to replace humanity
“I want AI to be a tool that allows human flourishing!” exclaimed Brad Carson, a former member of Congress. “There is an option out there where AI is just a tool for us.” We need to create a new humanism before the “AI successionists” win.
Post-Training LLMs as Better Decision-Making Agents: A Regret-Minimization Approach
Abstract: Large language models (LLMs) are increasingly deployed as "agents" for decision-making (DM) in interactive and dynamic environments. Yet, since they were not originally designed for DM, recent studies show that LLMs can struggle even in basic online DM problems, failing to achieve low regret or an effective exploration-exploitation tradeoff. To address this, we introduce Iterative Regret-Minimization Fine-Tuning (Iterative RMFT), a post-training procedure...
Wavelet Fourier Diffuser: Frequency-Aware Diffusion Model for Reinforcement Learning
Abstract: Diffusion probability models have shown significant promise in offline reinforcement learning by directly modeling trajectory sequences. However, existing approaches primarily focus on time-domain features while overlooking frequency-domain features, leading to frequency shift and degraded performance according to our observation. In this paper, we investigate the RL problem from a new perspective of the frequency domain.
MMSkills: Towards Multimodal Skills for General Visual Agents
Abstract: Reusable skills have become a core substrate for improving agent capabilities, yet most existing skill packages encode reusable behavior primarily as textual prompts, executable code, or learned routines. For visual agents, however, procedural knowledge is inherently multimodal: reuse depends not only on what operation to perform, but also on recognizing the relevant state, interpreting visual evidence of progress or failure, and deciding what to do next. We...
Nepal seeks 'transformative' ties, says no grudge against India
Nepal does not carry any old baggage against India and is determined to build a genuinely transformative relationship with its “close neighbor and most important partner,” said Nepal foreign minister Shisir Khanal as he met his counterpart S Jaishankar on Saturday, following another round of diplomatic strife over the border dispute. Welcoming Khanal, Jaishankar said earlier India’s clear message to the new Nepal government was of collaboration and cooperation as there’s an opportunity today...
Representation Signatures and Risk-Feedback Alignment in LLM Trading Agents
arXiv:2605.28850v2. Abstract: We study behavioral alignment and representation dynamics of large language model (LLM) agents in financial decision environments. TradeArena, an auditable trading-agent testbed with risk reports, execution simulation, memory, and replayable trajectories, lets us analyze how rationales, positions, and interventions evolve under market stress. Code and data artifacts are available through the...