TDC

Related Articles from SNS

Convergence of Two-Timescale Markovian Stochastic Approximations with Applications in Reinforcement Learning

arXiv:2605.31172v1. Abstract: This work studies the convergence of two-timescale stochastic approximations (SA), a class of iterative algorithms that update two sets of parameters in fast and slow timescales respectively. Notable examples of two-timescale SA in reinforcement learning (RL) include temporal difference learning with gradient correction (TDC) and actor-critic methods.

arXiv CS 9d ago
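
To make the two-timescale idea in the abstract above concrete, here is a minimal sketch of a single TDC update with linear value-function features. It follows the commonly stated form of TDC and is not taken from the paper itself. The function name `tdc_step`, the helper `dot`, and the step-size names `alpha` (slow) and `beta` (fast) are illustrative assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length lists (hypothetical helper)."""
    return sum(a * b for a, b in zip(u, v))


def tdc_step(theta, w, phi, reward, phi_next, alpha, beta, gamma):
    """One TDC update for a linear value estimate theta . phi.

    theta is the slow iterate (value weights) and w is the fast
    auxiliary iterate. Assumption: alpha is chosen smaller than beta
    so that w moves on the faster timescale.
    """
    # TD error for the observed transition
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)
    phi_w = dot(phi, w)

    # Slow timescale: TD step plus the gradient-correction term
    new_theta = [
        t + alpha * (delta * p - gamma * pn * phi_w)
        for t, p, pn in zip(theta, phi, phi_next)
    ]
    # Fast timescale: w tracks the TD error projected onto the features
    new_w = [wi + beta * (delta - phi_w) * p for wi, p in zip(w, phi)]
    return new_theta, new_w


# Usage: two updates on the same transition with one-hot features
theta, w = [0.0, 0.0], [0.0, 0.0]
for _ in range(2):
    theta, w = tdc_step(theta, w, [1.0, 0.0], 1.0, [0.0, 1.0],
                        alpha=0.1, beta=0.5, gamma=0.9)
```

Both iterates are updated from the same sample. Only the step sizes separate the timescales, which is the structure the convergence analysis addresses.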

Information-Theoretic Requirements for Gradient-Based Task Affinity Estimation in Multi-Task Learning

Abstract: Multi-task learning shows strikingly inconsistent results -- sometimes joint training helps substantially, sometimes it actively harms performance -- yet the field lacks a principled framework for predicting these outcomes. We identify a fundamental but unstated assumption underlying gradient-based task analysis: tasks must share training instances for gradient conflicts to reveal genuine relationships. When tasks are measured on the same inputs, gradient...

arXiv CS 1d ago