RT-PG

Related Articles from SNS

Reusing Trajectories in Policy Gradients Enables Fast Convergence

Abstract: Policy gradient (PG) methods are a class of effective reinforcement learning algorithms, particularly for continuous control problems. Because they rely on fresh on-policy data, they are sample-inefficient, requiring $O(\epsilon^{-2})$ trajectories to reach an $\epsilon$-approximate stationary point.
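To make the "fresh on-policy data" point concrete, here is a minimal sketch of a plain score-function (REINFORCE-style) gradient estimate. It is not the trajectory-reuse method of the article. The setup is hypothetical: a one-step problem with a Gaussian policy of mean `theta`, a reward of `-(action - target)^2`, and illustrative names (`reinforce_gradient`, `train`, `target`, `sigma`) that do not come from the source.

```python
import random


def reinforce_gradient(theta, rng, batch_size=64, sigma=0.5, target=2.0):
    """Score-function gradient estimate from freshly sampled actions.

    Assumed toy setup: policy is N(theta, sigma^2), each "trajectory" is a
    single action, and the reward is -(action - target)^2.
    """
    total = 0.0
    for _ in range(batch_size):
        action = rng.gauss(theta, sigma)       # new sample from the current policy
        reward = -(action - target) ** 2
        score = (action - theta) / sigma ** 2  # d/dtheta of log N(action; theta, sigma^2)
        total += score * reward
    return total / batch_size


def train(steps=200, lr=0.05, seed=0):
    """Gradient ascent in which every update draws a new batch and discards it."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(steps):
        theta += lr * reinforce_gradient(theta, rng)
    return theta


if __name__ == "__main__":
    print(train())  # should end up near the assumed target of 2.0
```

Every call to `reinforce_gradient` samples a new batch and throws it away after one update, which is the sample inefficiency the abstract describes; reusing earlier trajectories would mean keeping those samples and correcting for the fact that they came from older policies.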

arXiv CS 6d ago