DeepMind Control

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Mean Flow Policy Optimization

Diffusion models have recently emerged as expressive policy representations for online reinforcement learning (RL). However, their iterative generative processes introduce substantial training and inference overhead. To overcome this limitation, we propose to represent policies using MeanFlow models, a class of few-step flow-based generative models, to improve training and inference efficiency over diffusion-based RL approaches.

arXiv CS 8d ago

Unifying Model-Free Efficiency and Model-Based Representations via Latent Dynamics

We present Unified Latent Dynamics (ULD), a novel reinforcement learning algorithm that unifies the efficiency of model-free methods with the representational strengths of model-based approaches, without incurring planning overhead. By embedding state-action pairs into a latent space in which the true value function is approximately linear, our method supports a single set of hyperparameters across diverse domains -- from continuous control with...

arXiv CS 6d ago

Reflex: Reinforcement Learning with Reflection Symmetry Exploitation in State-Based Continuous Control

Reinforcement learning has long struggled with poor sample efficiency. One promising approach to mitigate this problem is leveraging group-invariant Markov Decision Processes ($G$-invariant MDPs). Existing works in this direction have primarily focused on image-based RL and rotational symmetry such as $\mathrm{SO(2)}$, leaving state-based RL and reflection symmetry largely underexplored.

arXiv CS 5d ago

Reward Learning through Ranking Mean Squared Error

arXiv:2601.09236v3. Reward design remains a significant bottleneck in applying reinforcement learning (RL) to real-world problems. A popular alternative is reward learning, where reward functions are inferred from human feedback rather than manually specified. Recent work has proposed learning reward functions from human ratings rather than traditional binary preferences, enabling richer and potentially less cognitively demanding supervision.

arXiv CS 5d ago

China poaches more AI talent from the U.S. as it eyes the next 'super-app'

BEIJING — A former OpenAI researcher is now chief AI scientist for Tencent in China, and wants to build artificial general intelligence. It's a sign of a shift in the U.S.-China tech race. AI with human-level or above capabilities (AGI) has long been the goal of U.S. companies such as OpenAI, Anthropic and Alphabet, which acquired British startup DeepMind.

CNBC 5d ago

Google's Pentagon AI deal reportedly drove the DeepMind team to unionize

Google's UK-based DeepMind workers have voted to unionize. They're sending the company's management a letter asking it to recognize the Communication Workers Union and Unite the Union as their representatives, according to The Guardian. The workers voted back in April, driven by reports that the company was close to reaching a deal with the US Defense Department.

Engadget 36d ago

OpenAI and Anthropic Sign Letter to Prevent AI-Developed Biological Weapons

The CEOs of several major artificial intelligence companies are urging members of Congress to adopt new laws that would make it harder for bad actors to develop biological weapons using their technology. Google DeepMind’s Demis Hassabis, OpenAI’s Sam Altman, Anthropic’s Dario Amodei, and Microsoft AI’s Mustafa Suleyman are among the signatories on a public letter calling for laws requiring companies that sell synthetic DNA and RNA to screen customers and orders to prevent the misuse of...

Wired 6d ago

Daily briefing: Trial to ‘de-age’ cells treats first person

Nature 1d ago

Jeff Bezos Is Funding a Wild Hunt for the Brain’s ‘Core Algorithm’

Rob Williams knows how to pitch Jeff Bezos: You write a press release as if your product has already been built. Bezos reads it and gives a thumbs up or down. Williams went through this process a lot as an executive on Amazon’s “S-team,” in charge of software products such as Alexa, until his departure last fall.

Wired 6d ago

The people who actually want AI to replace humanity

“I want AI to be a tool that allows human flourishing!” exclaimed Brad Carson, a former member of Congress. “There is an option out there where AI is just a tool for us.” We need to create a new humanism before the “AI successionists” win.

Hacker News 10d ago