Training Efficiency
Related Articles from SNS
A Training-Efficient Transformer-Based Anti-Spoofing Network for Logical Access in ASVspoof 5
arXiv:2606.02980v1. Abstract: Synthetic and manipulated speech can reduce the reliability of automatic speaker verification systems, so anti-spoofing methods need to be both accurate and efficient in training and inference. This paper focuses on the ASVspoof 5 Track 1 closed condition, where standard cross-entropy training may not give enough attention to hard trials and is not directly aligned with ranking- and threshold-based evaluation metrics. We propose TFPARN, a...
POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation
arXiv:2603.05500v2. Abstract: Efficient and stable training of large language models (LLMs) remains a core challenge in modern machine learning systems. To address this challenge, Reparameterized Orthogonal Equivalence Training (POET), a spectrum-preserving framework that optimizes each weight matrix through orthogonal equivalence transformation, has been proposed. Although POET provides strong training stability, its original implementation incurs high memory...
sGPO: Trading Inference FLOPs for Training Efficiency in RLVR
Abstract: Standard Reinforcement Learning with Verifiable Rewards (RLVR) training allocates a fixed rollout budget to every query, without regard for what each query's difficulty means for the current policy. This leads to two symmetric failure modes: easy queries produce near-zero advantage because the policy already solves them, while unsolvable queries produce no signal because the policy never solves them. Both regimes waste training FLOPs without contributing to a...
Reformulate LLM Reinforcement Learning for Efficient Training under Black-box Discrepancy
Abstract: Reinforcement Learning (RL) has emerged as a pivotal post-training paradigm, yet it frequently suffers from unpredictable sub-optimum performance or even training collapses. Recent findings attribute these failures to a hidden train-inference discrepancy (or mismatch), stemming from the disparate underlying engines and architecture. We find that the training policy can actively self-correct such a discrepancy when provided with an appropriate learning signal.
Train at Moving Edge: Online-Verified Prompt Selection for Efficient RL Training of Large Reasoning Model
Abstract: Reinforcement learning (RL) has become essential for post-training large language models (LLMs) in reasoning tasks. While scaling rollouts can stabilize training and enhance performance, the computational overhead is a critical issue. In algorithms like GRPO, multiple rollouts per prompt incur prohibitive costs, as a large portion of prompts provide negligible gradients and are thus of low utility.
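The "negligible gradients" point can be illustrated with a minimal sketch. It assumes the commonly used group-normalized advantage, (reward - group mean) / (group std + eps), and binary verifiable rewards. The function names, the `eps` value, and the toy reward lists are hypothetical and are not taken from any of the papers listed here; the papers' own prompt-selection methods are not shown.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Group-normalized advantages: (r - mean) / (std + eps).

    Assumed formulation for illustration; eps only guards against
    division by zero when every rollout gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def is_informative(rewards):
    """A prompt yields a learning signal only if its rollouts disagree."""
    return len(set(rewards)) > 1


# Hypothetical rollout rewards (1 = verified correct, 0 = incorrect).
prompts = {
    "easy": [1, 1, 1, 1],      # policy already solves it
    "unsolved": [0, 0, 0, 0],  # policy never solves it
    "mixed": [1, 0, 1, 0],     # rollouts disagree
}

for name, rewards in prompts.items():
    print(name, is_informative(rewards), group_advantages(rewards))
```

Under these assumptions, the "easy" and "unsolved" groups get all-zero advantages, so their rollouts cost compute without moving the policy; only the "mixed" group contributes a gradient.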
Efficient and Training-Free Single-Image Diffusion Models
Submitted on 3 Jun 2026 (Computer Vision and Pattern Recognition). Abstract: We consider the problem of generating images whose internal structure -- defined by the distribution of patches across multiple scales -- matches that of a single reference image. Recent approaches address this problem by training a diffusion model on a single image.
Video-OPD: Efficient Post-Training of Multimodal Large Language Models for Temporal Video Grounding via On-Policy Distillation
arXiv:2602.02994v3. Abstract: Reinforcement learning has emerged as a principled post-training paradigm for Temporal Video Grounding (TVG) due to its on-policy optimization, yet existing GRPO-based methods remain fundamentally constrained by sparse reward signals and substantial computational overhead. We propose Video-OPD, an efficient post-training framework for TVG inspired by recent advances in on-policy distillation. Video-OPD optimizes trajectories sampled...
MaskAlign: Token-Subset Representation Alignment for Efficient Diffusion Training
arXiv:2606.08788v1. Abstract: Representation alignment with pretrained vision models has recently shown strong potential for accelerating diffusion transformer training. By aligning intermediate diffusion features with clean-image representations from self-supervised vision encoders, existing methods improve convergence and generation quality. However, such alignment also introduces a non-trivial constraint: diffusion models operate on noisy inputs whose usable information...
Efficient ASR Training with Conversations that Never Happened
arXiv:2606.03957v1. Abstract: Conversational ASR for lower-resource languages and niche domains is limited by the scarcity of domain-matched multi-speaker training data. We propose an augmentation pipeline that generates scenario-level dialogues with participant metadata, maps speaker attributes to TTS voice profiles, and assembles synthesized utterances into speaker-aware simulated conversations. We evaluated five LLM families under single-generator, fixed-budget mixture,...
SlimSearcher: Training Efficiency-Aware Web Agents via Adaptive Reward Gating
Abstract: Deep research agents have demonstrated remarkable capabilities in complex information-seeking tasks, yet this power comes at a steep computational cost. Driven by accuracy-focused training paradigms, current models adopt brute-force strategies characterized by blind tool dependency and performative reasoning, generating long, redundant trajectories that are far from necessary for resolving these tasks, leading to wasteful tool calls and excessive token...