Home › Knowledge Base › Adaptive Weighting

Adaptive Weighting

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting

Announce Type: replace Abstract: On-policy reinforcement learning has become the dominant paradigm for reasoning alignment in large language models, yet its sparse, outcome-level rewards make token-level credit assignment notoriously difficult. On-Policy Distillation (OPD) alleviates this by introducing dense, token-level KL supervision from a teacher model, but typically applies this supervision uniformly across all rollouts, ignoring fundamental differences in signal quality. We propose...

arXiv CS 8d ago

Decision-Path Patterns as Tree Reliability Signals: Path-based Adaptive Weighting for Random Forest Classification

arXiv:2605.20716v5 Announce Type: replace Abstract: Random forests construct each tree with a different, randomised representation of the feature space. Their uniform voting cannot correct errors in regions where trees with incorrect representations probabilistically outnumber correct ones, even when the ensemble collectively holds enough correct information - a reducible error that this paper addresses. We propose using the structural pattern of each tree's decision path as an...

arXiv CS 8d ago

Robotic Policy Adaptation via Weight-Space Meta-Learning

Announce Type: new Abstract: Vision-Language-Action (VLA) models are emerging as a promising paradigm for robotic manipulation, enabling general-purpose policies trained from large corpora of demonstrations and action labels. However, adapting these models to new tasks still typically requires task-specific demonstrations, action annotations, and additional fine-tuning, making deployment costly and difficult to scale. We propose WIZARD, a weight-space meta-learning framework that sidesteps...

arXiv CS 2d ago

Theoretical Analysis of Sparse Optimization with Reparameterization, Weight Decay, and Adaptive Learning Rate

arXiv:2605.25134v3 Announce Type: replace Abstract: Sparse optimization is a fundamental challenge in various practical applications. A popular approach to sparse optimization is $\ell_p$ regularization. However, it may encounter optimization instability due to the unbounded gradients when $0<1$.

arXiv CS 9d ago

STaR-KV: Spatio-Temporal Adaptive Re-weighting for KV Cache Compression in GUI Vision-Language Models

Announce Type: new Abstract: Vision-language-model-based graphical user interface (GUI) agents have shown broad automation capabilities, yet deployment is bottlenecked by a key-value (KV) cache that grows linearly with interaction steps. For instance, UI-TARS-1.5-7B consumes 76 GB of GPU memory on merely five screenshots, approaching the capacity of mainstream 80 GB accelerators. Existing KV compression methods share two structural assumptions: aggregating visual-token importance into a...

arXiv CS 8d ago

LLMCodec: Adapting Video Codecs for Efficient Weight Compression of Large Language Models

Announce Type: new Abstract: The rapid development of large language models(LLMs) has led to remarkable advances in natural language processing. However, the increasing scale of these models introduces substantial challenges in terms of storage, transmission, and deployment. Though great efforts have been devoted to model compression and quantization, existing methods often rely on fine-tuning or calibration data, which exhibit limited generalization across different tensor types.

arXiv CS 5d ago

AAAC: Activation-Aware Adaptive Codebooks for 4-bit LLM Weight Quantization

arXiv:2605.08692v2 Announce Type: replace Abstract: Post-training weight-only quantization to 4 bits is widely used to reduce the memory and compute costs of large language model inference. Existing PTQ methods, such as AWQ and GPTQ, improve how weights are mapped onto a fixed 4-bit grid through scaling, clipping, or error compensation. To further improve accuracy, methods such as OmniQuant and QuIP\# uses gradient-assisted algorithms at the cost of hours of quantization time.

arXiv CS 2d ago

LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents

Announce Type: replace Abstract: Role specialization in multi-LLM agent systems is often realized via multi-LoRA, where agents share a pretrained backbone and differ only by lightweight adapters. Despite sharing base model weights, each agent independently builds and stores its own KV cache for the same long, tool-augmented trajectories, incurring substantial memory and compute overhead. Existing KV cache sharing methods largely overlook this multi-LoRA setting.

arXiv CS 8d ago

TaDA: Calibrated Probe Gating for Task-Domain LoRA Merging

arXiv:2606.05016v1 Announce Type: new Abstract: Combining a task LoRA adapter with a domain LoRA adapter into a single unified model is a practical yet largely unexplored challenge. Existing methods treat both adapters as symmetric peers, applying uniform weights across all layers.

arXiv CS 6d ago

Balanced LoRA: Removing Parameter Invariance to Accelerate Convergence

Announce Type: new Abstract: Low-Rank Adaptation (LoRA) is the most widely adopted method for fine-tuning large language models. Notably, LoRA is inherently overparameterized: multiple pairs of low-rank factors can yield the same adapted weight matrix. We show--both theoretically and empirically--that these pairs exhibit significantly different condition numbers.

arXiv CS 9d ago