Home Knowledge Base Hyperparameter Optimization

Hyperparameter Optimization

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Efficient Hyperparameter Optimization for LLM Reinforcement Learning

arXiv:2606.03073v1 Announce Type: new Abstract: Reinforcement learning (RL) for large language models (LLMs) is highly sensitive to hyperparameter configurations, making hyperparameter optimization (HPO) essential yet computationally expensive. Existing multi-fidelity HPO methods remain inefficient for LLM RL due to the massive model scale and resource-intensive training cycles. In this paper, we propose Joint Fidelity Hyperparameter Optimization (JF-HPO), which simultaneously adapts both...

arXiv CS 7d ago

c-TPE: Tree-structured Parzen Estimator with Inequality Constraints for Expensive Hyperparameter Optimization

arXiv:2211.14411v5 Announce Type: replace Abstract: Hyperparameter optimization (HPO) is crucial for strong performance of deep learning algorithms and real-world applications often impose some constraints, such as on memory usage or latency, on top of the performance requirement. In this work, we propose constrained TPE (c-TPE), an extension of the widely-used versatile Bayesian optimization method, tree-structured Parzen estimator (TPE), to handle these constraints.

arXiv CS 8d ago

Predictable Scaling Laws of Optimal Hyperparameters for LLM Continued Pre-training

arXiv:2606.05610v1 Announce Type: new Abstract: The efficacy of continued pre-training for Large Language Models (LLMs) hinges upon hyperparameter configurations, such as learning rate and batch size. However, current practices often rely on heuristics or grid searches, leading to training instability and excessive costs. In this work, we first empirically discover that optimal hyperparameters follow stable and predictable scaling laws throughout the continued pre-training process.

arXiv CS 5d ago

Provably Reduced Sample Cost in Prior-Guided Hyperparameter Optimization

arXiv:2606.04866v1 Announce Type: new Abstract: Large-scale hyperparameter optimization (HPO) in automated machine learning (AutoML) consumes substantial computational resources, raising growing concerns about scalability and energy efficiency. Existing methods use prior information heuristically to accelerate both black-box and multi-fidelity settings, but they lack a characterization of how prior informativeness quantitatively reduces sample complexity. In this work, we provide the first...

arXiv CS 6d ago

Can LLMs Beat Classical Hyperparameter Optimization Algorithms?

Computer Science > Machine Learning [Submitted on 25 Mar 2026 (v1), last revised 17 Apr 2026 (this version, v5)] Title:Can LLMs Beat Classical Hyperparameter Optimization Algorithms?

Hacker News 1d ago

BBOmix: A Tabular Benchmark for Hyperparameter Optimization of Unsupervised Biological Representation Learning

Announce Type: new Abstract: The rapid advancement of high-throughput sequencing has led to large, high-dimensional omics datasets. Deep unsupervised learning architectures, particularly Autoencoders (AEs), are increasingly used for dimensionality reduction and representation learning in this domain. However, AEs are highly sensitive to architectural choices and hyperparameters, and unsupervised optimization typically relies on reconstruction loss, which may be a poor proxy for downstream...

arXiv CS 6d ago

LLM-Guided ANN Index Optimization for Human-Object Interaction Retrieval

Announce Type: new Abstract: Retrieval systems underpin modern AI applications -- spanning visual search, recommendation engines, and multi-modal question answering. Modern multi-stage retrieval systems require the joint optimization of highly coupled parameters, yet traditional hyperparameter optimization (HPO) methods -- including Tree-structured Parzen Estimators (TPE) and Gaussian Process Bayesian Optimization -- rely on an independence assumption that fundamentally prevents them from...

arXiv CS 5d ago

Iterated Population Based Training with Task-Agnostic Restarts

Announce Type: replace Abstract: Hyperparameter Optimization (HPO) can lift the burden of tuning hyperparameters (HPs) of neural networks. HPO algorithms from the Population Based Training (PBT) family are efficient thanks to dynamically adjusting HPs every few steps of the weight optimization.

arXiv CS 8d ago

Weight Decay Improves Language Model Plasticity

arXiv:2602.11137v2 Announce Type: replace Abstract: Large language models are typically trained in two broad phases: pretraining to produce a base model, followed by further training to improve downstream performance. However, hyperparameter optimization and scaling laws are studied primarily from the perspective of the base model's validation loss, overlooking a crucial model property: downstream adaptability. In this work, we study pretraining from the perspective of model plasticity, that...

arXiv CS 9d ago

S$^3$LDBO: A Snapshot Single-Loop Algorithm for Decentralized Bilevel Optimization

arXiv:2605.31311v1 Announce Type: cross Abstract: Networked AI systems increasingly rely on multiple agents that collaboratively learn and adapt models over communication networks. In such systems, bilevel formulations naturally arise in hyperparameter optimization, data cleaning, and meta-learning, but the repeated evaluation of gradients, Jacobians, and Hessians can impose a substantial computational burden on individual agents. To address this challenge, we propose Snapshot-SLDBO...

arXiv CS 9d ago