Pretrained Models

Related Articles from SNS

Dynamic Policy Learning for Legged Robot with Simplified Model Pretraining and Model-Homotopy-Inspired Transfer

arXiv:2512.24698v2 (announce type: replace). Abstract: Generating dynamic motions for legged robots remains a challenging problem. While reinforcement learning has achieved notable success in various legged locomotion tasks, producing highly dynamic behaviors often requires extensive reward tuning or high-quality demonstrations. Leveraging reduced-order models can help mitigate these challenges.

arXiv CS 6d ago

BLISS: A Lightweight Bilevel Influence Scoring Method for Data Selection in Language Model Pretraining

Announce type: replace. Abstract: Effective data selection is essential for pretraining large language models (LLMs), enhancing efficiency and improving generalization to downstream tasks. However, existing approaches often require leveraging external pretrained models, making it difficult to disentangle the effects of data selection from those of the external pretrained models. In addition, they often overlook the long-term impact of selected data if the model is trained to convergence,...

arXiv CS 8d ago

Data-Constrained Language Model Pretraining: Improved Regularization and Scaling Laws

Announce type: new. Abstract: Classical scaling laws for language model pretraining balance model size against training dataset size under a fixed compute budget, assuming abundant data and a single pass over the corpus. As training compute grows faster than the supply of natural language data, pretraining is likely to enter a data-constrained, compute-rich regime where models train for multiple epochs over a finite dataset. We study data-constrained pretraining along two axes, regularization...

arXiv CS 2d ago

Modular Monolingual Adaptation using Pretrained Language Models

arXiv:2606.06738v1 (announce type: new). Abstract: Building monolingual language models (LMs) for low-resource languages typically relies on adapting pretrained language models (PLMs) by finetuning the whole model on the target language. This approach is widely favored over training from scratch, as it enables effective knowledge transfer. Additionally, prior work has shown that using a language-specific tokenizer can enhance the adaptability.

arXiv CS 2d ago

Parameter-Efficient Fine-Tuning of Large Pretrained Models for Instance Segmentation Tasks

arXiv:2606.01947v1 (announce type: new). Abstract: Research and applications in artificial intelligence have recently shifted with the rise of large pretrained models, which deliver state-of-the-art results across numerous tasks. However, the substantial increase in parameters introduces a need for parameter-efficient training strategies. Despite significant advancements, limited research has explored parameter-efficient fine-tuning (PEFT) methods in the context of transformer-based models for...

arXiv CS 8d ago

Speedrunning Tabular Foundation Model Pretraining

arXiv:2606.03681v1 (announce type: new). Abstract: Pretraining cost is a major bottleneck for research on tabular foundation models, slowing the iteration cycle for new architectures, priors, and optimization ideas. Yet the community lacks a simple way to compare and accumulate pretraining speedups.

arXiv CS 7d ago

Partial Identification under Missing Data Using Weak Shadow Variables from Pretrained Models

Announce type: replace-cross. Abstract: Estimating population quantities such as mean outcomes from user feedback is fundamental to platform evaluation and social science, yet feedback is often missing not at random (MNAR): users with stronger opinions are more likely to respond, so standard estimators are biased and the estimand is not identified without additional assumptions. Existing approaches typically rely on strong parametric assumptions or bespoke auxiliary variables that may be...

arXiv CS 1d ago

Elastic ViTs from Pretrained Models without Retraining

arXiv:2510.17700v2 (announce type: replace). Abstract: Vision foundation models achieve remarkable performance but are only available in a limited set of pre-determined sizes, forcing sub-optimal deployment choices under real-world constraints. We introduce SnapViT: Single-shot network approximation for pruned Vision Transformers, a new post-pretraining structured pruning method that enables elastic inference across a continuum of compute budgets. Our approach efficiently combines gradient...

arXiv CS 9d ago

LargeMonitor: Monitoring Online Task-Free Continual Learning via Large Pretrained Models

arXiv:2606.09430v1 (announce type: new). Abstract: Online task-free continual learning (TFCL) requires intelligent agents to sequentially accumulate knowledge from an unbounded, non-stationary data stream under strict single-pass constraints and without any explicit task identifiers. Existing online TFCL paradigms primarily rely on parameter-efficient prompt tuning or dynamic structure expansion driven by training-coupled optimization dynamics, such as empirical loss fluctuations or evolving...

arXiv CS 1d ago

GenFT: A Generative Parameter-Efficient Fine-Tuning Method for Pretrained Foundation Models

arXiv:2506.11042v2 (announce type: replace). Abstract: Parameter-efficient fine-tuning (PEFT) has emerged as a resource-efficient strategy for adapting Pretrained Foundation Models (PFMs) by learning a small number of task-specific updates $\Delta W$. Existing methods often learn $\Delta W$ largely independently of pretrained weights $W_0$, or exploit $W_0$ mainly through initialization or simple reparameterization. To further leverage the structural information encoded in $W_0$, we propose...

arXiv CS 5d ago
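
Note on the PEFT setup mentioned in the GenFT abstract: the idea of keeping pretrained weights $W_0$ frozen and learning only a small update $\Delta W$ can be illustrated with a generic low-rank (LoRA-style) parameterization, $\Delta W = BA$. This is a minimal sketch of that general idea only, not GenFT's method, which the truncated abstract does not describe. The function names (`matmul`, `adapted_weight`, `trainable_params`) and the low-rank factorization itself are assumptions chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def adapted_weight(w0: Matrix, b: Matrix, a: Matrix) -> Matrix:
    """Return W_0 + Delta W, with Delta W = B A (assumed low-rank form).

    w0 is the frozen pretrained weight (d_out x d_in); only b (d_out x r)
    and a (r x d_in) would be trained in this illustrative setup.
    """
    delta_w = matmul(b, a)
    return [[w + d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w0, delta_w)]


def trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Number of trainable entries in B and A for a rank-r update."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    w0 = [[1.0, 0.0], [0.0, 1.0]]   # stand-in for a pretrained weight
    b = [[1.0], [2.0]]              # d_out x r, with r = 1
    a = [[3.0, 4.0]]                # r x d_in
    print(adapted_weight(w0, b, a))
    print(trainable_params(1024, 1024, 8), "vs", 1024 * 1024)
```

Under these assumptions, a rank-8 update to a 1024 x 1024 layer trains 16,384 values instead of the 1,048,576 in the full matrix, which is the sense in which the number of task-specific updates is small.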