Home Knowledge Base Small Language Model

Small Language Model

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Model Multiplicity for Adversarial Detection in Small Language Model Training on Edge Devices

arXiv:2606.07857v1 Announce Type: new Abstract: The rise of edge-based machine learning has enabled distributed adaptation of language models across mobile and IoT devices, offering privacy preservation and real-time responsiveness. However, distributed fine-tuning of language models on untrusted or heterogeneous edge nodes introduces new vulnerabilities.

arXiv CS 1d ago

SLMJury: Can Small Language Models Judge as Well as Large Ones?

arXiv:2606.07810v1 Announce Type: new Abstract: Large language models (LLMs) are widely used as judges for evaluating model outputs, but their high cost, latency, and opacity limit scalability. We introduce SLMJury, a framework for evaluating small language models (SLMs) as judges across two paradigms: closed-ended binary correctness and open-ended quality scoring. We benchmark 16 SLM judges (0.6B-14B parameters) from four model families across ten benchmarks: eight closed-ended tasks...

arXiv CS 1d ago

ProbeScale: Probing Analysis to Optimize Neural Scaling Laws for Efficient Small Language Model Inference

arXiv:2606.01806v1 Announce Type: new Abstract: Small Language Models (SLMs) offer a balance between capability and computational feasibility. Neural scaling laws inform their optimal training, suggesting that they possess rich internal representations that scale with their size. However, deploying even these SLMs can be challenging under strict resource constraints.

arXiv CS 8d ago

Learning What to Learn: Stage-Specific Data Sets for SFT-then-RL in Small Language Model Reasoning

Announce Type: new Abstract: Post-training Small Language Models (SLMs) for reasoning typically follows an SFT-then-RL pipeline, yet existing work rarely considers what data should be learned at each stage. We argue that data strategy should be aligned with the distinct roles of SFT and RL: SFT is better suited for acquiring not-yet-mastered reasoning skills, while RL is better suited for consolidating skills that the model can already partially access. Based on this principle, we propose a...

arXiv CS 6d ago

Domain-Adapted Small Language Models with Hybrid Post-Processing: Achieving Cost-Efficient, Low-Latency Multi-Label Structured Prediction via LoRA Fine-Tuning on Scarce Data

arXiv:2606.05781v2 Announce Type: replace Abstract: Deploying frontier large language models (LLMs) for domain-specific structured evaluation tasks incurs prohibitive latency, cost, and data-privacy overhead. We present a hybrid framework that fine-tunes a small language model (LLaMA 3.1 8B, 2.05% trainable parameters via LoRA) on only 219 curated examples and couples it with a deterministic rule-based postprocessing layer. Applied to multi-label compliance evaluation of conversational...

arXiv CS 1d ago

Efficient Skill Grounding via Code Refactoring with Small Language Models

Announce Type: new Abstract: Effective skill grounding is essential for deploying reusable skills in embodied agents, as even minor embodiment or environmental differences can render an entire skill incompatible. This challenge is particularly pronounced in embodied settings, where agents must operate in dynamic, partially observable environments without access to large language models (LLMs). In this setting, reliance on LLMs is impractical, while small language models (sLMs) remain...

arXiv CS 1d ago

Domain-Adapted Small Language Models with Hybrid Post-Processing: Achieving Cost-Efficient, Low-Latency Multi-Label Structured Prediction via LoRA Fine-Tuning on Scarce Data

arXiv:2606.05781v1 Announce Type: new Abstract: Deploying frontier large language models (LLMs) for domain-specific structured evaluation tasks often incurs substantial latency, cost, and data privacy overhead. We present a hybrid framework that combines a fine-tuned small language model (LLaMA 3.1 8B, with only 2.05% trainable parameters via LoRA) and a deterministic rule-based post-processing layer. Trained on just 219 curated examples, the system is applied to multi-label compliance...

arXiv CS 5d ago

VLM4VLA: Revisiting Vision-Language-Models in Vision-Language-Action Models

Announce Type: replace Abstract: Vision-Language-Action (VLA) models, which integrate pretrained large Vision-Language Models (VLM) into their policy backbone, are gaining significant attention for their promising generalization capabilities. This paper revisits a fundamental yet seldom systematically studied question: how VLM choice and competence translate to downstream VLA policies performance? We introduce VLM4VLA, a minimal adaptation pipeline that converts general-purpose VLMs into VLA...

arXiv CS 8d ago

LaCy: What Small Language Models Can and Should Learn is Not Just a Question of Loss

Announce Type: replace Abstract: Language models have consistently grown to compress more world knowledge into their parameters, but the knowledge that can be pretrained into them is upper-bounded by their parameter size. Especially the capacity of Small Language Models (SLMs) is limited, leading to factually incorrect generations. This problem is often mitigated by giving the SLM access to an outside source: the ability to query a larger model, documents, or a database.

arXiv CS 9d ago

T1: Tool-integrated Verification for Test-time Compute Scaling in Small Language Models

arXiv:2504.04718v2 Announce Type: replace Abstract: Recent studies have demonstrated that test-time compute scaling effectively improves the performance of small language models (sLMs). However, prior research has mainly examined test-time compute scaling with an additional larger model as a verifier, leaving verification by sLMs underexplored. In this work, we investigate whether sLMs can reliably verify the output candidates under test-time scaling.

arXiv CS 8d ago