Qwen3-32B

Related Articles from SNS

Multimodal Approaches for Visually-Rich Document Type Classification: A Comparative Analysis

arXiv:2606.02162v1 Announce Type: new Abstract: Document type classification in visually rich documents remains challenging, as relevant information is distributed across textual, visual, and layout modalities. To capture this complexity, current approaches rely on diverse multimodal modeling strategies, resulting in heterogeneous architectures that complicate systematic comparison. This variability is also reflected in existing comparative studies, which often rely on heterogeneous...

arXiv CS 8d ago

Extreme Low-Bit Inference in Reasoning Models: Failure Modes and Targeted Recovery

arXiv:2606.02011v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) rely on long reasoning traces, making inference expensive. While low-bit quantization reduces per-token decoding cost, we show that aggressive 2-bit inference can fail to deliver end-to-end speedup because instability in the generation process inflates total token count. Instead of merely lowering answer accuracy, 2-bit quantization often produces much longer traces with repetitive loops, budget exhaustion, delayed...

arXiv CS 8d ago

REAL: Regression-Aware Reinforcement Learning for LLM-as-a-Judge

arXiv:2603.17145v2 Announce Type: replace Abstract: Large language models (LLMs) are increasingly deployed as automated evaluators that assign numeric scores to model outputs, a paradigm known as LLM-as-a-Judge. However, standard Reinforcement Learning (RL) methods typically rely on binary rewards (e.g., 0-1 accuracy), thereby ignoring the ordinal structure inherent in regression tasks; for instance, they fail to recognize that predicting 4 is significantly better than predicting 1 when the...

arXiv CS 9d ago

Geometry-Aware Hallucination Detection in Large Language Models

Announce Type: replace Abstract: Large language models (LLMs) frequently generate factually incorrect or unsupported content, commonly referred to as hallucinations. Prior work has explored decoding strategies, retrieval augmentation, and supervised fine-tuning for hallucination detection, while recent studies show that in-context learning (ICL) can substantially influence factual reliability. However, existing ICL demonstration selection methods often rely on surface-level similarity...

arXiv CS 6d ago

What Makes Interaction Trajectories Effective for Training Terminal Agents?

arXiv:2606.03461v1 Announce Type: new Abstract: Stronger code agents are commonly assumed to be superior teachers for post-training, yet this assumption remains poorly disentangled from task difficulty, harness design, and student capacity. We investigate this pedagogical link using Terminal-Lego, a scalable pipeline that transforms multi-domain real-world issues into environment-verified agentic tasks. Surprisingly, standalone performance does not dictate teaching efficacy: while Claude...

arXiv CS 7d ago

KVarN: Native vLLM backend for KV-cache quantization by Huawei

⚡️ Built for agentic and long-context workloads. 💡 KVarN delivers 3-5x more KV-cache capacity and up to ~1.3x the throughput of FP16, so you fit far longer contexts and serve more concurrent requests, with FP16-level accuracy. 🔌 Calibration-free, plug-and-play with vLLM.

Hacker News 6d ago

MLIPilot: LLM-Driven Auto-Research for Machine-Learned Interatomic Potentials

arXiv:2605.30889v1 Announce Type: new Abstract: Constructing production-quality machine-learned interatomic potentials (MLIPs) requires balancing accuracy, dynamical stability, and computational throughput under constraints that are not captured by a single training loss. We introduce MLIPilot, an auto-research framework in which tool-calling large language models propose hypotheses, edit MLIP training code, launch HPC jobs, and accept or revert changes using a fixed, physically constrained...

arXiv Physics 9d ago

Scaling Multi-Hop Training Data via Graph-Constrained Path Selection

arXiv:2605.31238v1 Announce Type: new Abstract: Endowing large language models with compositional reasoning over specialized documents requires multi-hop training data at scale, where such data rarely exists outside of curated benchmarks built on structured sources. To construct it directly from plain, unannotated text, existing methods ask a single teacher model to jointly discover an evidence path through a document and verbalize it as a question-answer pair. However, these methods degrade...

arXiv CS 9d ago

CVE-Factory: Scaling Expert-Level Agentic Tasks for Code Security Vulnerability

Announce Type: replace Abstract: Evaluating and improving the security capabilities of code agents requires high-quality, executable vulnerability tasks. However, existing works rely on costly, unscalable manual reproduction and suffer from outdated data distributions. To address these, we present CVE-Factory, the first multi-agent framework to achieve expert-level quality in automatically transforming sparse CVE metadata into fully executable agentic tasks.

arXiv CS 9d ago
