Linear Router

Related Articles from SNS

IR3DE: A Linear Router for Large Language Models

Foundational Large Language Models (LLMs) demonstrate proficiency on a wide range of general tasks, and achieve remarkable results on various specialized tasks via domain-expert LLMs. With the ever-growing list of available LLMs, inference routers are being proposed to select the most appropriate LLM for each prompt. However, existing routing methods either optimize cost across weak-to-strong generalist LLMs or require substantial training to support...

arXiv CS 5d ago

LayerRoute: Input-Conditioned Adaptive Layer Skipping via LoRA Fine-Tuning for Agentic Language Models

arXiv:2606.01838v1. Agentic language model systems alternate between two structurally distinct step types: structured tool calls (short, deterministic, low perplexity) and open-ended planning/reasoning steps (long, complex, high perplexity). Despite this heterogeneity, current inference systems apply identical compute to every step. We introduce LayerRoute, a lightweight adapter that learns to selectively skip transformer blocks on a per-input basis.

arXiv CS 8d ago

STAR: Rethinking MoE Routing as Structure-Aware Subspace Learning

Mixture-of-Experts (MoE) scales model capacity efficiently by selectively routing inputs to a specialized subset of experts. However, input-expert specialization, the core motivation of MoE, critically depends on whether the router is actually aware of input structure. In practice, MoE routing is typically implemented as a shallow linear projection with limited awareness of input representation, which often leads to unstable routing.

arXiv CS 1d ago

When Model Merging Breaks Routing: Training-Free Calibration for MoE

Model merging has emerged as a cost-effective approach for consolidating the capabilities of multiple LLMs without retraining. However, existing merging techniques, largely based on linear parameter arithmetic or optimization, struggle when applied to Mixture-of-Experts (MoE) architectures. We identify a critical failure mode in MoE merging, termed routing breakdown, in which the merged router fails to dispatch tokens to suitable experts.

arXiv CS 7d ago

MeshGuard: MUD-Based Network Access Control for Large-Scale Thread-Powered IoT Networks

The IETF standard Manufacturer Usage Description (MUD) enables manufacturers to equip IoT devices with certified URLs that provide traffic profiles for those devices, helping administrators enforce network access control. However, MUD assumes devices operate on full IP stacks and therefore does not account for constrained IoT devices running Thread, the dominant low-power mesh networking standard, which lacks complete TCP/IP functionality. While prior work...

arXiv CS 9d ago

Cost-Aware Query Routing in RAG: Empirical Analysis of Retrieval Depth Tradeoffs

Retrieval-augmented generation (RAG) faces a fundamental three-way tension: deeper retrieval improves factual grounding but inflates token costs and end-to-end latency. Static retrieval configurations cannot resolve this tension across heterogeneous query workloads: simple definitional queries waste budget on unnecessary context, while complex analytical prompts are underserved by shallow retrieval. This paper introduces Cost-Aware RAG (CA-RAG), a...

arXiv CS 7d ago

Brume is a 24-voice multi-timbral desktop synth for the CM5

FM: six operators across twelve algorithm topologies, per-op ratio and level, global feedback, a per-voice FM-index envelope, and a voice-tail state-variable filter with its own envelope; DX-style FM with subtractive shaping on the way out. A desktop multi-timbral music machine with four synthesis engines, a 10″ touch surface, and one cable to your DAW. Brume runs four synthesis engines with a shared voice tail (state-variable filter, amp envelope, modulation router), so patches stay...

Hacker News 7d ago