LMETRIC

Related Articles from SNS

Simple is Better: Multiplication May Be All You Need for LLM Request Scheduling

Abstract: High-quality LLM request scheduling requires meeting two key objectives: ensuring the routed instance has KVCache to accelerate request execution, and ensuring that the workload is balanced across instances. Achieving both objectives is challenging because pursuing one may compromise the other. Current approaches use various combinators (e.g., linear combinations) to compute a scheduling score that combines indicators for the two objectives.

arXiv CS 2d ago
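
To make the abstract's "linear combination" of the two indicators concrete, here is a minimal sketch of such a scheduling score. It is an illustration only: the `Instance` fields, the weight `alpha`, the normalization of both indicators to [0, 1], and the sample numbers are all assumptions, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    cache_hit_ratio: float  # assumed: share of the request's prefix already in KVCache, in [0, 1]
    load: float             # assumed: normalized current load, in [0, 1]


def linear_score(inst: Instance, alpha: float) -> float:
    # Hypothetical linear combinator: reward KVCache hits, penalize load.
    # alpha weights cache affinity against load balance.
    return alpha * inst.cache_hit_ratio - (1.0 - alpha) * inst.load


def pick_instance(instances: list[Instance], alpha: float = 0.5) -> Instance:
    # Route the request to the instance with the highest score.
    return max(instances, key=lambda inst: linear_score(inst, alpha))


# Made-up sample data for illustration.
instances = [
    Instance("a", cache_hit_ratio=0.9, load=0.95),  # warm cache, heavily loaded
    Instance("b", cache_hit_ratio=0.2, load=0.10),  # cold cache, nearly idle
    Instance("c", cache_hit_ratio=0.6, load=0.40),  # in between
]

for alpha in (0.1, 0.5, 0.9):
    print(alpha, pick_instance(instances, alpha).name)
```

In this toy setup the chosen instance changes with `alpha`, which illustrates the tension the abstract describes: weighting cache affinity more heavily sends the request to the loaded instance, while weighting load more heavily sends it to the instance with a cold cache.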