Key/Value
Related Articles from SNS
Show HN: Keybench – Scriptable, extensible performance tool for key value stores
guycipher/keybench: A scriptable, extensible performance tool for sorted key value stores.
HKVM-RAG: Key-Value-Separated Hypergraph Evidence Organization for Multi-Hop RAG
arXiv:2606.07218v1 Announce Type: new Abstract: Multi-hop RAG poses a data-engineering problem beyond passage matching: under fixed retrieval budgets, a system must organize retrieved text into evidence units that expose answer chains. Dense retrievers score passages independently, while graph-based memories make associations explicit but often rely on pairwise or entity-centered keys that fragment multi-hop evidence. We present HKVM-RAG, a key-value-separated evidence-organization layer.
Q-Delta: Beyond Key-Value Associative State Evolution
Announce Type: new Abstract: Linear attention reformulates sequence modeling as recurrent state evolution, enabling efficient linear-time inference. Under the key-value associative paradigm, existing approaches restrict the role of the query to the readout operation, decoupling it from state evolution. We show that query-conditioned state readout induces a structured value prediction over accumulated memory that complements key-based retrieval.
HACK++: Towards More Effective Head-Aware Key-Value Compression for Efficient Visual Autoregressive Modeling
Announce Type: new Abstract: Visual Autoregressive (VAR) models adopt a next-scale prediction paradigm, offering high-quality generation with substantially fewer decoding steps. However, existing VAR models suffer from significant attention complexity and severe memory overhead due to the accumulation of key-value (KV) caches across scales. In this paper, we tackle this challenge by introducing KV cache compression into the next-scale paradigm.
Telangana land value rate revision takes effect: Key changes explained
The Telangana government has implemented revised land values for property registrations across the state from June 5, marking the first major revision in nearly a year. The new rates apply to both agricultural and non-agricultural properties. According to the government, the revision is aimed at narrowing the gap between officially notified registration values and prevailing market prices.
ART: Attention Run-time Termination for Efficient Large Language Model Decoding
arXiv:2606.00024v2 Announce Type: replace Abstract: Long-context decoding in Large Language Models (LLMs) is constrained by the cost of accessing and processing the Key-Value (KV) cache. Despite the evidence that attention outputs depend jointly on keys and values, most existing KV management methods rely on key-only pruning, as incorporating values incurs prohibitive additional overhead. In this paper, we propose Attention Run-time Termination (ART), a lightweight run-time mechanism that...
Do Transformers Need Three Projections? Systematic Study of QKV Variants
arXiv:2606.04032v2 Announce Type: replace Abstract: Transformers have become the standard solution for various AI tasks, with the query, key, and value (QKV) attention formulation playing a central role. However, the individual contribution of these three projections and the impact of omitting some remain poorly understood. We systematically evaluate three projection sharing constraints: a) Q-K=V (shared key-value), b) Q=K-V (shared query-key), and c) Q=K=V (single projection).
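As a rough illustration of the three sharing constraints named in the abstract, the sketch below ties projection matrices together before a plain single-head scaled dot-product attention. The function names, the mode strings, and the choice of which matrix is reused when two projections are tied are all assumptions made for illustration; the abstract does not specify how the paper implements the sharing.

```python
import math

def matmul(a, b):
    # Plain list-of-lists matrix product: (n x m) @ (m x p) -> (n x p).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def tie_projections(w_q, w_k, w_v, mode):
    # Hypothetical helper: returns (w_q, w_k, w_v) after applying a sharing constraint.
    if mode == "shared_key_value":   # keys and values use one projection
        return w_q, w_k, w_k
    if mode == "shared_query_key":   # queries and keys use one projection
        return w_q, w_q, w_v
    if mode == "single":             # one projection for all three roles
        return w_q, w_q, w_q
    return w_q, w_k, w_v             # unconstrained baseline

def attention(x, w_q, w_k, w_v):
    # Single-head scaled dot-product attention over token embeddings x.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    weights = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
    return matmul(weights, v)

# Toy usage: two tokens, two dimensions, identity weights, single projection.
identity = [[1.0, 0.0], [0.0, 1.0]]
w_q, w_k, w_v = tie_projections(identity, identity, identity, "single")
out = attention([[1.0, 0.0], [0.0, 1.0]], w_q, w_k, w_v)
```

Tying projections this way reduces the number of distinct weight matrices per head from three to two or one, which is the trade-off the study evaluates.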
Don't be so Stief! Learning KV Cache low-rank approximation over the Stiefel manifold
Announce Type: replace Abstract: Key-value (KV) caching enables fast autoregressive decoding but at long contexts becomes a dominant bottleneck in High Bandwidth Memory (HBM) capacity and bandwidth. A common mitigation is to compress cached keys and values by projecting per-head matrices to a lower rank, storing only the projections in the HBM. However, existing post-training approaches typically fit these projections using SVD-style proxy objectives, which may poorly reflect end-to-end...
OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference
Announce Type: replace Abstract: Large language models (LLMs) with extended context windows enable powerful applications but impose significant memory overhead, as caching all key-value (KV) states scales linearly with sequence length and batch size. Existing cache eviction methods address this by exploiting attention sparsity, yet they typically rank tokens heuristically using accumulated attention weights without considering their true impact on attention outputs. We propose Optimal Brain...
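For context, the heuristic baseline the abstract describes (ranking cached tokens by accumulated attention weight) can be sketched as follows. This is not OBCache's method, which the truncated abstract does not detail; the function name, the `budget` parameter, and the toy attention matrix are assumptions made for illustration.

```python
def evict_by_accumulated_attention(attn, budget):
    """Return the indices of cached tokens to keep.

    attn[t][j] is the attention weight that decoding step t placed on
    cached token j. Tokens are ranked by their summed weight across steps,
    and only the top `budget` tokens are retained.
    """
    num_tokens = len(attn[0])
    # Accumulate the attention each cached token received over all steps.
    scores = [sum(row[j] for row in attn) for j in range(num_tokens)]
    # Rank token indices from most to least attended.
    ranked = sorted(range(num_tokens), key=lambda j: scores[j], reverse=True)
    # Keep the top `budget` tokens, restored to their original order.
    return sorted(ranked[:budget])

# Toy usage: two decoding steps over three cached tokens, keep two.
toy_attn = [[0.7, 0.1, 0.2],
            [0.6, 0.1, 0.3]]
kept = evict_by_accumulated_attention(toy_attn, budget=2)
```

Because the ranking uses only attention weights, it ignores the value vectors and therefore the actual effect of dropping a token on the attention output, which is the limitation the abstract points to.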