Kernel Affine Hull Machines as Compute-Efficient Encoders for Frozen Semantic Spaces
arXiv:2605.02950v2 Announce Type: replace
Abstract: Transformer-based semantic encoders are effective for retrieval, but in many deployments the recurring bottleneck is online query encoding rather than offline corpus indexing. This paper studies whether, once a strong teacher representation space and corpus index are fixed, repeated neural query encoding can be replaced by a substantially lighter and analytically explicit estimator. We formulate fixed-teacher lexical-to-semantic encoding as a conditional-mean estimation problem in which the target semantic vector is represented as a noisy mixture of semantic prototypes weighted by posterior cluster probabilities. Kernel Affine Hull Machine (KAHM) geometry is used to estimate these posterior weights from inexpensive lexical features in an explicitly identified RKHS hypothesis space, and the semantic prototypes are refined by normalized least-mean-squares updates from noisy teacher embeddings. This yields a backpropagation-free query-side encoder together with an end-to-end error decomposition into posterior-approximation, finite-sample/generalization, and teacher-noise terms. We instantiate the approach on a controlled Austrian-law retrieval benchmark with 5,000 test queries, 84 candidate laws, and 10,762 aligned retrieval units, using law-specific encoders into a frozen Mixedbread embedding space. Among evaluation-matched learned adapters, KAHM achieves the strongest teacher-space reconstruction and the best rank-sensitive retrieval performance at all evaluated cutoffs. At k=20, it obtains MRR@20 = 0.504, Hit@20 = 0.694, and Top-1 Accuracy = 0.411, while reducing online per-query time by 8.53× relative to direct transformer query encoding in the reported CPU setting. The results support KAHMs as compute-efficient encoders for supervised fixed-representation deployment regimes.
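To make the prototype-mixture idea in the abstract concrete, here is a minimal sketch, not the paper's implementation. It assumes the posterior cluster weights are already available (here they are drawn from a Dirichlet distribution as a stand-in for the KAHM estimate, which the abstract does not specify in detail) and shows how a generic normalized least-mean-squares (NLMS) rule could refine prototypes from noisy teacher embeddings without backpropagation. The names `encode`, `nlms_step`, `mu`, and `eps`, along with all sizes and the noise level, are hypothetical choices for illustration.

```python
import numpy as np

def encode(weights, prototypes):
    """Conditional-mean estimate: posterior-weighted mixture of prototypes."""
    return weights @ prototypes

def nlms_step(prototypes, weights, target, mu=0.5, eps=1e-8):
    """One generic NLMS update of the prototype matrix (illustrative only)."""
    error = target - encode(weights, prototypes)   # residual in teacher space
    step = mu / (eps + weights @ weights)          # normalize by input energy
    return prototypes + step * np.outer(weights, error)

rng = np.random.default_rng(0)
n_clusters, dim, n_steps = 4, 8, 2000              # hypothetical sizes

# Synthetic "true" prototypes standing in for the frozen teacher space.
true_prototypes = rng.normal(size=(n_clusters, dim))
prototypes = np.zeros((n_clusters, dim))
initial_err = np.linalg.norm(prototypes - true_prototypes)

for _ in range(n_steps):
    # Stand-in posterior weights (assumption: not the KAHM estimator).
    w = rng.dirichlet(np.ones(n_clusters))
    # Noisy teacher embedding: prototype mixture plus small Gaussian noise.
    teacher = w @ true_prototypes + 0.01 * rng.normal(size=dim)
    prototypes = nlms_step(prototypes, w, teacher)

final_err = np.linalg.norm(prototypes - true_prototypes)
print("prototype error before:", initial_err, "after:", final_err)
```

In this toy setting the online query path reduces to one weight estimate plus a single matrix-vector product in `encode`, which is the kind of cost profile the abstract contrasts with repeated transformer query encoding.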