the Theoretical Limitations of Embedding
No mentions found
This entity hasn't been tracked yet, or Iris is still building its knowledge base.
Related Articles from SNS
On the Theoretical Limitations of Embedding-based Link Prediction
Announce Type: replace Abstract: Neural networks often map low-dimensional embeddings to high-dimensional output spaces. Usually, the output layer is linear, which can create a "rank bottleneck" that limits the functions a model can represent.
$\mathbb{R}^{2k}$ is Theoretically Large Enough for Embedding-based Top-$k$ Retrieval
Announce Type: replace Abstract: This paper studies the Minimal Embeddable Dimension (MED): the least dimension in which there exists a configuration of $m$ object vectors so that every subset of size at most $k$ is exactly retrieved by score comparison. Our result shows MED is $\Theta(k)$, independent of $m$, for inner product, Euclidean distance, and cosine similarity. We then consider Robust MED (RMED), where all vectors are unit normed and an $\epsilon$ gap of scores is required.
Generative Quantum Data Embeddings for Supervised Learning
Announce Type: cross Abstract: Many practically relevant applications of quantum machine learning involve classical data, for which performance depends critically on how inputs are embedded into quantum states. Yet the use of a fixed embedding circuit ansatz remains standard practice.
ColBERTSaR: Sparsified ColBERT Index via Product Quantization
Announce Type: new Abstract: While ColBERT is an effective neural retrieval architecture, it requires a heavy index structure to support candidate set retrieval based on approximated token embeddings, gathering and decompressing document token embeddings, and applying the MaxSim operation. Indexes in PLAID and similar ColBERT implementations require five to ten times the disk storage of the original raw text, which limits their scalability. Furthermore, prior work has identified that the...
GJDNet: Robust Graph Neural Networks via Joint Disentangled Learning Against Adversarial Attacks
arXiv:2606.01560v1 Announce Type: new Abstract: Graph Neural Networks (GNNs) are vulnerable to adversarial attacks, which inherently invert connectivity patterns by introducing disassortative edges in assortative graphs and assortative edges in disassortative graphs. This structural inversion creates structure-feature mismatches that disrupt neighborhood aggregation across different graph types.
Beyond Sinusoids: A Morlet Wavelet Framework for Transformer Positional Encoding
arXiv:2606.01258v1 Announce Type: new Abstract: Standard positional encodings for transformers - sinusoidal and rotary (RoPE) - treat every position as equally local: they encode where a token is, but not how far its positional influence should extend. We propose that the Morlet wavelet, which simultaneously minimises uncertainty in position and frequency, is the natural basis for positional encoding, and introduce Morlet Positional Encoding (MoPE): each embedding dimension learns its own...
SpikeWFM: Spiking-Aided Wireless Foundation Model for Robust Channel Prediction
Announce Type: cross Abstract: This paper proposes SpikeWFM, a novel hybrid architecture that integrates spiking neural networks (SNNs) with conventional artificial neural network (ANN)-based transformers for wireless foundation models (WFMs). Inspired by the noise-robust and energy-efficient information processing in the human brain, SpikeWFM aims to enhance the resilience of WFMs against noise and interference while maintaining strong generalization capabilities across diverse wireless...
Beyond Semantic Understanding: Preserving Collaborative Frequency Components in LLM-based Recommendation
arXiv:2508.10312v2 Announce Type: replace Abstract: Recommender systems in concert with Large Language Models (LLMs) present promising avenues for generating semantically-informed recommendations. However, LLM-based recommenders exhibit a tendency to overemphasize semantic correlations within users' interaction history. When taking pretrained collaborative ID embeddings as input, LLM-based recommenders progressively weaken the inherent collaborative signals as the embeddings propagate...
InfoShield: Privacy-Preserving Speech Representations for Mental Health Screening via Information-Theoretic Optimization
arXiv:2606.05561v1 Announce Type: new Abstract: Speech-based mental health screening offers scalable depression detection, yet clinical deployment faces a significant barrier: users' privacy concerns about demographic information exposure. Current techniques struggle to resolve this conflict. Adversarial training often fails against unseen threats, whereas Differential Privacy tends to compromise diagnostic performance by injecting noise across all features.
SteerFace: Debiasing Synthetic Face Generation via Adaptive Residue Perturbation
arXiv:2605.30894v1 Announce Type: new Abstract: The shortage of legally compliant data for face recognition training has sparked growing interest in using synthetic data as an alternative. While recent diffusion-based methods enable the generation of photorealistic face images with strong identity adherence and data diversity, their downstream recognition performance still exhibits a significant synthetic-real gap. This paper identifies visual tendency as a previously underexplored...