Transformative
Related Articles from SNS
Inverting Data Transformations via Diffusion Sampling
arXiv:2602.08267v2 Announce Type: replace Abstract: We study the problem of transformation inversion on general Lie groups: a datum is transformed by an unknown group element, and the goal is to recover an inverse transformation that maps it back to the original data distribution. Such unknown transformations arise widely in machine learning and scientific modeling, where they can significantly distort observations. We take a probabilistic view and model the posterior over transformations as...
ChWDTA: Channel-wise Wavelet-Domain Transformer Attention and Entropy Modeling for Learned Image Compression
arXiv:2606.00111v1 Announce Type: cross Abstract: State-of-the-art learned image compression (LIC) schemes are increasingly based on hybrid CNN-transformer architectures. To further improve rate-distortion performance, we introduce channel-wise wavelet transforms into both the transformer and entropy-coding components. First, we propose a channel-wise wavelet-domain transformer attention (ChWDTA) mechanism.
Discovering Interpretable Algorithms by Decompiling Transformers to RASP
arXiv:2602.08857v2 Announce Type: replace Abstract: Recent work has shown that the computations of Transformers can be simulated in the RASP family of programming languages. These findings have enabled improved understanding of the expressive capacity and generalization abilities of Transformers. In particular, Transformers have been suggested to length-generalize exactly on problems that have simple RASP programs.
Fast Transformer Inference on ARM-Based HMPSoCs
arXiv:2606.02836v1 Announce Type: new Abstract: Transformer models have set new performance standards for machine learning (ML) tasks. However, their resource-intensive deployment on resource-constrained edge devices for cloud-free, on-chip transformer inference remains challenging. The ARM Compute Library (ARM-CL) framework provides low-latency CNN inference on ARM-based edge devices but lacks support for transformer inference.
Semi-local transformation for compressible wall turbulence via elliptic equations
arXiv:2606.09322v1 Announce Type: new Abstract: Compressibility and wall heat transfer change the inner scaling of wall turbulence through the mean density and viscosity fields. Existing semi-local transformations usually act on a wall-normal profile after the profile has been chosen. Here the transformed coordinate and transformed velocity are instead defined by elliptic equations before wall-normal profiles are extracted.
Quantum Kravchuk Transform using $\mathfrak{su}(2)$ fast-forwarding
arXiv:2606.08443v1 Announce Type: cross Abstract: We present a quantum algorithm for the Kravchuk transform that scales logarithmically in both the dimension and the inverse of the error parameter. The quantum Kravchuk transform maps computational basis states to states with amplitudes proportional to Kravchuk functions. We achieve this by combining two key techniques: the structural relationship between the Kravchuk transform and the Lie algebra $\mathfrak{su}(2)$, and a recent...
Platonic Transformers: A Solid Choice For Equivariance
arXiv:2510.03511v3 Announce Type: replace Abstract: While widespread, Transformers lack inductive biases for geometric symmetries common in science and computer vision. Existing equivariant methods often sacrifice the efficiency and flexibility that make Transformers so effective through complex, computationally intensive designs. We introduce the Platonic Transformer to resolve this trade-off.
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation
Announce Type: replace Abstract: As the foundational architecture of modern machine learning, Transformers have driven remarkable progress across diverse AI domains. Despite their transformative impact, a persistent challenge across various Transformers is Attention Sink (AS), in which a disproportionate amount of attention is focused on a small subset of specific yet uninformative tokens. AS complicates interpretability, significantly affecting the training and inference dynamics, and...
Fixed Universal Transformers
arXiv:2605.31423v1 Announce Type: new Abstract: We introduce universal transformers: fixed transformers that can simulate any transformer in a given class via a suitable input embedding. Analogous to a universal Turing machine, the input embedding encodes a description of the target model while all internal parameters remain fixed. We provide explicit sparse constructions achieving universality when the embedding dimension is sufficiently large, and further show that universality is...
Déjà View: Looping Transformers for Multi-View 3D Reconstruction
arXiv:2605.30215v2 Announce Type: replace Abstract: Recent feed-forward 3D reconstruction transformers have scaled to over a billion parameters, following the broader trend of increasing model capacity in computer vision. Yet emerging evidence suggests that contiguous transformer layers often behave like repeated applications of similar operations, and multi-view reconstruction transformers refine their predictions progressively across decoder depth. We posit that model depth partially buys...