DiffuSent
Related Articles from SNS
Diffusing in the Right Space: A Systematic Study of Latent Diffusability
arXiv:2606.03578v1 Announce Type: new Abstract: Latent diffusion models leverage visual tokenizers to compress images into latent spaces for efficient generative modeling. However, better reconstruction quality of a tokenizer does not necessarily translate into better generation quality, suggesting that latent representations should be evaluated not only by fidelity but also by their diffusability. Recent studies have proposed diverse explanations for diffusion-friendly latent spaces,...
Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies
arXiv:2508.20072v4 Announce Type: replace Abstract: Vision-Language-Action (VLA) models adapt large vision-language backbones to map images and instructions into robot actions. However, prevailing VLAs either generate actions autoregressively in a fixed left-to-right order with poor performance or attach separate diffusion heads outside the backbone that fragment information pathways and hinder unified, scalable architectures. Instead, we present Discrete Diffusion VLA that discretizes...
PTL-Diffusion: Manifold-Aware Diffusion with Periodic Terminal Laws
arXiv:2606.09816v1 Announce Type: new Abstract: Standard diffusion models typically use a single time-homogeneous Gaussian terminal distribution as the reference law for generation. While this choice is analytically convenient and empirically powerful, it provides little explicit structure for data concentrated near low-dimensional manifolds, where different regions of the data distribution may correspond to distinct local geometric or semantic factors. As a result, the reverse model must...
Bokeh Diffusion: Defocus Blur Control in Text-to-Image Diffusion Models
arXiv:2503.08434v5 Announce Type: replace Abstract: Recent advances in large-scale text-to-image models have revolutionized creative fields by generating visually captivating outputs from textual prompts; however, while traditional photography offers precise control over camera settings to shape visual aesthetics - such as depth-of-field via aperture - current diffusion models typically rely on prompt engineering to mimic such effects. This approach often results in crude approximations and...
Wavelet Fourier Diffuser: Frequency-Aware Diffusion Model for Reinforcement Learning
Announce Type: replace Abstract: Diffusion probability models have shown significant promise in offline reinforcement learning by directly modeling trajectory sequences. However, existing approaches primarily focus on time-domain features while overlooking frequency-domain features, leading to frequency shift and degraded performance according to our observation. In this paper, we investigate the RL problem from a new perspective of the frequency domain.
Set-Supervised Diffusion Policy: Learning Action-Chunking Diffusion through Corrections
arXiv:2606.01865v1 Announce Type: new Abstract: Diffusion policies have recently emerged as a powerful framework for robotic manipulation. However, like other behavior cloning methods, they remain vulnerable to distributional shift, often requiring human-in-the-loop interventions to correct failures during deployment. These interactions naturally provide paired supervision in the form of the robot's undesired actions and the human teacher's corrective actions.
Latent Diffusion Policy: Shaping Latent Spaces for Diffusion-Based Robotic Manipulation
Announce Type: new Abstract: Diffusion-based visuomotor policies operating directly in raw action spaces conflate scene comprehension with trajectory generation within a single denoising process. The resulting velocity field must simultaneously encode scene information and generate precise trajectories, increasing learning complexity and limiting performance on tasks demanding precise temporal coordination across multiple arms. To simplify this joint learning problem, we introduce Latent...
Transformed Diffusion-Wave fPINNs: Enhancing Computing Efficiency for PINNs Solving Time-Fractional Diffusion-Wave Equations
arXiv:2506.11518v2 Announce Type: replace Abstract: We propose transformed Diffusion-Wave fractional Physics-Informed Neural Networks (tDWfPINNs) for efficiently solving time-fractional diffusion-wave equations with fractional order $\alpha\in(1,2)$. Conventional numerical methods for these equations often compromise the mesh-free advantage of Physics-Informed Neural Networks (PINNs) or impose high computational costs when computing fractional derivatives. The proposed method avoids...
DiffuSent: Towards a Unified Diffusion Framework for Aspect-Based Sentiment Analysis
arXiv:2606.01323v1 Announce Type: new Abstract: Aspect-Based Sentiment Analysis (ABSA) encompasses seven distinct subtasks, each focusing on different extracted elements. Despite the proven success of generative models in unified aspect sentiment analysis, existing approaches often rely on auto-regressive token-by-token generation without grasping the aspect and opinion terms as a whole, resulting in boundary insensitivity, particularly in the context of multi-word aspect and...
Ultra Diffusion Poser: Diffusion-Based Human Motion Tracking From Sparse Inertial Sensors and Ranging-Based Between-Sensor Distances
arXiv:2606.02153v1 Announce Type: new Abstract: Methods using inertial measurement units (IMUs) provide a wearable alternative to camera-based motion capture. To mitigate drift from inertial signals, recent sparse inertial pose estimators integrate inter-sensor distances measured by ultra-wideband (UWB) ranging. So far, UWB distances have only been used as an additional input feature, ignoring the physical constraints they impose on sensor positions.