Patch-Based

Related Articles from SNS

PatchScene: Patch-based Voxel Diffusion for Large-Scale Scene Completion

Announce Type: new Abstract: We propose PatchScene, a novel diffusion-based framework for large-scale LiDAR scene completion. Unlike existing methods that rely on global latent representations or dense voxel grids, PatchScene adopts a patch-based voxel diffusion paradigm that explicitly generates fine-grained geometry within localized 3D regions. To ensure coherent reconstruction at both spatial and temporal scales, we introduce a confidence-guided spatio-temporal fusion mechanism that...

arXiv CS 7d ago

Spatial Artifact Coherence Determines Codec Robustness in Patch-Based rPPG

arXiv:2606.04198v1 Announce Type: new Abstract: Remote photoplethysmography (rPPG) achieves low heart-rate error on uncompressed benchmarks yet is deployed over compressed video channels in telehealth, neonatal ICU, and driver fatigue applications. No prior work identifies the physical quantity determining when spatial decomposition outperforms global-projection methods under codec compression. We propose Spatial Artifact Coherence (SAC), defined as the ratio of off-diagonal to diagonal...

arXiv CS 6d ago

Smooth Hard-Thresholding for Singular Values with Stein's Unbiased Risk Estimate

Announce Type: cross Abstract: Low-rank matrix denoising is a central primitive in patch-based image restoration and many other inverse problems. Classical SVD-based image denoising methods often choose a truncation rank by matching residual singular-value energy with an estimated noise energy, but this rule is not a finite-sample risk principle because a fitted low-rank approximation inevitably absorbs part of the noise. This paper develops a mathematically rigorous alternative based on...

arXiv CS 2d ago

GITCO: Gated Inference-Time Context Optimization in TSFMs

arXiv:2606.05332v1 Announce Type: new Abstract: Patch-based Time Series Foundation Models (TSFMs) suffer from context poisoning: structurally anomalous patches capture disproportionate attention and silently degrade zero-shot forecast quality. We propose improving TSFM accuracy at inference time by optimizing the input context rather than modifying model weights. We present GITCO (Gated Inference-Time Context Optimization), a lightweight three-component framework: Gate, Router, and Critic...

arXiv CS 5d ago

Entity-Centric World Models: Interaction-Aware Masking for Causal Video Prediction

Announce Type: replace Abstract: Learning predictive world models from unlabelled video is a foundational challenge in artificial intelligence. While Joint Embedding Predictive Architectures (JEPA) have set new benchmarks in semantic classification, they often remain physics-blind, failing to capture the causal dynamics necessary for downstream reasoning. We hypothesize that this stems from standard patch-based masking strategies, which prioritize visual texture over rare but informative...

arXiv CS 1d ago

MSTN: A Lightweight and Fast Model for General TimeSeries Analysis

arXiv:2511.20577v5 Announce Type: replace Abstract: Real-world time series often exhibit strong non-stationarity, complex nonlinear dynamics, and behavior expressed across multiple temporal scales, from rapid local fluctuations to slow-evolving long-range trends. However, many contemporary architectures impose rigid, fixed-scale structural priors, such as patch-based tokenization, predefined receptive fields, or frozen backbone encoders, which can over-regularize temporal dynamics and limit...

arXiv CS 5d ago

Efficient and Training-Free Single-Image Diffusion Models

arXiv:2606.04299v1 Announce Type: new Abstract: We consider the problem of generating images whose internal structure -- defined by the distribution of patches across multiple scales -- matches that of a single reference image. Recent approaches address this problem by training a diffusion model on a single image. But even in this setting, training is computationally expensive and requires hours of optimization.

arXiv CS 6d ago

Efficient and Training-Free Single-Image Diffusion Models

Computer Science > Computer Vision and Pattern Recognition [Submitted on 3 Jun 2026] Abstract: We consider the problem of generating images whose internal structure -- defined by the distribution of patches across multiple scales -- matches that of a single reference image. Recent approaches address this problem by training a diffusion model on a single image.

Hacker News 3d ago

TLDR: Compressing Audio Tokens for Efficient Autoregressive Text-to-Speech

arXiv:2606.09019v1 Announce Type: new Abstract: Codec-based autoregressive (AR) speech language models have achieved strong text-to-speech (TTS) quality by modeling speech as sequences of discrete audio tokens with large pretrained backbones. However, this token-level formulation creates a structural efficiency bottleneck: speech-token sequences are much longer than text sequences, requiring the AR backbone to perform causal computation at every token position and maintain a KV cache that...

arXiv CS 1d ago

Unmixing ATR-μFTIR spectroscopic images of cross-sections of historical oil paintings

arXiv:2603.06673v2 Announce Type: replace Abstract: Spectroscopic imaging (SI) has become central to heritage science because it enables non-invasive, spatially resolved characterisation of materials in artefacts. In particular, attenuated total reflection Fourier transform infrared microscopy (ATR-μFTIR) is widely used to analyse painting cross-sections, where a spectrum is recorded at each pixel to form a hyperspectral image (HSI).

arXiv CS 2d ago