Result Visualization Module

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Intrinsic Population Dynamics are a Neuronal Substrate for Visual Attention

Perception results from a dynamic interplay between the feedforward processing of sensory stimuli and intrinsic neural activity, which is often dismissed as noise. To tailor perceptual processes to the organism's current needs on a continuous, moment-to-moment basis, intrinsic dynamics - rather than just being noise - have been suggested to reflect prior expectations, task demands, and attentional focus. Here, we identify a novel signature of attentive state in which intrinsic, collective...

bioRxiv 6d ago

Training-Free Multi-Concept LoRA Composition with Prompt-Aware Weighting

arXiv:2606.03792v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) successfully enables personalization in text-to-image generation by adapting pre-trained diffusion models to specific visual concepts and styles. However, extending such models to multi-concept customization remains challenging. Naively combining multiple LoRA weights or their outputs often leads to interference among concepts, resulting in degraded visual quality and reduced fidelity to the reference images of...

arXiv CS 7d ago

OpenCompass: A Universal Evaluation Platform for Large Language Models

arXiv:2605.19276v3 Announce Type: replace Abstract: In recent years, the field of artificial intelligence has undergone a paradigm shift from task-specific small-scale models to general-purpose large language models (LLMs). With the rapid iteration of LLMs, objective, quantitative, and comprehensive evaluation of their capabilities has become a critical link in advancing technological development. Currently, the mainstream static benchmark dataset-based evaluation methods face challenges...

arXiv CS 1d ago

SpongeBob: Sync-Aware Harmonious Audio-Visual Generative Editing

arXiv:2605.25193v2 Announce Type: replace Abstract: Visual and acoustic events in the physical world are inherently coupled, yet existing video editing methods typically adopt decoupled pipelines, lacking bidirectional modality interaction. This results in two key limitations: (i) audio-visual desynchronization and (ii) contextual conflicts between generated audio and preserved content. To address these, we propose SpongeBob, the first end-to-end audio-visual joint editing framework...

arXiv CS 9d ago

CARES: Context-Aware Resolution Selector for VLMs

arXiv:2510.19496v3 Announce Type: replace Abstract: Large vision-language models (VLMs) commonly process images at native or high resolution to remain effective across tasks. This inflates visual tokens, often to 97-99% of total tokens, resulting in high compute and latency, even when low-resolution images would suffice. We introduce CARES, a Context-Aware Resolution...

arXiv CS 8d ago

Exploiting Semantic and Pixel Representations for Ultra-Low Bitrate Image Compression

Announce Type: new Abstract: Most existing extreme compression methods fail to achieve an optimal rate-distortion-perception trade-off, as they typically prioritize perceptual fidelity and visual realism over pixel-level accuracy. Consequently, the resulting reconstructions often deviate noticeably from the originals. Ultra-low bitrate image compression is therefore crucial, not only for producing extremely compact representations but also for ensuring that reconstructed images remain semantically coherent...

arXiv CS 8d ago

Gooey: A GPU-accelerated UI framework for Zig

A GPU-accelerated UI framework for Zig, targeting macOS (Metal), Linux (Vulkan/Wayland), and Browser (WASM/WebGPU). Early Development: API is evolving. Example app built with Gooey: chat-zig, an Anthropic Claude client using the Zig 0.16 std.

Hacker News 7d ago

UniVerse: A Unified Modulation Framework for Segmentation-Free, Disentangled Multi-Concept Personalization

Announce Type: replace Abstract: Personalized visual understanding has advanced significantly, yet existing approaches struggle to localize and extract specific concepts when input images contain multiple objects. Many prior methods rely heavily on segmentation-based supervision or exhibit poor compositional generalization, limiting their ability to accurately disentangle and manipulate individual concepts. In this work, we propose UniVerse, a Unified Modulation Framework for...

arXiv CS 7d ago

Vision Inference Former: Sustaining Visual Consistency in Multimodal Large Language Models

arXiv:2605.18160v2 Announce Type: replace Abstract: In recent years, multimodal large language models (MLLMs) have achieved remarkable progress, primarily attributed to effective paradigms for integrating visual and textual information. The dominant connector-based paradigm projects visual features into textual sequence, enabling unified multimodal alignment and reasoning within a generative architecture. However, our experiments reveal two key limitations: (1) Although visual information...

arXiv CS 7d ago

PictSure: Pretraining Embeddings Matters for In-Context Learning Image Classifiers

Announce Type: replace Abstract: Building image classification models remains cumbersome in data-scarce domains, where collecting large labeled datasets is impractical. In-context learning (ICL) is a promising paradigm for few-shot image classification (FSIC), but prior work has underexplored the relative importance of encoder pretraining versus fusion-layer training data. We present PictSure, a vision-only ICL family of models that demonstrates the potential of easy-to-use fusion...

arXiv CS 9d ago