GenEval
Related Articles from SNS
1-Bit Bonsai Image 4B Image Generation for Local Devices
Introducing 1-bit and Ternary Bonsai Image 4B: Image Generation for Local Devices. Today we’re releasing Bonsai Image 4B, a family of compact image-generation models designed to run high-quality diffusion inference on local hardware, from laptops to phones. Bonsai Image 4B comes in two variants:
- 1-bit Bonsai Image 4B uses binary {−1, +1} transformer weights with an FP16 group-wise scaling factor, giving 1.125 effective bits per weight. It targets maximum compression and is the right fit...
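The 1.125 figure quoted above can be checked with a little arithmetic. The sketch below is illustrative only: the excerpt does not state the group size, so 128 weights per group is an assumption, chosen because it is the size at which one shared 16-bit scale adds exactly 0.125 bits per weight. The function name is hypothetical and not part of any Bonsai release.

```python
def effective_bits_per_weight(weight_bits: float, scale_bits: int, group_size: int) -> float:
    """Average storage per weight when each group of weights shares one scale factor."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    # Each weight costs weight_bits, plus its share of the group's single scale.
    return weight_bits + scale_bits / group_size


# Assumption: 128 weights share one FP16 (16-bit) scale; the excerpt does not give this number.
ASSUMED_GROUP_SIZE = 128

one_bit_variant = effective_bits_per_weight(weight_bits=1, scale_bits=16, group_size=ASSUMED_GROUP_SIZE)
print(one_bit_variant)  # 1.125
```

Under this assumption the scale overhead is 16 / 128 = 0.125 bits, which matches the stated total. A different group size would give a different overhead.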
OctoT2I: A Self-Evolving Agentic Text-to-Image Router
arXiv:2606.01803v1 Announce Type: new Abstract: The explosive growth of Text-to-Image (T2I) models, from large-scale versions to lightweight, real-time ones, now faces diminishing marginal returns from single-model scaling. Agentic T2I methods emerged to alleviate this bottleneck by using multiple models. However, existing agentic T2I methods suffer from three key challenges: reliance on expensive handcrafted priors or human annotations, rigid single-path decision mechanisms, and a neglect...
Channel-wise Vector Quantization
Announce Type: replace Abstract: We present Channel-wise Vector Quantization (CVQ), a novel image tokenization paradigm that replaces patch-wise tokens with channel-wise tokens. Unlike conventional vector quantization, which assigns a discrete token to each patch feature vector, CVQ quantizes each channel of the feature map. This formulation represents an image as discrete levels of visual details, rather than as a grid of spatial patches.
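The contrast drawn in this abstract can be illustrated with a toy nearest-neighbour quantizer. This is a minimal sketch of one reading of the abstract, not the paper's method: the feature-map layout, both codebooks, and every function name are hypothetical. It assumes that a "channel token" comes from matching a channel's flattened spatial map against a codebook, and a "patch token" from matching the per-position channel vector.

```python
from typing import List, Sequence


def nearest_code(vec: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codebook entry closest to vec in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])),
    )


def patch_wise_tokens(feature_map: List[List[float]], codebook: Sequence[Sequence[float]]) -> List[int]:
    """One token per spatial position: quantize the vector of all channel values there."""
    num_positions = len(feature_map[0])
    return [
        nearest_code([channel[p] for channel in feature_map], codebook)
        for p in range(num_positions)
    ]


def channel_wise_tokens(feature_map: List[List[float]], codebook: Sequence[Sequence[float]]) -> List[int]:
    """One token per channel: quantize that channel's flattened spatial map."""
    return [nearest_code(channel, codebook) for channel in feature_map]


# Toy feature map, feature_map[c][p]: 2 channels, 3 flattened spatial positions.
feature_map = [[0.1, 0.9, 0.2], [0.8, 0.1, 0.9]]
patch_codebook = [[0.0, 1.0], [1.0, 0.0]]              # entries have one value per channel
channel_codebook = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]  # entries have one value per position

print(patch_wise_tokens(feature_map, patch_codebook))      # [0, 1, 0]
print(channel_wise_tokens(feature_map, channel_codebook))  # [0, 1]
```

In this toy setup the patch-wise sequence has one token per spatial position, while the channel-wise sequence has one token per channel.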
Lookahead Sample Reward Guidance for Test-Time Scaling of Diffusion Models
arXiv:2602.03211v2 Announce Type: replace Abstract: Diffusion models have demonstrated strong generative performance; however, generated samples often fail to fully align with human intent. This paper studies an efficient test-time scaling method for sampling from regions with higher human-aligned reward values. Existing methods for computing the expected future reward (EFR) face important limitations: backward rollout incurs prohibitively high sampling costs, while Tweedie-based approaches,...
Drifting Preference Optimization for One-Step Generative Models
arXiv:2606.02521v2 Announce Type: replace Abstract: One-step text-to-image generators are attractive for deployment because they generate an image with a single forward pass, but preference finetuning them remains difficult: standard alignment methods often rely on policy likelihoods, denoising trajectories, differentiable reward gradients, or test-time optimization. We propose Drifting Preference Optimization (DrPO), an online preference-finetuning method for deterministic one-step...
OmniGen-AR: AutoRegressive Any-to-Image Generation
arXiv:2606.09156v1 Announce Type: new Abstract: Autoregressive (AR) models have demonstrated strong potential in visual generation, offering superior performance with simple architectures and optimization objectives. However, existing methods are typically limited to single-modality conditions, e.g., text, restricting their applicability in real-world scenarios that demand image synthesis from diverse controls. In this work, we present OmniGen-AR, a unified autoregressive framework for...
Optimizing Few-Step Generation with Adaptive Matching Distillation
arXiv:2602.07345v2 Announce Type: replace Abstract: Distribution Matching Distillation (DMD) is a powerful acceleration paradigm, yet its stability is often compromised in Forbidden Zones, regions where the real teacher provides unreliable guidance while the fake teacher exerts insufficient repulsive force. In this work, we propose a unified optimization framework that reinterprets prior art as implicit strategies to avoid these corrupted regions. Based on this insight, we introduce Adaptive...
CSFlow: Aligning Flow Matching with Human Contrast Sensitivity
arXiv:2606.08833v1 Announce Type: new Abstract: We introduce Contrast Sensitive Flow (CSFlow), a weighting scheme that connects the human eye's Contrast Sensitivity Function (CSF) to the iterative denoising steps of flow matching. Because real-world images concentrate signal at low spatial frequencies, these components reach high signal-to-noise ratio earlier during continuous diffusion than high-frequency components. When generating images with diffusion or flow matching models, this...