4B

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

1-Bit Bonsai Image 4B Image Generation for Local Devices

Introducing 1-bit and Ternary Bonsai Image 4B: Image Generation for Local Devices

Today we’re releasing Bonsai Image 4B, a family of compact image-generation models designed to run high-quality diffusion inference on local hardware: from laptops to phones. Bonsai Image 4B comes in two variants:

- 1-bit Bonsai Image 4B uses binary {−1, +1} transformer weights with an FP16 group-wise scaling factor, giving 1.125 effective bits per weight. It targets maximum compression and is the right fit...

Hacker News 10d ago
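
The 1.125 effective bits per weight quoted in the Bonsai Image 4B excerpt can be checked with a short calculation. The sketch below assumes each group of weights shares a single FP16 (16-bit) scale, so the scale's cost is spread evenly over the group. The excerpt does not state the group size; 128 is an assumption chosen because it reproduces the quoted figure, and the function names are hypothetical.

```python
def effective_bits_per_weight(weight_bits: float, scale_bits: int, group_size: int) -> float:
    """Storage cost per weight when each group of `group_size` weights shares one scale."""
    return weight_bits + scale_bits / group_size


def implied_group_size(effective_bits: float, weight_bits: float, scale_bits: int) -> float:
    """Invert the formula above: the group size that yields a given effective bit width."""
    return scale_bits / (effective_bits - weight_bits)


# Assumption: group size of 128 (not stated in the excerpt).
print(effective_bits_per_weight(weight_bits=1, scale_bits=16, group_size=128))

# Working backwards from the quoted 1.125 bits with 1-bit weights and an FP16 scale.
print(implied_group_size(effective_bits=1.125, weight_bits=1, scale_bits=16))
```

Under these assumptions the overhead of the scaling factor is 16/128 = 0.125 bits per weight, which is where the extra eighth of a bit comes from.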

Albania Is Not for Sale: Kushner's $4B Resort Triggers 'Flamingo Revolution'

Article URL: https://www.yacnews.com/albania-is-not-for-sale-kushners-4-billion-resort-triggers-flamingo-revolution-asset-freeze-and-an-eu-warning/
Comments URL: https://news.ycombinator.com/item?id=48461012
Points: 13
Comments: 0

Hacker News 1d ago

Mellum2 Technical Report

arXiv:2605.31268v1 Announce Type: new Abstract: We present Mellum 2, an open-weight 12B-parameter Mixture-of-Experts (MoE) language model with 2.5B active parameters per token. Mellum 2 is a general-purpose language model specialized in software engineering, spanning code generation and editing, debugging, multi-step reasoning, tool use and function calling, agentic coding, and conversational programming assistance, and it is the successor to the completion-focused 4B dense Mellum model. The...

arXiv CS 9d ago

From "Weak" Signals to Strong Models: Preference Delta Aggregation with LoRA Merging

arXiv:2606.00357v2 Announce Type: replace Abstract: Training strong large language models (LLMs) requires high-quality supervision, which is often scarce. Recent work shows that paired preference data from weak-weaker model pairs (e.g., Qwen3 4B over 1.7B), despite the limited quality of individual responses, can provide an effective supervision signal through relative quality deltas, which we term a "weak" signal. This motivates a key research question: can multiple "weak" signals be...

arXiv CS 2d ago

Diagnosing Failure Modes of Shared-State Collaboration in Resource-Constrained Visual Agents

arXiv:2605.31354v1 Announce Type: new Abstract: Modular visual reasoning systems increasingly rely on shared working memory for multi-step collaboration, yet the failure dynamics of intermediate state evolution in low-capacity regimes remain underexplored. We study failure modes of collaborative reasoning with weak learners (4B-8B models) through the lens of noise accumulation. We introduce CoSee, an auditing framework that formalizes the read-write-verify loop to trace information flow in...

arXiv CS 9d ago

Temporal Preference Concepts and their Functions in a Large Language Model

Announce Type: new Abstract: Large Language Models (LLMs) are increasingly being deployed to make decisions that require trading off near-term gains against long-term consequences, yet little is known about how they internally represent or resolve these tradeoffs. In this work, we causally localize an underlying subgraph for temporal preference in a distilled LLM (Qwen3-4B-Instruct-2507), identifying mid-to-upper-layer nodes through converging evidence from gradient-based attribution and...

arXiv CS 5d ago

Wall-OSS-0.5 Technical Report

Announce Type: new Abstract: Large-scale Vision-Language-Action (VLA) pretraining is increasingly adopted as the foundation for robot policies, yet the evidence for pretrained VLAs is almost invariably reported after task-specific fine-tuning. This leaves a foundational question unanswered: does VLA pretraining itself yield executable robot behavior, or does it merely furnish a better initialization for downstream policy learning? We present Wall-OSS-0.5, an open-source 4B VLA built upon a...

arXiv CS 9d ago

How Small Can You Go? LoRA Fine-Tuning 270M-8B Models for Merchant Information Extraction in Financial Transactions

arXiv:2606.08051v1 Announce Type: new Abstract: Financial transaction processing requires extracting structured merchant information from noisy, abbreviated bank transaction strings at scale. Our current production system, a LoRA-fine-tuned LLaMA 3.1-8B, achieves 96.95% F1 on this task, but deploying 8-billion-parameter models imposes prohibitive memory, latency, and cost constraints. To identify more efficient alternatives, we conduct a deployment-focused study of 24 model variants spanning...

arXiv CS 1d ago

Molecular glue degraders of HuR suppress BRAF-mutant colorectal cancer

Abstract: BRAF gain-of-function mutations, particularly BRAF(V600E), affect roughly 10% of all patients with colorectal cancer (CRC), and portend poor prognosis with limited therapeutic interventions. BRAF inhibitors such as encorafenib are ineffective due to MAPK pathway reactivation driven by BRAF dimerization. Combined inhibition of BRAF and EGFR, although approved therapies, results in short survival benefits and frequent treatment resistance and relapse [1,2,3].

Nature 19h ago

Process Reward Agents for Steering Knowledge-Intensive Reasoning

arXiv:2604.09482v2 Announce Type: replace Abstract: Reasoning in knowledge-intensive domains remains challenging as intermediate steps are often not locally verifiable: unlike math or code, evaluating step correctness may require synthesizing clues across large external knowledge sources. As a result, subtle errors can propagate through reasoning traces, potentially never to be detected. Prior work has proposed process reward models (PRMs), including retrieval-augmented variants, but these...

arXiv CS 8d ago