Large
Related Articles from SNS
Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
arXiv:2502.01576v2 Announce Type: replace Abstract: Multi-modal Large Language Models (MLLMs) excel in vision-language tasks but remain vulnerable to visual adversarial perturbations that can induce hallucinations, manipulate responses, or bypass safety mechanisms. Existing methods seek to mitigate these risks by applying constrained adversarial fine-tuning to CLIP vision encoders on ImageNet-scale data, ensuring their generalization ability is preserved. However, this limited adversarial...
SurgiQ: A Large-Scale Multi-Domain Benchmark for Evaluating Surgical Understanding in Large Language Models
arXiv:2606.08071v1 Announce Type: new Abstract: Reliable evaluation of large language models in surgery remains underdeveloped. Broad medical benchmarks test clinical knowledge, while surgery requires procedural reasoning, management trade-offs, negation handling, and selection among plausible operative decisions. We present SurgiQ, a text-only, source-grounded benchmark of 13,055 four-option multiple-choice questions spanning six surgical domains and four question formats: case-based,...
RTL-BenchLS: A Large-Scale Benchmark for RTL Reasoning and Generation with Large Language Models
arXiv:2606.08976v1 Announce Type: new Abstract: LLM-based RTL generation and reasoning is a promising direction for hardware design automation. High-quality benchmarks are critical infrastructure for tracking progress in this direction. However, existing RTL benchmarks face inherent limitations in both scale and task scope.
The Refusal-Compliance Tradeoff: A Large-Scale Safety Behavior Audit of Large Language Models
arXiv:2605.05427v2 Announce Type: replace Abstract: Refusal rates are a poor proxy for LLM safety, i.e., a model may over-refuse benign prompts while still complying with harmful ones. We audit both failure modes across 21 open-weight LLMs on four safety benchmarks (OR-Bench, XSTest, ToxiGen, BOLD), using a composition adjustment to isolate model sensitivity from dataset toxicity confounds. We report three findings.
LiMuon: Light and Fast Muon Optimizer for Large Models
arXiv:2509.14562v3, arXiv:2509.14562v4 Announce Type: replace Abstract: Large models recently are widely applied in machine learning, so efficient training of large models has received widespread attention. More recently, the useful Muon optimizer is specifically designed for matrix-structured parameters of large models. Although some works have begun to study the Muon optimizer, the existing Muon and its variants still suffer from high sample complexity or high memory for large models.
Human-Alignment, Calibration, and Activation Patterns in Large Language Model Uncertainty
Announce Type: new Abstract: Uncertainty Quantification is a large and growing subfield of large language model behavioral analysis. Primarily to recognize and combat hallucination, the field has largely focused on measuring and improving calibration, the accuracy of uncertainty judgments to task efficacy.
Unified sparse framework for large-scale material point method simulations
Announce Type: replace, replace-cross Abstract: The material point method (MPM) is a hybrid particle-grid method widely used for simulating large deformation with history-dependent behavior. Standard MPM often relies on a dense background grid, which can be highly inefficient when material occupies a small fraction of the computational domain. Such sparsity is common in many large-scale problems, from geophysical mass flows over large terrain domains to visual-computing applications.
Rotary GPU: Exploring Local Execution for Large MoE Models Under Limited VRAM
Performance [Submitted on 27 May 2026] Title: Rotary GPU: Exploring Local Execution Paths for Large Mixture-of-Experts Models Under Limited GPU Memory Abstract: Large language models have achieved remarkable capabilities through scaling, and this paper does not challenge that. It instead investigates a different question: once large models already exist, can they become more accessible to environments with substantially smaller hardware resources?