Common Crawl

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

CS336: Language Modeling from Scratch

Course Staff Logistics
- Lectures: Monday/Wednesday 3:00-4:20pm in Skilling Auditorium
- Recordings: YouTube playlist
- Office hours:
  - Percy Liang: Fridays 11am-12pm in Gates 366
  - Tatsu Hashimoto: Tuesdays 11am-12pm in Gates 364
  - Marcel Rød: Tuesdays 4:30-5:30pm in Gates 498, Wednesdays 4:30-5:30pm in Gates 415
  - Herman Brunborg: Wednesdays 1:30-2:30pm, Fridays 1:30-2:30pm in Gates 392
  - Steven Cao: Mondays 4:30-5:30pm, Thursdays 9:30-10:30am in Gates 200
- Contact: Students should ask...

Hacker News 8d ago

Epistemic Injustice in Language Models: An Audit of Pretraining Filters and Guardrails

arXiv:2606.05936v1 Announce Type: new Abstract: Modern language models rely on pretraining filters to remove undesirable content from training corpora and inference-time guardrails to suppress undesirable outputs during deployment. In this paper, we examine how these filtering and moderation decisions produce forms of epistemic erasure and reveal tensions both across automated systems and between these systems and human judgment. We audit four pretraining filters and three inference-time...

arXiv CS 5d ago

IndoBias: A Dual Track Culturally Grounded Benchmark for LLMs Bias Evaluation in Indonesian Languages

arXiv:2606.01260v1 Announce Type: new Abstract: Despite being home to more than 1300 ethnic groups and 700 indigenous languages, bias in Large Language Models has not been fully studied in Indonesia, thus leaving a critical gap in evaluating representational fairness and localized stereotypes within its uniquely vast, multilingual, and diverse sociocultural landscape. To address this, we introduce IndoBias as a culturally-grounded bias benchmark to assess LLMs bias in Indonesian and three...

arXiv CS 8d ago

WAON: A Large-Scale Japanese Image-Text Dataset for Cultural Adaptation in Contrastive Vision-Language Models

arXiv:2510.22276v3 Announce Type: replace Abstract: Contrastive vision-language models have achieved remarkable progress through large-scale pretraining. Recent work has shown that removing English-only caption filters and pretraining on global data is effective for improving multicultural performance. We study whether such global pretraining is sufficient for culture-specific understanding, or whether further adaptation with natively sourced data can boost performance beyond what global...

arXiv CS 8d ago

FOLD: Fuzzy Online Deduplication for Very Large Evolving Datasets via Approximate Nearest Neighbor Search

Announce Type: new Abstract: Fuzzy deduplication is key to constructing large language model training corpora. However, classic Locality-Sensitive Hashing pipelines scale poorly as corpora grow and are ill-suited to continuous ingestion. We present FOLD (Fuzzy Online Deduplication), an online fuzzy deduplication system that delivers high recall and throughput for evolving datasets.

arXiv CS 7d ago

MiNI-Q: A Miniature, Wire-Free Quadruped with Unbounded, Independently Actuated Leg Joints

arXiv:2603.11537v2 Announce Type: replace Abstract: Physical joint limits are common in legged robots and can restrict workspace, constrain gait design, and increase the risk of hardware damage. This paper introduces MiNI-Q^2, a miniature, wire-free quadruped robot with independently actuated, mechanically unbounded 2-DOF leg joints. We present the mechanical design, kinematic analysis, and experimental validation of the proposed robot.

arXiv CS 8d ago

Port React Compiler to Rust

Hacker News 2h ago

Human-Like Neural Nets by Catapulting

A speculative proposal to create artificial neural nets with human-like performance by training overparameterized NNs with high learning rates and regularization to trigger catapulting/grokking. Over-parameterization as a route to true generalization would resolve many outstanding mysteries of artificial versus natural intelligence. There are many mysteries about deep learning and human intelligence, but we could describe the biggest anomaly this way: why are...

Hacker News 3d ago

Book Dedications

To my sister, Dr. Soma Mohammed Mohammed Baroud. I write your name in full, because that is how it appeared on the white body bag that held your remains soon after the bomb was dropped. Dedications A random assortment of book dedications.

Hacker News 8d ago

The secret underground system keeping the Grand Canyon alive

Date: June 2, 2026. Source: Northern Arizona University. Summary: Scientists are venturing into the Grand Canyon’s hidden cave networks to solve a mystery: how snowmelt travels underground to supply the park’s vital springs. Their discoveries could help protect the canyon’s water from drought, contamination, and other growing threats. Every year, millions of people visiting Grand Canyon National Park stop at one...

Science Daily 7d ago