Compute and Communication Co-design

Related Articles from SNS

CUCo: An Agentic Framework for Compute and Communication Co-design

Abstract: Computation and communication in distributed LLM training and inference are traditionally optimized in isolation. Expert-crafted systems such as DeepEP, FLUX, and TokenWeave show the potential of co-design but require deep systems expertise and hardware-specific tuning. CUCo is an agentic framework that automates compute-communication co-design of CUDA kernels by combining a structured design-space formalization with a correctness-first fast-path agent for...

arXiv CS 5d ago

Building user-driven climate adaptation products

Abstract: Climate adaptation products have traditionally been developed using a supply-driven model reliant on available climate information, leading to usability gaps [1,2,3,4]. To better meet user needs, the climate services field has recognized a need to shift towards a demand-driven model emphasizing co-production, that is, user-driven, scientifically informed products created through shared knowledge practices [1,2,3,4,5]. However, co-production can be challenging, especially for researchers...

Nature 23h ago

An Asynchronous Two-Speed Kalman Filter for Real-Time UUV Cooperative Navigation Under Acoustic Delays

arXiv:2604.02878v2. Abstract: In Global Navigation Satellite System (GNSS)-denied underwater environments, individual unmanned underwater vehicles (UUVs) suffer from unbounded dead-reckoning drift, making collaborative navigation (CN) crucial for accurate state estimation. However, the severe communication delay inherent in underwater acoustic channels poses serious challenges to real-time state estimation. Traditional filters, such as Extended Kalman Filters (EKFs) or...

arXiv CS 8d ago

In-Memory Computing Enabled Deep MIMO Detection to Support Ultra-Low-Latency Communications

arXiv:2508.17820v2. Abstract: The development of sixth-generation (6G) mobile networks imposes unprecedented latency and reliability demands on multiple-input multiple-output (MIMO) communication systems, a key enabler of high-speed radio access. Recently, deep unfolding-based detectors, which map iterative algorithms onto neural network architectures, have emerged as a promising approach, combining the strengths of model-driven and data-driven methods to achieve high...

arXiv CS 8d ago

HE^2: A Communication-Light Heterogeneous Architecture for Efficient Fully Homomorphic Encryption

arXiv:2605.31004v1. Abstract: CKKS, an emerging fully homomorphic encryption (FHE) scheme, has been promising in privacy-preserving applications by enabling SIMD fixed-point computations on ciphertexts. Despite its strong security guarantees, CKKS involves both compute-intensive operators (ComOps) with high computational cost and memory-intensive operators (MemOps) with large memory footprints, making existing ASIC-based or NMP-based acceleration approaches suffer from high...

arXiv CS 9d ago

Microsoft’s AI chief says superintelligence is near, but won’t take your job

Today I’m talking with Mustafa Suleyman, the CEO of Microsoft AI. And I’m actually going to keep today’s intro short — I’m working from my wife’s family farm this week, as you’ll see in the video, but also this is a real burner of an episode. We covered everything from Mustafa’s approach to training new models to his criticisms of Anthropic talking about Claude as though it is conscious.

The Verge 2d ago

UltraEP: Unleash MoE Training and Inference on Rack-Scale Nodes with Near-Optimal Load Balancing

arXiv:2606.04101v1. Abstract: Large-scale expert parallelism (EP) is becoming pivotal for training and serving frontier MoE models, but it also amplifies device-level expert load imbalance into compute stragglers, token all-to-all bottlenecks, and activation-memory spikes. Existing balancers redistribute experts periodically based on historical load, which becomes unreliable for production deployments with non-stationary load patterns. We present UltraEP, the first...

arXiv CS 6d ago

UltraEP: Unleash MoE Training and Inference on Rack-Scale Nodes with Near-Optimal Load Balancing

arXiv:2606.04101v2. Abstract: Large-scale expert parallelism (EP) is becoming pivotal for training and serving frontier MoE models, but it also amplifies device-level expert load imbalance into compute stragglers, token all-to-all bottlenecks, and activation-memory spikes. Existing balancers redistribute experts periodically based on historical load, which becomes unreliable for production deployments with non-stationary load patterns. We present UltraEP, the first...

arXiv CS 2d ago

Ahoy, DECmate II the little PDP-8 that could

Now, that's a lot of word processing. But under the hood it's still at least PDP-8 adjacent, even considering its oddities and incompatibilities, and you can make it do many of the things a full-size Eight can. We'll take this basic unit, convert the floppy drives to solid state, tap the video output, and put it through its paces.

Hacker News 10d ago

Nvidia's entrance into the PC market gives investors another reason to own the stock

Nvidia has added another leg to its investment case, planted far away from the data center. It's on your desk at the office and at home. At the influential Computex conference in Taiwan, CEO Jensen Huang focused the first half of his keynote address on the data center and the wonders of Nvidia's Vera computing platform for agentic AI workloads.

CNBC 9d ago