Home Knowledge Base Pipeline Asynchronous

Pipeline Asynchronous

No mentions found

This entity hasn't been tracked yet, or Iris is still building its knowledge base.

Related Articles from SNS

Breaking the Bubble: Asynchronous Pipeline Parallel Training with Bounded Weight Inconsistency

arXiv:2606.07881v1 Announce Type: new Abstract: Pipeline parallelism is essential for training large neural networks, but existing schedules trade off throughput, memory, and optimization consistency. Synchronous pipelines preserve forward/backward weight consistency but suffer from bubbles; asynchronous pipelines remove bubbles but introduce weight-version mismatch, typically requiring weight stashing, prediction, or correction mechanisms. We introduce PACI (Pipeline Asynchronous training...

arXiv CS 1d ago

SECUREVENT: Hybrid AI/ML Security Monitoring for Distributed Event-Based Systems

Announce Type: new Abstract: Distributed event-based systems have become a common substrate for Internet-scale publish/subscribe services, IoT telemetry, cloud-native microservices, and security operations pipelines. Their loose coupling and asynchronous delivery improve scalability, but they also expand the attack surface: publishers, brokers, subscribers, topics, schemas, and temporal ordering can each be abused without a single component observing the whole behavior. This paper proposes...

arXiv CS 8d ago

Computation-Aware Event-to-Frame Reconstruction via Selective Attention

arXiv:2606.06142v1 Announce Type: new Abstract: Event-to-frame (E2F) reconstruction bridges asynchronous event streams with frame-based vision pipelines, but existing methods often face a trade-off between reconstruction quality and computational efficiency. In this work, we propose an efficient E2F framework that emphasizes causal temporal modeling and computation-aware design. The architecture adopts a recurrent encoder-decoder to incrementally aggregate event information with compact...

arXiv CS 5d ago

AXLE: Coordinated Offloading with Asynchronous Back-Streaming in Computational Memory Systems

arXiv:2512.04449v2 Announce Type: replace Abstract: CXL-based Computational Memory (CCM) enables near-memory processing within expanded remote memory, offering opportunities to address data movement costs in disaggregated memory systems and to accelerate overall performance. However, existing offloading mechanisms do not fully leverage the trade-offs of different offload models based on different CXL protocols. This work first examines these tradeoffs and their impact on end-to-end...

arXiv CS 8d ago

CoMem: Context Management with A Decoupled Long-Context Model

arXiv:2605.30842v1 Announce Type: new Abstract: Context management enables agentic models to solve long-horizon tasks through iterative summarization of previous interaction histories. However, this process typically incurs substantial decoding overhead for the extra summarization tokens, which significantly affect the end-to-end response latency at deployment.

arXiv CS 9d ago

DPU or GPU for Accelerating Neural Networks Inference -- Why not both? Split CNN Inference

Announce Type: replace Abstract: Video and image streaming on edge devices requires low latency. To address this, Neural Networks (NNs) are widely used, and prior work mainly focuses on accelerating them with single hardware units such as Graphics Processing Units (GPUs), Field Programmable Gate Arrays (FPGAs), and Deep Learning Processing Units (DPUs). However, further reductions in latency can be observed by combining these units.

arXiv CS 5d ago

TIC-VLA: A Think-in-Control Vision-Language-Action Model for Robot Navigation in Dynamic Environments

arXiv:2602.02459v2 Announce Type: replace Abstract: Robots in dynamic, human-centric environments must follow language instructions while maintaining real-time reactive control. Vision-language-action (VLA) models offer a promising framework, but they assume temporally aligned reasoning and control, despite semantic inference being inherently delayed relative to real-time action. We introduce Think-in-Control (TIC)-VLA, a latency-aware framework that explicitly models delayed semantic...

arXiv CS 9d ago

RunAgent SuperBrowser: A Theory of Autonomous Web Navigation Grounded in Human Browsing Behaviour

arXiv:2606.09399v1 Announce Type: new Abstract: We present SUPERBROWSER, an autonomous web-navigation agent designed against a single guiding hypothesis: a web agent should browse the way a person browses. A human reading a page does not retain every pixel they have seen; they look at a few candidate targets, decide on one, and remember only what is needed to keep the goal alive. We operationalize this perception-cognition-action triad as three coupled mechanisms.

arXiv CS 1d ago

AsyncWebRL: Efficient Multi-Step RL for Visual Web Agents

arXiv:2606.05597v1 Announce Type: new Abstract: Training vision-language web agents with multi-step RL is compute-intensive, with two dominant forms of inefficiency: idle GPUs in synchronous RL, and trajectories that use more steps and tokens than necessary. We present AsyncWebRL, which addresses both.

arXiv CS 5d ago

Reducing the GPU Memory Bottleneck with Lossless Compression for ML -- Extended

arXiv:2605.30728v1 Announce Type: new Abstract: Machine learning (ML) training and inference often process data sets far exceeding GPU memory capacity, forcing them to rely on PCIe for on-demand tensor transfers, causing critical transfer bottlenecks. Lossy compression has been proposed to relieve bottlenecks but introduces workload-dependent accuracy loss, making it complex or even prohibitive to use in existing ML deployments.

arXiv CS 9d ago