Load-Balanced Communication-Efficient Context Parallelism

Related Articles from SNS

FlashCP: Load-Balanced Communication-Efficient Context Parallelism for LLM Training

Abstract: Context parallelism (CP) is essential for training large-scale, long-context language models, as it partitions sequences to reduce memory overhead. However, existing CP methods suffer from workload imbalance, inefficient kernels, and redundant communication due to static sequence sharding and key-value (KV) tensor communication. We present FlashCP, a load-balanced and communication-efficient framework for CP training.

arXiv CS 1d ago
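
The abstract above attributes workload imbalance to static sequence sharding but does not give details. As a rough illustration only, the sketch below assumes one common scenario: causal attention over a sequence split into equal contiguous shards, one per rank. It counts how many (query, key) pairs each rank has to compute. The function names `causal_pairs_per_rank` and `imbalance_ratio` are hypothetical, and this is not FlashCP's method or cost model.

```python
def causal_pairs_per_rank(seq_len, num_ranks):
    """Count (query, key) pairs per rank under causal masking.

    Assumption (not from the abstract): the sequence is split into equal
    contiguous shards, and each rank computes attention for its own queries.
    """
    if seq_len % num_ranks != 0:
        raise ValueError("seq_len must be divisible by num_ranks")
    shard = seq_len // num_ranks
    work = []
    for rank in range(num_ranks):
        start, end = rank * shard, (rank + 1) * shard
        # A query at position q attends to keys 0..q, i.e. q + 1 keys.
        work.append(sum(q + 1 for q in range(start, end)))
    return work


def imbalance_ratio(work):
    """Ratio of the busiest rank's work to the mean work across ranks."""
    return max(work) / (sum(work) / len(work))


if __name__ == "__main__":
    work = causal_pairs_per_rank(seq_len=8, num_ranks=4)
    print(work)                   # [3, 7, 11, 15]
    print(imbalance_ratio(work))  # about 1.67
```

Under these assumptions the last rank does five times the work of the first, so every other rank waits on it at each synchronization point. That is the kind of skew a load-balanced sharding scheme aims to remove.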