Education
Non-Splitting Coflow Scheduling with Provable Guarantees in Heterogeneous Parallel Networks
Key Points
arXiv:2501.09293v5 Announce Type: replace Abstract: As a prominent network abstraction, coflow models efficiently capture communication patterns in data centers. Since coflow scheduling in large-scale data centers is $\mathcal{NP}$-hard, the existing literature has predominantly focused on limited environments with $m=2$ network cores, relying on flow splitting, which introduces substantial operational overhead. Crucially, no approximation algorithm with provable performance guarantees has...
arXiv:2501.09293v5 Announce Type: replace
Abstract: As a prominent network abstraction, coflow models efficiently capture communication patterns in data centers. Since coflow scheduling in large-scale data centers is $\mathcal{NP}$-hard, the existing literature has predominantly focused on limited environments with $m=2$ network cores, relying on flow splitting, which introduces substantial operational overhead. Crucially, no approximation algorithm with provable performance guarantees has been proposed for the more practical, non-splitting coflow scheduling problem, even for the $m=2$ case, let alone for general hybrid architectures. To bridge this critical gap, this paper investigates the non-splitting problem within a hybrid, heterogeneous parallel network featuring multiple network cores ($m \ge 2$) composed of Electronic Packet Switches (EPS), not-all-stop Optical Circuit Switches (OCS), and all-stop OCS. We propose a unified polynomial-time approximation algorithm that minimizes the makespan across this hybrid environment without incurring any splitting overhead. Let $\tau$ denote the maximum flow degree across all ports in the network, $N$ be the number of input/output ports, and $m$ be the number of network cores. In pure EPS environments, the algorithm achieves an approximation guarantee of $\min\left\{\tau, 2Nm+1\right\}$. For pure not-all-stop and pure all-stop OCS environments, the guaranteed ratios are $2\min\left\{\tau, 2Nm+1\right\}$ and $2\min\left\{2\tau-1, 2Nm+\tau\right\}$, respectively. Notably, when specialized to the $m=2$ setting, our algorithm escapes network-scale dependencies, yielding constant bounds of $2$ and $4$ for pure EPS, and pure not-all-stop OCS, respectively, and $2\tau+2$ for pure all-stop OCS. By leveraging these constituent bounds, we prove that the overall performance guarantee in the hybrid architecture is upper-bounded by the least-performing switch architecture in the network.