Information-Theoretic Bounds for Sparse Covariance Estimation in the Vertical-Split Distributed Model

arXiv CS, Monday 08 June 2026, 04:00 UTC. By Jing Yee Tan, Guangyue Han

Key Points

We study the minimax estimation error for distributed covariance matrix estimation in the vertical-split (feature-split) setting, where two agents each observe different coordinates of $m$ i.i.d. sub-Gaussian samples and communicate a limited number of bits to a central server. While Rahmani et al. [2025] established nearly tight bounds for dense (unstructured) cross-covariance matrices, we investigate whether imposing elementwise $s$-sparsity on the cross-covariance $C_{21}$ can reduce the required...

arXiv:2606.07124v1. Abstract: We study the minimax estimation error for distributed covariance matrix estimation in the vertical-split (feature-split) setting, where two agents each observe different coordinates of $m$ i.i.d. sub-Gaussian samples and communicate a limited number of bits to a central server. While Rahmani et al. [2025] established nearly tight bounds for dense (unstructured) cross-covariance matrices, we investigate whether imposing elementwise $s$-sparsity on the cross-covariance $C_{21}$ can reduce the required communication and sample complexity. In contrast to the horizontal-split setting, where Braverman et al. [2016] showed that sparsity does not reduce communication cost for mean estimation, we prove that sparsity does help for cross-covariance estimation in the vertical split. Specifically, we establish minimax lower bounds showing that the communication budget per agent must scale as $B_k = \Omega(\sigma^4 d_k\, s' \log(d_1 d_2/s')/\varepsilon^2)$ and the sample complexity for cross-covariance estimation as $m = \Omega(\sigma^4\, s' \log(d_1 d_2/s')/\varepsilon^2)$, where $s' = s \wedge d_{\min}$. For the $1$-sparse case, this yields an exponential improvement from $d_1 d_2$ to $\log(d_1 d_2)$ compared to the dense rate. Our lower bounds are established via Fano's method with an explicit sparse packing, using a Varshamov--Gilbert-type argument for signed partial permutation matrices, combined with the Conditional Strong Data Processing Inequality of Rahmani et al. [2025]. We show that the bounds are tight via a matching achievable scheme, based on covering-net quantization and entry-wise hard thresholding, that attains the $s$-sparse lower bound up to polylogarithmic factors.
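To make the scalings in the abstract concrete, here is a minimal Python sketch. It is not the authors' scheme: the covering-net quantization step is omitted entirely, so the estimator below sees the raw samples of both agents, and all hidden constants in the $\Omega(\cdot)$ bounds are set to 1. The function names, the toy data generator, and the threshold `tau` are hypothetical, and $d_{\min}$ is assumed to mean $\min(d_1, d_2)$, which the abstract does not spell out.

```python
import math
import random


def lower_bound_scalings(d1, d2, s, sigma, eps):
    """Evaluate the abstract's lower-bound expressions with constants set to 1.

    Assumes d_min = min(d1, d2), so s' = min(s, d1, d2).
    """
    s_eff = min(s, d1, d2)
    core = sigma ** 4 * s_eff * math.log(d1 * d2 / s_eff) / eps ** 2
    # B_k carries an extra factor d_k; the sample complexity m does not.
    return {"B1": d1 * core, "B2": d2 * core, "m": core}


def empirical_cross_cov(x1, x2):
    """Empirical cross-covariance of x2 (m x d2) against x1 (m x d1).

    Returns a d2 x d1 nested list, matching the shape of C_21.
    """
    m = len(x1)
    d1, d2 = len(x1[0]), len(x2[0])
    mean1 = [sum(row[j] for row in x1) / m for j in range(d1)]
    mean2 = [sum(row[i] for row in x2) / m for i in range(d2)]
    return [
        [
            sum((x2[t][i] - mean2[i]) * (x1[t][j] - mean1[j]) for t in range(m)) / m
            for j in range(d1)
        ]
        for i in range(d2)
    ]


def hard_threshold(mat, tau):
    """Entry-wise hard thresholding: keep entries with |value| >= tau, zero the rest."""
    return [[v if abs(v) >= tau else 0.0 for v in row] for row in mat]


def simulate_one_sparse(m, d1, d2, rho, seed=0):
    """Hypothetical toy data whose only non-zero cross-covariance is C_21[d2-1][0] = rho."""
    rng = random.Random(seed)
    x1 = [[rng.gauss(0, 1) for _ in range(d1)] for _ in range(m)]
    x2 = [[rng.gauss(0, 1) for _ in range(d2)] for _ in range(m)]
    for t in range(m):
        # Mix in x1's first coordinate while keeping unit variance (needs |rho| <= 1).
        x2[t][d2 - 1] = rho * x1[t][0] + math.sqrt(1 - rho ** 2) * x2[t][d2 - 1]
    return x1, x2
```

Calling `lower_bound_scalings(d1, d2, 1, sigma, eps)` shows the $1$-sparse case, where the sample-complexity expression depends on the dimensions only through $\log(d_1 d_2)$. Running `hard_threshold(empirical_cross_cov(x1, x2), tau)` on data from `simulate_one_sparse` illustrates how thresholding recovers a sparse support once `tau` sits above the sampling noise; how to choose `tau` in the communication-limited setting is exactly what the paper's analysis addresses and is not reproduced here.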


Originally published by arXiv CS
