Science
Anthocyanin-associated cellular programs underlying terroir variation in Cabernet Sauvignon grape berry revealed by SEED-based deconvolution
Key Points
Plant tissues consist of diverse cell populations that collectively contribute to development, metabolism, environmental responses, and phenotype formation. Although single-cell and single-nucleus RNA sequencing have greatly advanced the study of plant cellular heterogeneity, their application to large sample cohorts remains limited by cost, technical complexity, tissue dissociation constraints, and throughput. In contrast, bulk RNA-seq datasets have accumulated extensively across plant species, tissues, developmental stages, and environmental conditions, yet the cell-type-level information embedded in these datasets remains difficult to resolve because plant-oriented deconvolution frameworks are still lacking. Existing deconvolution methods were largely developed for mammalian systems and have not been systematically optimized for plant transcriptomic features, so their applicability under plant-specific constraints is unclear. Here, we present SEED, an adaptive deconvolution framework optimized for plant transcriptomic data. SEED integrates candidate reference-template construction with seven deconvolution strategies and automatically identifies an optimal combination of template and strategy for a given dataset. In simulated benchmarking on grapevine data, SEED showed its clearest advantage under low-replication conditions and remained broadly competitive, rather than uniformly dominant, when larger pseudo-bulk sample sizes were evaluated. SEED also performed robustly on public Arabidopsis thaliana and Nicotiana tabacum datasets. Finally, we applied SEED to bulk RNA-seq data generated in this study from Vitis vinifera cv. Cabernet Sauvignon berries collected from Yinchuan and Yantai, identifying terroir-associated cell subtypes and coordinated cell-type interaction patterns.
Together, these results establish SEED as a practical framework for plant transcriptome deconvolution and provide a new tool for dissecting cellular heterogeneity associated with environmental adaptation and phenotype formation in plants.
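The adaptive selection idea described above (pair each candidate reference template with each deconvolution strategy, then keep the best-performing pair) can be illustrated with a minimal Python sketch. This is not SEED's implementation: the abstract does not specify the seven strategies, how templates are built, or the selection criterion. The sketch assumes two simple stand-in strategies (clipped least squares and projected-gradient non-negative least squares), two toy templates (all genes versus high-spread "marker" genes), randomly generated signature data, and RMSE against known pseudo-bulk proportions as the score. All function and variable names are hypothetical.

```python
import numpy as np


def normalize_columns(coef):
    """Scale each sample's non-negative coefficients so they sum to 1."""
    totals = coef.sum(axis=0, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for all-zero columns
    return coef / totals


def clipped_least_squares(signature, bulk):
    """Stand-in strategy 1: ordinary least squares, clip negatives, renormalize."""
    coef, *_ = np.linalg.lstsq(signature, bulk, rcond=None)
    return normalize_columns(np.clip(coef, 0.0, None)).T


def projected_gradient_nnls(signature, bulk, n_iter=500):
    """Stand-in strategy 2: non-negative least squares by projected gradient."""
    gram = signature.T @ signature
    target = signature.T @ bulk
    step = 1.0 / np.linalg.norm(gram, 2)  # 1 / largest eigenvalue of gram
    n_types = signature.shape[1]
    coef = np.full((n_types, bulk.shape[1]), 1.0 / n_types)
    for _ in range(n_iter):
        coef = np.clip(coef - step * (gram @ coef - target), 0.0, None)
    return normalize_columns(coef).T


def simulate_pseudobulk(signature, n_samples, rng, noise=0.05):
    """Mix cell-type profiles with known random proportions plus noise."""
    n_types = signature.shape[1]
    props = rng.dirichlet(np.ones(n_types), size=n_samples)  # samples x types
    bulk = signature @ props.T  # genes x samples
    bulk = bulk * rng.lognormal(mean=0.0, sigma=noise, size=bulk.shape)
    return bulk, props


def top_marker_genes(signature, n_genes):
    """Toy template: genes whose log expression varies most across cell types."""
    spread = np.var(np.log1p(signature), axis=1)
    return np.sort(np.argsort(spread)[-n_genes:])


def select_best(signature, templates, strategies, bulk, true_props):
    """Score every (template, strategy) pair by RMSE; return the best pair."""
    scores = {}
    for t_name, genes in templates.items():
        for s_name, strategy in strategies.items():
            est = strategy(signature[genes], bulk[genes])
            rmse = np.sqrt(np.mean((est - true_props) ** 2))
            scores[(t_name, s_name)] = float(rmse)
    best = min(scores, key=scores.get)
    return best, scores


# Toy data (assumption): 200 genes x 4 cell types, 3 pseudo-bulk samples
# to mimic a low-replication setting.
rng = np.random.default_rng(0)
signature = rng.gamma(shape=2.0, scale=50.0, size=(200, 4))
bulk, true_props = simulate_pseudobulk(signature, n_samples=3, rng=rng)

templates = {
    "all_genes": np.arange(signature.shape[0]),
    "top_markers": top_marker_genes(signature, 50),
}
strategies = {
    "clipped_lstsq": clipped_least_squares,
    "projected_nnls": projected_gradient_nnls,
}
best, scores = select_best(signature, templates, strategies, bulk, true_props)
print(best, round(scores[best], 4))
```

Scoring against true proportions is only possible on simulated pseudo-bulk, where the mixing proportions are known. In this sketch the selected pair would then be applied unchanged to real bulk samples; whether SEED selects its combination this way is an assumption.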