Cellpin enables reference-based imputation and denoising of spatial transcriptomes
Key Points
Spatially resolved transcriptomics enables gene expression profiling within tissue architecture, but targeted panels leave much of the transcriptome unmeasured, and spatial artifacts such as RNA diffusion and segmentation errors introduce technical noise. These limitations necessitate computational imputation and denoising, yet existing methods typically incorporate spatial measurements during training, which limits scalability and risks embedding technology-specific artifacts in the learned representations. To address this, we present cellpin, a variational autoencoder trained exclusively on single-cell RNA sequencing data. It uses teacher-student latent distillation and noise-simulating augmentations to jointly impute unmeasured genes and denoise spatial profiles without requiring cross-modality alignment. Benchmarked against six methods across multiple paired datasets, cellpin achieves superior held-out gene prediction while scaling efficiently to atlas-size references and multi-sample cohorts. In full-transcriptome Atera data, cellpin reduces residual spatial noise and improves cell-state resolution, providing a scalable and principled foundation for biological discovery from spatial transcriptomics data.
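To make the training idea concrete, the following is a minimal illustrative sketch, not the cellpin implementation. It assumes that a noise-simulating augmentation can be approximated by blending each cell with another cell (a crude stand-in for RNA diffusion and segmentation errors) and zeroing genes outside a targeted panel, and that latent distillation can be approximated by a mean squared error between a teacher embedding of the clean profile and a student embedding of the corrupted one. All names (`simulate_spatial_noise`, `encode`, `distillation_loss`, `mix_frac`) and the toy linear encoders are hypothetical.

```python
import numpy as np

def simulate_spatial_noise(counts, panel_idx, mix_frac=0.2, rng=None):
    """Hypothetical augmentation: corrupt clean scRNA-seq counts so they
    loosely resemble a noisy, targeted-panel spatial profile."""
    rng = np.random.default_rng() if rng is None else rng
    n_cells, n_genes = counts.shape
    # Contamination: blend each cell with a randomly chosen cell
    # (assumed stand-in for RNA diffusion / segmentation errors).
    partners = rng.permutation(n_cells)
    mixed = (1.0 - mix_frac) * counts + mix_frac * counts[partners]
    # Panel masking: genes outside the targeted panel are unmeasured (zeroed).
    mask = np.zeros(n_genes, dtype=bool)
    mask[panel_idx] = True
    return mixed * mask, mask

def encode(x, weights):
    """Toy linear encoder on log1p expression (stands in for a VAE encoder mean)."""
    return np.log1p(x) @ weights

def distillation_loss(z_student, z_teacher):
    """Mean squared distance between student and teacher latent codes."""
    return float(np.mean((z_student - z_teacher) ** 2))

rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(8, 50)).astype(float)  # synthetic reference cells
panel_idx = np.arange(10)                               # pretend 10-gene panel

noisy, mask = simulate_spatial_noise(counts, panel_idx, rng=rng)

w_teacher = rng.normal(size=(50, 4))  # teacher sees clean, full profiles
w_student = rng.normal(size=(50, 4))  # student sees corrupted, panel-only profiles
z_teacher = encode(counts, w_teacher)
z_student = encode(noisy, w_student)
loss = distillation_loss(z_student, z_teacher)
```

In a real training loop the student weights would be updated to reduce this loss together with a reconstruction objective over all genes, so that spatial profiles could be embedded and decoded at inference time without any spatial data having been used for training.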