DeltaMut: An Integrative Database of AlphaFold2-Derived Missense Variant Structures
Key Points
The widespread use of next-generation sequencing has led to a surge in the number of identified variants with uncertain effects on protein function. These variants pose a significant challenge in diagnostics and hinder patient treatment strategies. Numerous variant effect predictors (VEPs) are available to assess variant impact, but they rely primarily on sequence-derived information. The recent development of AlphaFold2 has raised the question of whether structural information, taken either from wild-type structures or from predicted structures of missense variants, can improve the predictive power of these algorithms. While the AlphaFold Protein Structure Database serves as a valuable resource for wild-type protein structures, no large-scale collection of missense variant structures has been available, limiting current efforts to wild-type conformations and a handful of modeled variants. To address this limitation, we developed DeltaMut, a comprehensive database containing over 77,000 protein structures, including 65,000 pathogenic and neutral missense variants. All structural models were generated using ParaFold, an implementation of AlphaFold2 optimized for high-performance computing. The large-scale, systematic generation of variant protein structures distinguishes DeltaMut as a unique resource for both large-scale statistical studies and detailed, case-specific investigations of variant-induced structural changes. The DeltaMut database is freely accessible without registration.