Mol. Cells 2023; 46(2): 71-73
Published online February 28, 2023
https://doi.org/10.14348/molcells.2023.2197
© The Korean Society for Molecular and Cellular Biology
Correspondence to: hksong@korea.ac.kr
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.
In 2020, AlphaFold2 protein structure prediction was presented at the 14th meeting of the Critical Assessment of Structure Prediction (CASP14) and fundamentally changed the structural biology field (Callaway, 2020; Jumper et al., 2021). AlphaFold2’s accurate performance has greatly impacted structural biology and medicine research. The AlphaFold2 neural network algorithm is based on deep learning processes that use prior knowledge of protein structures and multiple sequence alignments to accurately predict protein structures from the sequence data (Jumper et al., 2021). The deep learning system enabled the establishment of an AlphaFold database larger than the Protein Data Bank, which contains protein structures obtained using experimental techniques such as X-ray crystallography, cryogenic electron microscopy, and nuclear magnetic resonance spectroscopy. DeepMind and EMBL-EBI have made over 200 million predicted protein structures publicly available (https://alphafold.ebi.ac.uk) by including them in the UniProt database. Protein structures obtained using experimental techniques have historically been preferred over
In the view of experimental structural biologists, the answer is no, as AlphaFold2 predictions still have limitations (Callaway, 2022). An important limitation is that although AlphaFold2 can accurately predict static protein structures, it cannot predict dynamic states, which are physiologically relevant. The native structures of many proteins are metastable and not necessarily the most energetically stable structures (Ghosh and Ranjan, 2020). For example, kinases fold into inactive conformations in the absence of signals but switch to active conformations in response to signals to perform their roles in signal transduction (Huse and Kuriyan, 2002). Serpin protease inhibitors adopt different conformational states in cells, namely cleaved, active, and latent conformations (Ghosh and Ranjan, 2020). Depending on the reaction step, the proteolytic chamber of the self-compartmentalizing caseinolytic protease shows conformational diversity (Kim et al., 2022). Depending on cellular conditions, such as pH, an autophagic receptor forms a filamentous assembly (Kwon et al., 2018), and cytoskeletal proteins polymerize and depolymerize dynamically (Goodson and Jonasson, 2018). Generally, multiprotein complexes (including fibrous assemblies) are not accurately predicted by AlphaFold2. Another important limitation is that the structures of protein complexes with nonprotein ligands (e.g., small compounds with therapeutic potential or binding partners, such as DNA or RNA) are not well predicted by AlphaFold2, which was developed and trained for protein structure prediction rather than for docking (Jumper et al., 2021). Furthermore, protein modifications such as phosphorylation, glycosylation, lipidation, acetylation, and methylation affect the accuracy of AlphaFold2 predictions (Callaway, 2023). As is the case with experimental techniques, AlphaFold2 performs poorly on intrinsically disordered proteins, and its predicted structures for them are mostly inaccurate.
Furthermore, a single-point mutation can drastically affect protein folding, but the AlphaFold2-predicted structures of the mutated and wild-type proteins are often very similar. To overcome these limitations, numerous approaches are being developed (Varadi and Velankar, 2022) and have been presented at the recent CASP15 (Callaway, 2023). These efforts continue to strengthen artificial intelligence-based structure prediction. Perhaps one day, sequence-to-structure prediction will be straightforward in all situations, including for all the components of entire organelles or even cells. However, given the complexity of biology, this may be some time away.
Nonetheless, the relatively accurate AlphaFold2 predictions have changed structural biology and related fields in a positive way. AlphaFold2 can be greatly advantageous for initial model building with low-resolution experimental data from X-ray crystallography or cryogenic electron microscopy. There are many successful examples of using AlphaFold2 models to obtain the phases of X-ray data by molecular replacement, to build models into otherwise uninterpretable electron density, and to design biochemical experiments (Cramer, 2021; Kleywegt and Velankar, 2022). Although the accuracy of AlphaFold2 is not yet high enough for docking studies, rational drug design based on AlphaFold2’s predicted models will likely expand enormously in the near future (Varadi and Velankar, 2022). To fully understand biology, various hybrid and integrated approaches are inevitable, and experimental structural biologists can therefore focus on other biochemical, biophysical, and cell biology experiments. Experimental structural biologists will still be needed to validate the structures of models predicted with low certainty. Artificial intelligence-based structural biology is one of the biggest trends, and experimental structural biologists may have to tread a thorny path. This situation reminds me of a scene from the apocryphal Acts of Peter.
Peter:
When Peter asked Jesus where he was going, Jesus replied that he was going to Rome to be crucified again. Of course, it is not directly comparable, but the notion is that structure determination using experimental techniques is extremely difficult compared with computational prediction. It is my view that experimental and prediction methods will continue to coevolve to answer complex biological questions.
This work is supported by grants from the National Research Foundation of Korea (2020R1A2C3008285 and 2021M3A9I4030068).
The author has no potential conflicts of interest to disclose.
Department of Life Sciences, Korea University, Seoul 02841, Korea