Mol. Cells 2022; 45(7): 444-453
Published online July 31, 2022
https://doi.org/10.14348/molcells.2022.0035
© The Korean Society for Molecular and Cellular Biology
Correspondence to: junseockkoh@snu.ac.kr
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/.
Multivalent macromolecular interactions underlie dynamic regulation of diverse biological processes in ever-changing cellular states. These interactions often involve binding of multiple proteins to a linear lattice, such as an intrinsically disordered protein or chromosomal DNA, that carries many repeating recognition motifs. Quantitative understanding of such multivalent interactions on a linear lattice is crucial for exploring their unique regulatory potential in cellular processes. In this review, the distinctive molecular features of the linear lattice system are first discussed with a particular focus on the overlapping nature of potential protein binding sites within a lattice. Then, we introduce two general quantitative frameworks, the combinatorial and conditional probability models, which deal with the overlap problem and relate the binding parameters to the experimentally measurable properties of linear lattice-protein interactions. We then present two specific examples where the quantitative models have been applied and further extended to provide biological insights into specific cellular processes. In the first case, the conditional probability model was extended to highlight the significant impact of nonspecific binding of transcription factors to chromosomal DNA on gene-specific transcriptional activities. The second case presents recently developed combinatorial models that unravel the complex organization of target protein binding sites within an intrinsically disordered region (IDR) of a nucleoporin. In particular, these models have suggested a unique function of IDRs as a molecular switch coupling distinct cellular processes. The quantitative models reviewed here are expected to be developed further for the dissection and functional study of more complex systems, including phase-separated biomolecular condensates.
Keywords: biological linear lattice, combinatorial model, conditional probability model, multivalent binding, overlapping binding site
Recent advances in cutting-edge biotechnologies have provided opportunities to observe unprecedented molecular details of various biological processes (Ha et al., 2022; Mahamid et al., 2016; Oikonomou and Jensen, 2017; Sigal et al., 2018). Interpretation of such observations requires quantitative models dissecting the underlying macromolecular interactions. In turn, the quantitative information allows further understanding and prediction of spatiotemporal regulation of specific cellular processes in dynamically changing environments. The complexity of macromolecular interactions ranges from simple 1:1 binding to formation of phase-separated condensates with multivalent binding among two or more components (Banani et al., 2017; Lyon et al., 2021; Shin and Brangwynne, 2017). In contrast to 1:1 binding, multivalent interactions are difficult to describe with the simple mass action law and are instead modeled with more sophisticated frameworks accounting for the presence of various molecular states (Bujalowski, 2006; Freire et al., 2009; Wyman and Gill, 1990). Furthermore, the quantitative models are often formulated with large numbers of parameters, and exemplary cases determining these parameters with suitable
A linear or one-dimensional lattice is a relatively tractable multivalent system found in numerous cellular processes. Linear lattices present multiple binding motifs or domains to interact with diverse proteins or multiple copies of identical proteins (Fig. 1) (Cortese et al., 2008; Dunker et al., 2005; Fung et al., 2018). For instance, in many signaling pathways, scaffold proteins such as axin, BRCA1, and Ste5 recruit various target proteins via specific binding sites (Choi et al., 1994; Mark et al., 2005; Wodarz and Nusse, 1998). These scaffold-driven higher-order assemblies are predicted to colocalize and increase the local concentrations of the target proteins and thereby facilitate their interactions for efficient integration and propagation of diverse signals in the cell (Fig. 1A) (Noutsou et al., 2011; Xue et al., 2013). Another example is the intrinsically disordered regions (IDRs) of some nucleoporins (Nups) present in the nuclear pore complex (NPC) (Fig. 1B) (Frey and Gorlich, 2007; Radu et al., 1995). The Nup IDRs mediate massive yet selective molecular transport between the nucleus and cytoplasm through specific interactions with karyopherin (Kap) proteins carrying macromolecular cargos (Koh and Blobel, 2015; Schoch et al., 2012). These interactions are achieved by multiple interspersed phenylalanine-glycine (FG) motifs on an IDR capturing several Kap molecules (Bayliss et al., 2000).
Finally, nucleic acids are the most prominent linear lattice systems in the cell. In particular, chromosomal DNA presents an enormous number of repeating phosphate groups along its backbone, creating electrostatic potentials for nonspecific protein-DNA interactions (Fig. 1C) (Berg et al., 1981; Stracy et al., 2021). This polyelectrolyte effect is a major driving force (Lohman et al., 1980; Record et al., 1976), particularly at low salt concentrations, for formation of nucleosomes (Shrader and Crothers, 1989; Widom, 1999) as well as for binding of chromatin architectural proteins such as HMG (high mobility group)-box proteins, which have little specificity for DNA base sequences (Dragan et al., 2004). Even specific DNA binding proteins typically engage their cationic amino acid side chains to neutralize DNA phosphate charges (Jen-Jacobson et al., 2000; Privalov et al., 2011). Thus, these proteins are expected to interact with nonspecific sites, which are present in overwhelming excess over specific sites in the chromosomal context. In addition, as the copy numbers of many transcription factors (TFs) are considered greater than those of their corresponding specific binding sites on DNA, the majority of these factors may exist
Taken together, numerous protein-protein and protein-nucleic acid interactions can be perceived as multivalent interactions mediated by linear lattices. Thus, quantitative models for linear lattice systems are indispensable in understanding a broad range of biological processes and may be further extended to dissect more complex systems including phase-separated biomolecular condensates. In this review, we examine two general mathematical frameworks, the combinatorial and conditional probability models, for quantitative description of linear lattices. Prior to the detailed derivation of these models, the molecular features of multivalent interactions on a linear lattice are discussed qualitatively in light of how they differ fundamentally from 1:1 binding or discrete-site systems. The derivations are supplemented in Supplementary Information with detailed mathematical procedures that were omitted from, and are not immediately evident in, the original articles. Finally, two practical examples are discussed in which the models have been further extended and applied to highlight their physiological significance. For the alternative methods of sequence generating functions and transfer matrices, which handle multiple binding modes, heterogeneous lattices, and lattice conformational changes, readers are referred to the original and case studies (Bujalowski et al., 1989; Lifson, 1964; Schellman, 1974; Teif, 2007).
It is straightforward to derive the quantitative models for the linear lattices that utilize discrete regions or domains to bind multiple distinct target proteins with the interaction stoichiometry of 1:1 for each target. In the absence of cooperativity among bound targets, the binding of each target can be handled, independent of binding of other targets, by the simple mass action law yielding a quadratic equation as a function of total concentrations of the lattice and the corresponding target. An advanced model has been derived by constructing a partition function for a linear lattice with cooperativities among bound targets (Cho et al., 2021).
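As a minimal sketch of the discrete-site case, the quadratic that arises from the mass action law for a single noncooperative 1:1 site can be solved in closed form. The function and argument names below are illustrative assumptions, and the association constant is assumed to be given in reciprocal units of the concentrations.

```python
import math


def complex_concentration(lattice_total, target_total, k_assoc):
    """Equilibrium concentration of a 1:1 site-target complex.

    Mass action: K = [LP] / ([L][P]), with [L] = L_t - [LP] and
    [P] = P_t - [LP]. Substitution gives the quadratic
        [LP]^2 - (L_t + P_t + 1/K) [LP] + L_t * P_t = 0.
    """
    b = lattice_total + target_total + 1.0 / k_assoc
    c = lattice_total * target_total
    discriminant = b * b - 4.0 * c
    # The smaller root is the physical one (it cannot exceed either total).
    # Writing it as 2c / (b + sqrt(...)) avoids subtracting nearly equal numbers.
    return 2.0 * c / (b + math.sqrt(discriminant))


if __name__ == "__main__":
    bound = complex_concentration(1e-6, 2e-6, 1e6)
    print(f"[LP] = {bound:.3e} M")
```

In the absence of cooperativity, each distinct target on the lattice would be treated by a separate call with its own binding constant and total concentration.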
Complexity arises when a target protein occupies two or more binding motifs on a linear lattice. We consider a linear lattice with a total of
As the linear lattice subsequently binds more target proteins, its overlapping nature generates additional features further deviating from the discrete-site system. The number of potential binding sites eliminated upon binding of a protein depends on where on the lattice the protein binds. When a protein binds to a gap exactly
The following sections review the quantitative models that address the overlap problem of the linear lattice and yield mathematical formulations relating the binding parameters to experimentally measurable properties of the lattice-target interactions. A core element of each model is the computation of the number of possible configurations for a given density of bound proteins on a lattice.
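The overlap problem and the counting of configurations can be illustrated with a small sketch. It assumes a homogeneous lattice of `num_motifs` motifs and identical proteins that each cover `site_size` consecutive motifs, and it ignores cooperativity; all names are hypothetical. The closed-form count is the standard binomial result for non-overlapping placements and is checked against brute-force enumeration.

```python
from itertools import combinations
from math import comb


def potential_sites(num_motifs, site_size):
    """Number of overlapping potential binding sites on a naked lattice."""
    return max(num_motifs - site_size + 1, 0)


def sites_eliminated(num_motifs, site_size, start):
    """Potential sites lost when a protein covers motifs start..start+site_size-1.

    A potential site starting at motif j is lost if it shares at least one
    motif with the bound protein, i.e. if |j - start| < site_size.
    """
    last_start = num_motifs - site_size
    if not 0 <= start <= last_start:
        raise ValueError("protein does not fit on the lattice at this position")
    lowest = max(0, start - site_size + 1)
    highest = min(last_start, start + site_size - 1)
    return highest - lowest + 1


def count_configurations(num_motifs, site_size, num_bound):
    """Closed-form number of non-overlapping arrangements of identical proteins."""
    free_motifs = num_motifs - num_bound * site_size
    if free_motifs < 0:
        return 0
    # Arrange num_bound proteins and free_motifs empty motifs in a row.
    return comb(free_motifs + num_bound, num_bound)


def count_configurations_brute_force(num_motifs, site_size, num_bound):
    """Same count by enumerating all sets of start positions."""
    starts = range(num_motifs - site_size + 1)
    return sum(
        all(b - a >= site_size for a, b in zip(chosen, chosen[1:]))
        for chosen in combinations(starts, num_bound)
    )


if __name__ == "__main__":
    print(potential_sites(10, 3))           # 8
    print(sites_eliminated(10, 3, 4))       # 5 (interior position)
    print(sites_eliminated(10, 3, 0))       # 3 (lattice end)
    print(count_configurations(10, 3, 2))   # 15
```

For a lattice of 10 motifs and a protein covering 3 motifs, binding in the interior removes 5 of the 8 potential sites, whereas binding at an end removes only 3, which is the position dependence described above.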
A complete set of parameters for description of linear lattice-protein interactions consists of the binding stoichiometry (
A fundamental relationship between the binding parameters and experimental variables can be derived by constructing a partition function for a linear lattice (Freire et al., 2009; Wyman and Gill, 1990). The partition function is a sum of relative probabilities or statistical weights of all possible protein-bound states of a linear lattice with a free lattice assigned as a reference state of unit relative probability (i.e., statistical weight = 1). Then, the statistical weight of a lattice with
The average number of proteins bound per lattice (or binding density, ν), which is a principal quantity to be measured in all binding experiments, can be formulated from the partition function:
Likewise, the average number of contact points per lattice can be calculated from a partial derivative of the partition function:
The final task in constructing the partition function is to derive the expression for
In this expression, all runs have been treated as identical elements, regardless of the actual number of bound proteins in each run. Therefore, in order to complete the derivation of
The equation
For noncooperative binding (
Then, the partition function for noncooperative binding can be written in a simplified form:
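A minimal numerical sketch of the noncooperative partition function is given here. It assumes the standard statistical weight in which k proteins on a lattice of N motifs, each covering n motifs, can be arranged in C(N - k(n - 1), k) ways, and it assumes a single intrinsic binding constant K; the function names are illustrative. The binding density is computed both as a weighted average and as the derivative of ln Z with respect to ln [P], which should agree.

```python
from math import comb, log


def statistical_weights(num_motifs, site_size, k_assoc, free_protein):
    """Weights of states with k = 0, 1, ... bound proteins (free lattice = 1)."""
    x = k_assoc * free_protein  # dimensionless activity K[P]
    max_bound = num_motifs // site_size
    return [
        comb(num_motifs - k * (site_size - 1), k) * x ** k
        for k in range(max_bound + 1)
    ]


def partition_function(num_motifs, site_size, k_assoc, free_protein):
    """Sum of the statistical weights of all protein-bound states."""
    return sum(statistical_weights(num_motifs, site_size, k_assoc, free_protein))


def binding_density(num_motifs, site_size, k_assoc, free_protein):
    """Average number of proteins bound per lattice, sum(k * w_k) / Z."""
    weights = statistical_weights(num_motifs, site_size, k_assoc, free_protein)
    return sum(k * w for k, w in enumerate(weights)) / sum(weights)


def binding_density_from_derivative(num_motifs, site_size, k_assoc,
                                    free_protein, rel_step=1e-6):
    """Same quantity from a central finite difference of ln Z versus ln [P]."""
    up = free_protein * (1.0 + rel_step)
    down = free_protein * (1.0 - rel_step)
    d_ln_z = (log(partition_function(num_motifs, site_size, k_assoc, up))
              - log(partition_function(num_motifs, site_size, k_assoc, down)))
    return d_ln_z / (log(up) - log(down))


if __name__ == "__main__":
    args = (12, 3, 1e6, 1e-6)  # N = 12 motifs, n = 3, K = 1e6 /M, [P] = 1 uM
    print(partition_function(*args))
    print(binding_density(*args), binding_density_from_derivative(*args))
```

At saturating protein concentrations the binding density in this sketch approaches the integer part of N/n, the maximum number of proteins that fit on the lattice.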
In practice, the total lattice and protein concentrations ([
For a given set of binding parameters and reactant concentrations, this mass balance equation can be solved for [
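One way to carry out such a numerical solution is bisection on the free protein concentration. The sketch below assumes the mass balance has the usual form P_t = [P] + ν([P]) L_t, where ν is the average number of proteins bound per lattice, and that ν is zero at [P] = 0 and does not decrease with [P]. The solver accepts any binding-density function, so one derived from a lattice partition function could be passed in; the names and the single-site demonstration are hypothetical.

```python
def solve_free_protein(binding_density, lattice_total, protein_total,
                       rel_tol=1e-12, max_iter=200):
    """Free protein concentration satisfying P_t = [P] + nu([P]) * L_t.

    binding_density: callable returning nu for a given free concentration.
    The residual is negative at [P] = 0 and non-negative at [P] = P_t,
    so the root is bracketed by [0, P_t].
    """
    def residual(free):
        return free + binding_density(free) * lattice_total - protein_total

    low, high = 0.0, protein_total
    for _ in range(max_iter):
        mid = 0.5 * (low + high)
        if residual(mid) > 0.0:
            high = mid
        else:
            low = mid
        if high - low <= rel_tol * protein_total:
            break
    return 0.5 * (low + high)


if __name__ == "__main__":
    k_assoc = 1e6  # assumed association constant, per molar

    def single_site(free):
        # Simplest possible binding density, used only to exercise the solver.
        return k_assoc * free / (1.0 + k_assoc * free)

    free = solve_free_protein(single_site, lattice_total=1e-6, protein_total=2e-6)
    print(f"free protein = {free:.4e} M")
```

Bisection is chosen here for robustness rather than speed: it needs no derivative and cannot leave the physically meaningful interval between zero and the total protein concentration.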
Several quantitative frameworks have been proposed to treat an “infinitely” long linear lattice (
where
Referring to Supplementary Information for the detailed mathematical procedures of the derivation, we focus on a few intuitive limiting cases leading to the interpretations of this equation consistent with the molecular features of the linear lattice system (McGhee and von Hippel, 1974).
1) In the case of
Note that, for
2) In the case of
This reduced form simply corresponds to Eq. 11 with
3) Further insight can be provided at the molecular level from the partial derivatives of Eqs. 10b and 11 with respect to
Based on Eq. 10a, the partial derivative can be interpreted as a net change in the average numbers of all three types (Fig. 3A) of binding sites, weighted by their corresponding binding constants, upon binding of one protein to a naked (ν = 0) lattice. As illustrated in Fig. 2B, the binding of a protein to a sufficiently long region eliminates a total of 2
Therefore, in the noncooperative case, the binding of one ligand to a naked lattice simply eliminates 2
Taken together, although the conditional probability method is based on a different conceptual framework from the combinatorial approach, the final formulation provides intuitive interpretations fully consistent with the molecular features of the linear lattice systems. In practice, Eq. 10b is rearranged and incorporated into a mass balance equation relating the binding parameters to the total concentrations of lattice motif and protein ([
Eq. 15e can be numerically solved for
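A numerical solution of this kind can be sketched for the noncooperative case. The code assumes the widely used noncooperative isotherm of McGhee and von Hippel (1974), ν/[P] = K(1 - nν)[(1 - nν)/(1 - (n - 1)ν)]^(n - 1), with ν expressed per lattice motif, and a mass balance P_t = [P] + ν M_t in which M_t is the total motif concentration. The names are illustrative and the cooperative form is not covered.

```python
import math


def free_protein_mvh(density, site_size, k_assoc):
    """Free protein concentration at binding density nu (proteins per motif)."""
    vacant = 1.0 - site_size * density        # fraction of motifs left uncovered
    gaps = 1.0 - (site_size - 1) * density
    if vacant <= 0.0:
        return math.inf  # a fully covered lattice needs infinite free protein
    return density / (k_assoc * vacant * (vacant / gaps) ** (site_size - 1))


def solve_density_mvh(motif_total, protein_total, site_size, k_assoc,
                      max_iter=200):
    """Solve P_t = [P](nu) + nu * M_t for nu by bisection."""
    low = 0.0
    high = min(1.0 / site_size, protein_total / motif_total)
    for _ in range(max_iter):
        mid = 0.5 * (low + high)
        if mid == low or mid == high:
            break  # bracket has shrunk to adjacent floating-point numbers
        residual = (free_protein_mvh(mid, site_size, k_assoc)
                    + mid * motif_total - protein_total)
        if residual > 0.0:
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)


if __name__ == "__main__":
    nu = solve_density_mvh(motif_total=1e-5, protein_total=1e-6,
                           site_size=3, k_assoc=1e5)
    print(f"nu = {nu:.4f} proteins per motif")
    print(f"free protein = {free_protein_mvh(nu, 3, 1e5):.3e} M")
```

Because the isotherm gives [P] explicitly as a function of ν, the bisection is carried out on ν rather than on [P]. For n = 1 the isotherm reduces to the Scatchard form ν/[P] = K(1 - ν), and for n = 3 the initial slope of ν/[P] against ν evaluates to about -5K, that is -(2n - 1)K.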
Spatiotemporal regulation of transcription is achieved by interactions between TFs and their specific binding sites on DNA. Because of the enormous number of nonspecific sites on the chromosomal DNA, binding of TFs to these regions must be taken into account to accurately predict the occupancy of the specific sites and thereby the transcription profiles of the corresponding genes (Brewster et al., 2014; von Hippel et al., 1974). In order to recapitulate the essential features of the competition between specific and nonspecific DNA binding, the conditional probability model was extended and applied to a hypothetical two-component (TF and infinitely long DNA with a few embedded specific sites) system. While the 1:1 interaction between TF and a specific site is fully described by the binding constant
where [
At a given specificity ratio and a total motif concentration, as the concentration ratio [
Competitions between specific and nonspecific binding or among multiple nonspecific binding modes have been observed in numerous
IDPs often utilize short peptide motifs to recruit multiple distinct targets or multiple copies of an identical target (Cumberworth et al., 2013; Hong et al., 2020; Wright and Dyson, 2015). These IDPs are collectively termed hubs and are involved in signal transduction and macromolecular transport. A representative example is Nup153, a subunit of the NPC, which contains a long C-terminal IDR (~600 amino acids in length) (Krull et al., 2004). The IDR presents multiple FG-motifs to interact with Kaps carrying macromolecular cargos into and out of the nucleus. Multiple hydrophobic pockets on the Kap surface are the primary binding sites for the FG-motifs (Bayliss et al., 2000).
A recent thermodynamic study developed an advanced combinatorial model to demonstrate that the Nup153 IDR comprises a high-affinity 1:1 binding site and a series of low-affinity sites for binding of multiple Kaps (Fig. 4C) (Cho et al., 2021). Calorimetric data for various protein concentrations and IDR lengths were scrutinized to further show that the overlapping binding of Kaps to the low-affinity sites results in apparent negative cooperativity. Because the Nup153 IDR potentially interacts with nuclear proteins involved in transcription and chromatin organization (Kadota et al., 2020; Kasper et al., 1999), the study also constructed composite combinatorial models to test how the multivalent Kap binding would be affected by competitive binding of nuclear proteins (Fig. 4C). Remarkably, the simulations revealed that the Kap occupancy of the low-affinity region can be fine-tuned by changing the location of the competitor binding site (Fig. 4C). This delicate modulation arises from the molecular feature of the overlapping binding: the number of potential Kap binding sites eliminated by the competition is determined by the position of the competitor binding site (Fig. 2B). Therefore, assuming that the Kap occupancy is a proxy for the transport activity of the NPC, it is conceivable that the Nup153 IDR functions as a molecular switch coupling specific nuclear processes to distinct transport states. For instance, a strong promoter may be coupled to the NPC activity in such a way that specific TFs or co-activators associated with the strong promoter target a location in the Nup153 IDR that considerably reduces the Kap occupancy (Fig. 4D). As a consequence of the reduced general transport activity mediated by Kaps, a large amount of mRNA transcribed from the strong promoter may be efficiently exported through the NPC (Fig. 4D).
Although awaiting experimental validation, the coupling mechanism built upon multivalent, overlapping IDP-target interactions may contribute to the functional versatility of the IDP hubs in dynamic cellular processes. This exemplary study demonstrates that the original combinatorial model can be readily expanded by simple mathematical operations to account for additional complexities in linear lattice-protein interactions including heterogeneous binding sites.
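The position dependence of the competition can be illustrated with a toy calculation. This is not the composite model of Cho et al. (2021): it is a simplified sketch that assumes a homogeneous low-affinity region of `num_motifs` motifs, noncooperative Kap binding with a fixed dimensionless activity x = K[Kap], and a competitor that is permanently bound to `comp_size` motifs starting at `comp_start`. Under these assumptions the competitor splits the lattice into two independent segments, so the partition function is a product and the mean Kap occupancy is a sum over the segments. All names are hypothetical.

```python
from math import comb


def mean_bound(num_motifs, site_size, activity):
    """Mean number of proteins on a free segment (noncooperative binding)."""
    weights = [
        comb(num_motifs - k * (site_size - 1), k) * activity ** k
        for k in range(num_motifs // site_size + 1)
    ]
    return sum(k * w for k, w in enumerate(weights)) / sum(weights)


def occupancy_with_competitor(num_motifs, site_size, activity,
                              comp_start, comp_size):
    """Mean Kap occupancy when a competitor blocks a fixed stretch of motifs.

    The blocked stretch leaves a left segment of comp_start motifs and a
    right segment of num_motifs - comp_start - comp_size motifs, which
    bind Kaps independently of each other.
    """
    left = comp_start
    right = num_motifs - comp_start - comp_size
    if left < 0 or right < 0:
        raise ValueError("competitor does not fit on the lattice")
    return (mean_bound(left, site_size, activity)
            + mean_bound(right, site_size, activity))


if __name__ == "__main__":
    n_motifs, kap_size, x = 12, 3, 1.0
    print(f"no competitor: {mean_bound(n_motifs, kap_size, x):.2f}")
    for start in (0, 2, 5):
        occ = occupancy_with_competitor(n_motifs, kap_size, x, start, 1)
        print(f"competitor at motif {start}: {occ:.2f}")
```

With these illustrative parameters the mean occupancy is about 2.17 without a competitor and about 1.98, 1.58, and 1.75 when a one-motif competitor sits at motif 0, 2, or 5, respectively. A competitor placed two motifs from the end is the most disruptive here because it also strands a stretch too short to hold a Kap.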
Linear lattice systems and their multivalent interactions with target proteins often regulate dynamic cellular processes. Because target binding sites on a linear lattice overlap, quantitative understanding of such interactions requires a framework fundamentally different from that used for simple 1:1 binding or discrete-site systems. In this review, we discussed the two prevalent approaches to the linear lattice systems, namely the combinatorial and conditional probability models. Constructing the lattice partition functions with the combinatorial approach is straightforward and readily expandable for data analysis and predictions, as illustrated by the Nup153 IDR–Kap interaction. On the other hand, the conditional probability model provides invaluable physical insights consistent with the molecular features of the multivalent linear lattice–target interactions. Furthermore, this method is suitable in simulating
This work was supported by Samsung Science & Technology Foundation and Research (SSTF-BA1802-09) and the National Research Foundation (2019R1C1C1011640).
J.C. and J.K. analyzed the data. J.C., R.K., and J.K. wrote the manuscript.
The authors have no potential conflicts of interest to disclose.
Jaejun Choi^{1,2}, Ryeonghyeon Kim^{1,2}, and Junseock Koh^{1,*}
^{1}School of Biological Sciences, Seoul National University, Seoul 08826, Korea, ^{2}These authors contributed equally to this work.