Intrinsic Protein Disorder, Conditional Folding and AlphaFold2

Since AlphaFold2 “won” the CASP competition in 2020, intrinsically disordered regions (IDRs) have been a recurring point of discussion about its performance. When the method was published, it was noted that low per-residue predicted LDDT (pLDDT) scores correlated well with IDRs. In this short communication, the authors explore this concept further and experiment with other metrics, such as relative solvent accessibility, to benchmark IDR prediction by AlphaFold2. They also evaluate how pLDDT and relative solvent accessibility can be used to identify regions that fold conditionally upon binding. The conclusion is that AlphaFold2 performs well in the prediction of IDRs but is two orders of magnitude slower and, thus, not the best choice for fast searches.
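To make the pLDDT idea concrete, here is a minimal sketch of how one might flag candidate disordered stretches from a per-residue pLDDT profile. It is not the authors' pipeline: the function names, the 50.0 cutoff, the smoothing window and the minimum segment length are all illustrative assumptions that would need tuning against a real benchmark.

```python
from typing import List, Tuple


def smooth(values: List[float], window: int = 5) -> List[float]:
    """Centered moving average; the window shrinks at the chain ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def low_confidence_regions(
    plddt: List[float],
    threshold: float = 50.0,  # assumed cutoff, not taken from the paper
    window: int = 5,          # assumed smoothing window
    min_length: int = 3,      # assumed minimum segment length
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges with smoothed pLDDT below threshold."""
    smoothed = smooth(plddt, window)
    regions = []
    start = None
    for i, score in enumerate(smoothed):
        if score < threshold:
            if start is None:
                start = i  # open a new low-confidence segment
        elif start is not None:
            if i - start >= min_length:
                regions.append((start + 1, i))  # close the segment at residue i (1-based)
            start = None
    # Close a segment that runs to the end of the chain.
    if start is not None and len(smoothed) - start >= min_length:
        regions.append((start + 1, len(smoothed)))
    return regions


# Made-up profile: confident N-terminus, low-confidence middle, confident C-terminus.
toy_profile = [95.0] * 10 + [30.0] * 10 + [90.0] * 10
toy_regions = low_confidence_regions(toy_profile)
```

Smoothing trims the edges of the flagged segment, which is one reason a cutoff chosen on raw scores does not transfer directly to smoothed ones.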

Structure determination of high-energy states in a dynamic protein ensemble

Like missing those couple of seconds giving out clues about who killed Mr. Thrombey, transitory high-energy conformations can contain crucial information about the mechanisms of action of proteins. In this paper, the authors describe a way to access them using paramagnetic NMR experiments, through the use of pseudocontact shifts (PCS). These PCS provide long-distance structural information that can be used as restraints to determine conformations using simulated annealing, starting from known structures. The method was applied to adenylate kinase (Adk), a metalloprotein enzyme for which only the closed and open conformations are known, and the results confirmed several observations made by FRET or metadynamics simulations. The authors also show that the method can be generalized to non-metalloproteins by using a metal-binding tag.
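For readers wondering why PCS carry long-distance information, the standard textbook expression (not quoted from the paper) is shown below. Here r is the distance between the paramagnetic metal and the observed nucleus, θ and φ are the polar angles of the nucleus in the frame of the magnetic susceptibility anisotropy tensor, and Δχ_ax and Δχ_rh are that tensor's axial and rhombic components.

```latex
\delta^{\mathrm{PCS}} = \frac{1}{12\pi r^{3}}
\left[ \Delta\chi_{\mathrm{ax}} \left( 3\cos^{2}\theta - 1 \right)
+ \frac{3}{2}\, \Delta\chi_{\mathrm{rh}} \sin^{2}\theta \cos 2\varphi \right]
```

The shift falls off as 1/r³ and depends on orientation as well as distance, so each measured PCS constrains where a nucleus sits relative to the metal.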

Structure-informed microbial population genetics elucidate selective pressures that shape protein evolution

A novel use case for structure prediction: microbial population genetics. The authors look into evolutionary dynamics in a popular ocean surface microbe taxon through the lens of protein structure as predicted by AlphaFold2. For this, they calculate the rate of synonymous and non-synonymous mutations at each single codon variant site and compare this to the relative solvent accessibility and the distance from the (predicted) ligand site for the amino acid under consideration. Overall, synonymous variants were far more common than non-synonymous ones, but the rates of non-synonymous mutations displayed some pretty interesting trends. They were a lot more common in exposed sites than in buried sites, and they tended to avoid active sites and ligand-binding sites, the former likely to maintain structural stability and the latter to maintain function. For one protein involved in recycling nitrogen, the expected conclusion that non-synonymous variation avoids buried and active sites only emerged once its quaternary structure was correctly taken into account. That dependence on quaternary structure, combined with possible errors in the prediction of ligand-binding sites and the steric configurations of residues, means that care needs to be taken in such large-scale structural analyses, though the kinds of conclusions one can draw from them are definitely unique and complementary to the common nucleotide sequence-based analysis.
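The buried-versus-exposed comparison can be sketched in a few lines. This is a simplified illustration, not the authors' statistic: it pools variant counts and reports the share of non-synonymous variants in each burial class rather than a proper per-site rate. The `VariantSite` record, the 0.25 RSA cutoff and the toy counts are all made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class VariantSite:
    rsa: float          # relative solvent accessibility, 0 (buried) to 1 (exposed)
    nonsynonymous: int  # non-synonymous variants observed at this codon
    synonymous: int     # synonymous variants observed at this codon


def nonsyn_share_by_burial(
    sites: Iterable[VariantSite],
    rsa_cutoff: float = 0.25,  # assumed buried/exposed boundary
) -> Dict[str, float]:
    """Share of non-synonymous variants among all variants, per burial class."""
    totals = {"buried": [0, 0], "exposed": [0, 0]}  # [non-synonymous, all variants]
    for site in sites:
        key = "buried" if site.rsa < rsa_cutoff else "exposed"
        totals[key][0] += site.nonsynonymous
        totals[key][1] += site.nonsynonymous + site.synonymous
    return {
        key: (nonsyn / total if total else float("nan"))
        for key, (nonsyn, total) in totals.items()
    }


# Toy data with invented counts, only to show the call.
toy_sites = [
    VariantSite(rsa=0.05, nonsynonymous=1, synonymous=9),
    VariantSite(rsa=0.10, nonsynonymous=0, synonymous=10),
    VariantSite(rsa=0.60, nonsynonymous=4, synonymous=6),
    VariantSite(rsa=0.80, nonsynonymous=3, synonymous=7),
]
toy_shares = nonsyn_share_by_burial(toy_sites)
```

In a real analysis the RSA values would come from the predicted structure of the correct assembly, which is exactly where the quaternary-structure caveat bites: a residue buried at a subunit interface looks exposed in a monomer model.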

Figure of the Week

Fig. 3: Linkers L1 and L2 mediate the structural transition to the active state.

a, Overview of the 18–20 MM active conformation. b, c, Detailed view of HNH (b) and RuvC (c) active sites. d, Docking of the L1 linker helix against the PAM-distal gRNA–TS duplex, shown as an isosurface representation. e, Interactions of L1 and L2 regions with the minor groove of the gRNA–TS duplex. HNH extending from L1 and L2 linkers has been removed for clarity and does not interact with this region of the gRNA–TS duplex.

Source: Structural basis for mismatch surveillance by CRISPR–Cas9

Some more