The potential energy of a Gaussian Network Model is defined as the summation of harmonic potentials over all unique $(i,j)$-pairs and is a function of only the square of inter-residue distance vector, $\Delta \mathbf{R}_{ij} = \mathbf{R}_i - \mathbf{R}_j^0$. It can be given in terms of the Kirchhoff matrix for inter-residue contacts, $\mathbf{\Gamma}$,
$$ V^{GNM} = \frac{\gamma}{2} \Delta \mathbf{R}^T \left( \mathbf{\Gamma} \otimes \mathbf{I}_{3\times 3} \right) \Delta \mathbf{R} \,, $$where $\mathbf{\Gamma}$ is an $(N \times N)$-matrix and is defined as:
$$ \mathbf{\Gamma}_{ij} = \left\{\begin{matrix} -1 & \text{if } i \ne j \text{ and }R_{ij} \le R_{cut} \,, \\ 0 & \text{if } i \ne j \text{ and }R_{ij} > R_{cut} \,, \\ -\sum_{j,j \ne i}^{N} \mathbf{\Gamma}_{ij} & \text{if } i = j \,. \end{matrix}\right. $$First, let's import all the required packages!
from numpy import *
from matplotlib.pyplot import *
from prody import *
confProDy(auto_show=False)
confProDy(auto_secondary=True)
Tip: how to access the documentation
parsePDB?
Parse the PDB structure for adenylate kinase.
ake = parsePDB('4ake', compressed=False)
ake
figure(dpi=100)
showProtein(ake);
This PDB file contains 2 chains. If we only want to use $C^{\alpha}$ atoms from chain A, then we can select it as:
calphas = ake.select('calpha and chain A')
calphas
calphas.numAtoms()
figure(dpi=100)
showProtein(calphas);
See also the online tutorial.
First, let's instantiate an object of GNM class.
gnm = GNM('AKE') # assign a name
Now, we can build the Kirchhoff matrix by the method buildKirchhoff
.
What are the parameters of this method?
gnm.buildKirchhoff?
Let's build Kirchhoff matrix based on the coordinate of $C^{\alpha}$-atoms already selected.
gnm.buildKirchhoff(calphas)
We can get the Kirchhoff matrix by the following method:
gnm.getKirchhoff()
Check that every row sum is equal to 0 (since it is a Kirchhoff matrix!):
apply_along_axis(sum, 1, gnm.getKirchhoff())
Degrees of residues:
diag(gnm.getKirchhoff())
We can calculate normal modes from the Kirchhoff matrix by the method calcModes. Many parameters in this function can be adjusted, e.g. the number of modes: calcModes( ) doc
gnm.calcModes?
gnm.calcModes(n_modes=None) # or: (n_modes='all')
gnm.calcModes(n_modes=10)
gnm.calcModes() # n_modes=20 (default)
gnm.calcModes(n_modes=None, zeros=True)
The algebraic multiplicity of zero eigenvalues determines separate connected components of your graph.
If it is equal to one, then your graph does not have a disconnected piece!
Get eigenvalues and eigenvectors:
gnm.getEigvals().round(3)
gnm.getEigvecs()
NB: The eigenvector corresponding to 0 eigenvalue is always constant.
gnm.getEigvecs()[:, 0]
How about the variances:
gnm.getVariances()[1:].round(3)
NB: A more compact way to compute GNM is to use the function calcGNM( ) doc
gnm.calcModes()
The contact map can be visualized with the command:
figure(dpi=100)
showContactMap(gnm, cmap='Greys');
... while the shape of normal modes can be visualized by typing:
figure(dpi=100)
showMode(gnm[0]);
We can show possible hinge sites on that plot as well.
figure(dpi=100)
showMode(gnm[0], hinges=True);
Furthermore, you can also access hinge sites identified from multiple modes (e.g. 2 modes) by:
calcHinges(gnm[:2])
Many other visualization tools can be found online at the page Plotting Functions.
Let's get the covariance matrix first, $C_{ij} \propto \sum_{i=1}^{N-1} \lambda_i^{-1} \mathbf{u}_i \mathbf{u}_i^T$:
cov = gnm.getCovariance()
cov.shape
cov
Alternative ways to compute the covariance and (normalized/orientational) cross-correlation matrix, $C_{ij}^{(n)} = C_{ij} \big/ \left(C_{ii} C_{jj} \right)^{1/2}$:
C1 = calcCovariance(gnm)
# again covariance matrix for GNM, not for ANM!
C2 = calcCrossCorr(gnm, norm=False)
# normalized/orientational cross-correlation matrix
C3 = calcCrossCorr(gnm)
all(C1 == cov)
all(C2 == cov)
all(C3 == cov)
C3
We can visualize the normalized/orientational cross-correlations between residue fluctuations with the command:
figure(dpi=100)
showCrossCorr(gnm);
The residues' mean square fluctuations (MSF) are found on the diagonal of the covariance matrix and describe the residue mobility or flexibility.
# MSFs from the "computed" modes,
# and here the total number of calculated modes is 20.
figure(dpi=100)
showSqFlucts(gnm);
Moreover, you can specify the number of modes to compute the MSFs to see either their indivudual or cumulative contributions visually as well!
# square fluctuations computed from 2 slowest (lowest-energy) modes
figure(dpi=100)
showSqFlucts(gnm[:2])
grid();
We can also see how well the computed MSF correlates with the experimental B-factors (Debye-Waller) extracted from the PDB file. It is better to use all modes for this aim.
gnm.calcModes(None)
# rescale MSFs
bfactors = calphas.getBetas()
msfs = calcSqFlucts(gnm)
msfs = msfs / mean(msfs) * mean(bfactors)
figure(figsize=(9, 5), dpi=300)
plot(bfactors, 'orange', label='Experimental')
plot(msfs, 'g', lw=1., label='GNM')
grid()
legend()
tight_layout();