Deformation Analysis¶
This example shows how to calculate the deformation vector describing the
change between two structures of a protein. Two structures of the same
protein in PDB format will be used. A Vector
instance that
contains the deformation vector describing the change in protein structure
will be calculated. This object will be compared to ANM
modes.
Parse structures¶
We start by importing everything from the ProDy package:
In [1]: from prody import *
In [2]: from matplotlib.pylab import *
In [3]: ion()
Let’s parse two p38 MAP Kinase structures: 1p38 and 1zz2
In [4]: reference = parsePDB('1p38')
In [5]: mobile = parsePDB('1zz2') # this is the one we want to superimpose
Match chains¶
ProDy offers the function matchChains()
to find matching chains
in two structures easily. We use it to find the chains for which we will
calculate the deformation vector:
In [6]: matches = matchChains(reference, mobile)
matchChains()
function returns a list. If there are no matching chains,
list is empty, else the list contains a tuple for each pair of matching chains.
In [7]: len(matches)
Out[7]: 1
In [8]: match = matches[0]
There is only one match in this case. First item is a subset of atoms from the first structure (reference). Second item is a subset of atoms from the second structure (mobile).
In [9]: ref_chain = match[0]
In [10]: mob_chain = match[1]
Matched atoms are returned in AtomMap
instances.
We can get information on matched subset of atoms by entering the variable
name:
In [11]: ref_chain
Out[11]: <AtomMap: Chain A from 1p38 -> Chain A from 1zz2 from 1p38 (337 atoms)>
In [12]: mob_chain
Out[12]: <AtomMap: Chain A from 1zz2 -> Chain A from 1p38 from 1zz2 (337 atoms)>
Both AtomMap
instances refer to same number of atoms,
and their name suggests how they were retrieved.
In addition, we can find out the sequence identity that the matched atoms (residues) share (third item in the tuple):
In [13]: match[2]
Out[13]: 99.40652818991099
The fourth item in the tuple shows the coverage of the matching:
In [14]: match[3]
Out[14]: 96
This is the percentage of matched residues with respect to the longer chain. 1p38 chain A contains 351 resiudes, 96% of it is 337 residues, which is the number of atoms in the returned atom maps.
RMSD and superpose¶
We calculate the RMSD using calcRMSD()
function:
In [15]: calcRMSD(ref_chain, mob_chain).round(2)
Out[15]: 72.93
Let’s find the transformation that minimizes RMSD between these chains
using calcTransformation()
function:
In [16]: t = calcTransformation(mob_chain, ref_chain)
We apply this transformation to mobile structure (not to mob_chain, to preserve structures integrity).
In [17]: t.apply(mobile)
Out[17]: <AtomGroup: 1zz2 (2872 atoms)>
In [18]: calcRMSD(ref_chain, mob_chain).round(2)
Out[18]: 1.86
Deformation vector¶
Once matching chains are identified it is straightforward to calculate the
deformation vector using calcDeformVector()
In [19]: defvec = calcDeformVector(ref_chain, mob_chain)
In [20]: abs(defvec).round(3)
Out[20]: 34.196
To show how RMSD and deformation vector are related, we can calculate RMSD from the magnitude of the deformation vector:
In [21]: (abs(defvec)**2 / len(ref_chain)) ** 0.5
Out[21]: 1.8628014908695476
Array of numbers for this deformation can be obtained as follows
In [22]: arr = defvec.getArray() # arr is a NumPy array
In [23]: arr.round(2)
Out[23]: array([-1.11, -0.52, -1.89, ..., 0.85, -0.18, 0.54])
Following yields the normalized deformation vector
In [24]: defvecnormed = defvec.getNormed()
In [25]: abs(defvecnormed)
Out[25]: 1.0
Compare with ANM modes¶
Let’s get ANM model for the reference chain using
calcANM()
(a shorthand function for ANM calculations):
In [26]: anm = calcANM(ref_chain)[0]
Calculate overlap between slowest ANM mode and the deformation vector
In [27]: (anm[0] * defvecnormed).round(2) # used normalized deformation vector
Out[27]: -0.42
We can do this for a set of ANM modes (slowest 6) as follows
In [28]: (array(list(anm[:6])) * defvecnormed).astype(float64).round(2)
Out[28]: array([-0.42, -0.14, 0.49, 0.03, -0.17, -0.1 ])