Deformation Analysis

This example shows how to calculate the deformation vector describing the change between two structures of a protein. Two structures of the same protein in PDB format will be used. A Vector instance that contains the deformation vector describing the change in protein structure will be calculated. This object will be compared to ANM modes.

Parse structures

We start by importing everything from the ProDy package:

In [1]: from prody import *

In [2]: from matplotlib.pylab import *

In [3]: ion()

Let’s parse two p38 MAP Kinase structures: 1p38 and 1zz2

In [4]: reference = parsePDB('1p38')

In [5]: mobile = parsePDB('1zz2')   # this is the one we want to superimpose

Match chains

ProDy offers the function matchChains() to find matching chains in two structures easily. We use it to find the chains for which we will calculate the deformation vector:

In [6]: matches = matchChains(reference, mobile)

matchChains() function returns a list. If there are no matching chains, list is empty, else the list contains a tuple for each pair of matching chains.

In [7]: len(matches)
Out[7]: 1

In [8]: match = matches[0]

There is only one match in this case. First item is a subset of atoms from the first structure (reference). Second item is a subset of atoms from the second structure (mobile).

In [9]: ref_chain = match[0]

In [10]: mob_chain = match[1]

Matched atoms are returned in AtomMap instances. We can get information on matched subset of atoms by entering the variable name:

In [11]: ref_chain
Out[11]: <AtomMap: Chain A from 1p38 -> Chain A from 1zz2 from 1p38 (337 atoms)>

In [12]: mob_chain
Out[12]: <AtomMap: Chain A from 1zz2 -> Chain A from 1p38 from 1zz2 (337 atoms)>

Both AtomMap instances refer to same number of atoms, and their name suggests how they were retrieved.

In addition, we can find out the sequence identity that the matched atoms (residues) share (third item in the tuple):

In [13]: match[2]
Out[13]: 99.40652818991099

The fourth item in the tuple shows the coverage of the matching:

In [14]: match[3]
Out[14]: 96

This is the percentage of matched residues with respect to the longer chain. 1p38 chain A contains 351 resiudes, 96% of it is 337 residues, which is the number of atoms in the returned atom maps.

RMSD and superpose

We calculate the RMSD using calcRMSD() function:

In [15]: calcRMSD(ref_chain, mob_chain).round(2)
Out[15]: 72.930000000000007

Let’s find the transformation that minimizes RMSD between these chains using calcTransformation() function:

In [16]: t = calcTransformation(mob_chain, ref_chain)

We apply this transformation to mobile structure (not to mob_chain, to preserve structures integrity).

In [17]: t.apply(mobile)
Out[17]: <AtomGroup: 1zz2 (2872 atoms)>

In [18]: calcRMSD(ref_chain, mob_chain).round(2)
Out[18]: 1.8600000000000001

Deformation vector

Once matching chains are identified it is straightforward to calculate the deformation vector using calcDeformVector()

In [19]: defvec = calcDeformVector(ref_chain, mob_chain)

In [20]: abs(defvec).round(3)
Out[20]: 34.195999999999998

To show how RMSD and deformation vector are related, we can calculate RMSD from the magnitude of the deformation vector:

In [21]: (abs(defvec)**2 / len(ref_chain)) ** 0.5
Out[21]: 1.86280149086955

Array of numbers for this deformation can be obtained as follows

In [22]: arr = defvec.getArray() # arr is a NumPy array

In [23]: arr.round(2)
Out[23]: array([-1.11, -0.52, -1.89, ...,  0.85, -0.18,  0.54])

Following yields the normalized deformation vector

In [24]: defvecnormed = defvec.getNormed()

In [25]: abs(defvecnormed)
Out[25]: 1.0

Compare with ANM modes

Let’s get ANM model for the reference chain using calcANM() (a shorthand function for ANM calculations):

In [26]: anm = calcANM(ref_chain)[0]

Calculate overlap between slowest ANM mode and the deformation vector

In [27]: (anm[0] * defvecnormed).round(2) # used normalized deformation vector
Out[27]: -0.41999999999999998

We can do this for a set of ANM modes (slowest 6) as follows

In [28]: (array(list(anm[:6])) * defvecnormed).astype(float64).round(2)
Out[28]: array([-0.42, -0.14,  0.49,  0.03, -0.17, -0.1 ])