Intermolecular Contacts

This examples shows how to identify intermolecular contacts, e.g. protein atoms interacting with a bound inhibitor. A structure of a protein-ligand complex in PDB format will be used. Output will be Selection instances that points to atoms matching the contact criteria given by the user. Selection instances can be used as input to other functions for further analysis.

Simple contact selections

We start by importing everything from the ProDy package:

In [1]: from prody import *

In [2]: from pylab import *

In [3]: ion()

ProDy selection engine has a powerful feature that enables identifying intermolecular contacts very easily. We will see this by identifying protein atoms interacting with an inhibitor.

We start with parsing a PDB file that contains a protein and a bound ligand.

In [4]: pdb = parsePDB('1zz2')

1zz2 contains an inhibitor bound p38 MAP kinase structure. Residue name of inhibitor is B11. Protein atoms interacting with the inhibitor can simply be identified as follows:

In [5]: contacts = pdb.select('protein and within 4 of resname B11')

In [6]: repr(contacts)
Out[6]: "<Selection: 'protein and wit... of resname B11' from 1zz2 (50 atoms)>"

'protein and within 4 of resname B11' is interpreted as select protein atoms that are within 4 A of residue whose name is B11. This selects protein atoms that within 4 A of the inhibitor.

Contacts between different atom groups

In some cases, the protein and the ligand may be in separate files. We will imitate this case by making copies of protein and ligand.

In [7]: inhibitor = pdb.select('resname B11').copy()

In [8]: repr(inhibitor)
Out[8]: "<AtomGroup: 1zz2 Selection 'resname B11' (33 atoms)>"

In [9]: protein = pdb.select('protein').copy()

In [10]: repr(protein)
Out[10]: "<AtomGroup: 1zz2 Selection 'protein' (2716 atoms)>"

We see that inhibitor molecule contains 33 atoms.

Now we have two different atom groups, and we want protein atoms that are within 4 Å of the inhibitor.

In [11]: contacts = protein.select('within 4 of inhibitor', inhibitor=inhibitor)

In [12]: repr(contacts)
Out[12]: "<Selection: 'index 227 230 2... 1354 1356 1358' from 1zz2 Selection 'protein' (50 atoms)>"

We found that 50 protein atoms are contacting with the inhibitor. In this case, we passed the atom group inhibitor as a keyword argument to the selection function. Note that the keyword must match that is used in the selection string.

Composite contact selections

Now, let’s try something more sophisticated. We select Cα atoms of residues that have at least one atom interacting with the inhibitor:

In [13]: contacts_ca = protein.select(
   ....:    'calpha and (same residue as within 4 of inhibitor)',
   ....:    inhibitor=inhibitor)
   ....: 

In [14]: repr(contacts_ca)
Out[14]: "<Selection: 'index 225 232 2... 1328 1351 1359' from 1zz2 Selection 'protein' (20 atoms)>"

In this case, 'calpha and (same residue as within 4 of inhibitor)' is interpreted as select Cα atoms of residues that have at least one atom within 4 A of any inhibitor atom.

This shows that, 20 residues have atoms interacting with the inhibitor.

Spherical atom selections

Similarly, one can give arbitrary coordinate arrays as keyword arguments to identify atoms in a spherical region. Let’s find backbone atoms within 5 Å of point (25, 73, 13):

In [15]: sel = protein.select('backbone and within 5 of somepoint',
   ....:                      somepoint=np.array((25, 73, 13)))
   ....: 

Fast contact selections

For repeated and faster contact identification Contacts class is recommended.

We pass the protein as argument:

In [16]: protein_contacts = Contacts(protein)

The following corresponds to "within 4 of inhibitor":

In [17]: contants = protein_contacts.select(4, inhibitor)

In [18]: repr(contacts)
Out[18]: "<Selection: 'index 227 230 2... 1354 1356 1358' from 1zz2 Selection 'protein' (50 atoms)>"

This method is 20 times faster than the one in the previous part, but it is limited to selecting only contacting atoms (other selection arguments cannot be passed). Again, it should be noted that Contacts does not update the KDTree that it uses, so it should be used if protein coordinates does not change between selections.