"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Python](https://www.python.org/) is an interpreted language originally designed for general-purpose programming, but it has also become a full-fledged tool for scientific computing thanks to its highly modular and extensible design and a growing community of scientific users and developers. The first part of this tutorial introduces, or reminds you of, some basic Python syntax."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print('hello world!')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The real power of Python lies in its packages. Many come built in with the standard library, and many more open-source packages, such as NumPy used below, are developed by the Python community. There are several ways to import a package (module). For starters, you can import the module by its name, optionally giving it a shorter alias:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"np.pi"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another common way to import is to use a wildcard (namely `*`), which brings all public names of the module into the current namespace:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from numpy import *\n",
"pi"
]
},
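{
"cell_type": "markdown",
"metadata": {},
"source": [
"A third option, shown here as a minimal sketch that is not part of the original tutorial, is to import only the names you need. Unlike a wildcard import, this keeps the namespace small and makes it obvious where each name comes from. The standard-library `math` module and the aliases `square_root` and `math_e` are used purely for illustration:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Import two specific names from the standard-library math module.\n",
"# The aliases are illustrative and chosen so that no existing name is overwritten.\n",
"from math import sqrt as square_root, e as math_e\n",
"\n",
"print(square_root(16.0))\n",
"print(math_e)"
]
},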
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data types"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Just like other languages, Python has several basic data types: `int`, `float`, `bool`, and `str`. To define a variable of a certain type, simply assign it an initial value:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"i = 1\n",
"f = 3.1415\n",
"b = True or False\n",
"s = 'hello world!'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To create a list:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"a = [0, 1, 2, 3, 4]\n",
"a"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that list indices in Python start from 0. A list can also be generated with the `range` function; in its simplest form, the only input is the **length** of the list. So to generate a list like the one above, we can write:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"a = range(5)\n",
"a"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that in Python 3, this creates a lazy sequence (a [`range`](https://docs.python.org/3/library/stdtypes.html#typesseq-range) object, to be precise) rather than a list. The following code converts it to a `list` instance:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"a = list(range(5))\n",
"a"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Alternatively*, an iterable object can be converted to a list with a [list comprehension](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions), which also lets you transform or filter the items along the way:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"a = [_ for _ in range(5)]\n",
"a"
]
},
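{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small illustrative sketch (the variable names `squares` and `evens` are made up for this example), a list comprehension can also transform or filter the items while building the list:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Square every number from 0 to 4.\n",
"squares = [n * n for n in range(5)]\n",
"# Keep only the even numbers from 0 to 9.\n",
"evens = [n for n in range(10) if n % 2 == 0]\n",
"squares, evens"
]
},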
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Control flow"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The list comprehension mentioned above is essentially a one-line `for` loop. The full version of the `for` loop looks like the following:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for i in range(5):\n",
"    print(i)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This loop iterates through the numbers from 0 to 4. An `if` block is used for making choices. Let's first initialize two integers:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"a = 3; b = -3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then write out a control block as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if a > b:\n",
"    print('a is bigger than b')\n",
"elif a < b:\n",
"    print('a is smaller than b')\n",
"else:\n",
"    print('a equals b')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### NumPy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[NumPy](http://www.numpy.org/) is a Python package for numerical computing and linear algebra. It is also a required dependency of ProDy."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from numpy import *"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The most fundamental data type in NumPy is `ndarray` (n-dimensional array). It can be used to represent vectors or matrices. There are various ways to initialize a vector in NumPy:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"v = arange(10, dtype=float)\n",
"v"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"v = zeros(5)\n",
"v"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"v.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or a matrix:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"V = zeros([5, 5])\n",
"V"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"V.shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"V = ones([5, 5])\n",
"V"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"I = eye(5)\n",
"I"
]
},
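{
"cell_type": "markdown",
"metadata": {},
"source": [
"Arrays support element-wise arithmetic, slicing, and matrix products. The following is a minimal sketch that is not part of the original tutorial; it uses the explicit `np` prefix and made-up names (`vec`, `mat`) so that it does not overwrite the variables defined above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"vec = np.arange(3, dtype=float)  # a vector with the values 0, 1, 2\n",
"mat = 2.0 * np.eye(3)            # 3x3 diagonal matrix with 2 on the diagonal\n",
"\n",
"print(vec + 1.0)   # element-wise addition\n",
"print(vec[1:])     # slicing works as it does for Python lists\n",
"print(mat @ vec)   # matrix-vector product\n",
"print(mat.shape)"
]
},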
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Matplotlib"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Matplotlib](https://matplotlib.org/) is a package used by ProDy to visualize data. It is compatible with built-in Python objects as well as NumPy objects and provides [MATLAB](https://www.mathworks.com/products/matlab.html)-like syntax for plotting data points, matrices, 3D coordinates, etc."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from matplotlib.pyplot import *"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"x = arange(0, 2*pi, 0.1)\n",
"y = sin(x)\n",
"y"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plot(x, y)\n",
"xlabel('x')\n",
"ylabel('y')\n",
"title('sin')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"y2 = cos(x)\n",
"plot(x, y, x, y2)\n",
"xlabel('x')\n",
"ylabel('y')\n",
"legend(['sin', 'cos'])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plot(y, y2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ProDy Basics"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This tutorial aims to teach the basic data structures and functions in ProDy. First we need to import the required packages:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from prody import *\n",
"from numpy import *\n",
"from matplotlib.pyplot import *\n",
"%matplotlib inline\n",
"confProDy(auto_show=False)\n",
"confProDy(auto_secondary=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These import commands load NumPy, Matplotlib, and ProDy into memory. `confProDy` modifies the default behavior of ProDy. Here we turn off `auto_show` so that several plots can be drawn in the same figure, and we turn on `auto_secondary` so that secondary structure information is parsed whenever we load a PDB file into ProDy. See [here](http://prody.csb.pitt.edu/manual/reference/prody.html?highlight=confprody#prody.confProDy) for a complete list of settings that can be changed with this function. It only needs to be called once, and ProDy will remember the settings."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Loading and visualizing PDB files"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"ProDy comes with many functions that can be used to fetch data from the [Protein Data Bank](https://www.rcsb.org/). Let's first parse a structure:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38 = parsePDB('1p38')\n",
"p38"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`parsePDB` downloads the PDB file and loads it into memory. Let's inspect the variable `p38`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To visualize the structure:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"showProtein(p38);\n",
"legend();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you would like to display the 3D structure using other packages or your own code, you can get the 3D coordinates via:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38.getCoords()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38.getCoords().shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"showContactMap(p38.ca);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An `AtomGroup`, such as `p38`, is essentially a collection of atoms. Each atom can be accessed by its index as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38[10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This will give you the **11th** atom of `p38`, because Python indices start from **0**. We can also examine the spatial location of this atom by querying its coordinates:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38[10].getCoords()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"showProtein(p38);\n",
"ax3d = gca()\n",
"x, y, z = p38[10].getCoords()\n",
"ax3d.plot([x], [y], [z], 'r*', markersize=30);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We could select a chain, e.g. chain A, of the protein by indexing using its identifier, like the following:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38['A']"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38['A'].getSequence()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In many cases, it is more convenient to examine the structure with **[residue numbers](https://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/primary-sequences-and-the-pdb-format)**, and `AtomGroup` supports indexing with a chain ID and a residue number:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38[10].getResnum()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38['A', 5]"
]
},
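{
"cell_type": "markdown",
"metadata": {},
"source": [
"This will give you the residue with residue number 5, which is an arginine in `p38`. Note the difference between this line and the earlier one: `p38[10]` returns a single atom by its index, whereas `p38['A', 5]` returns a whole residue by its chain ID and residue number."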
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38['A', 5].getNames()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that some ProDy objects may not support indexing using a chain identifier or a residue number. In such cases, we can first obtain a hierarchical view of the object:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hv = p38.getHierView()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then index the `HierView` with a chain identifier and a residue number, which it always supports:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hv['A', 5]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Retrieve data from AtomGroup"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Many properties of the protein can be retrieved with methods whose names start with `get`. For instance, we can obtain the B-factors by:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"betas = p38.getBetas()\n",
"betas.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this way, we obtain the B-factor of every single atom. However, in some cases we only need the B-factors of the alpha-carbons, which can be selected with the `ca` shortcut:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38.ca"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"betas = p38.ca.getBetas()\n",
"betas.shape"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plot(betas);\n",
"ylabel('B-factor');\n",
"xlabel('Residue index');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we would like to use residue numbers in the PDB, instead of the indices, as the x-axis of the plot, it would be much more convenient to use the ProDy plotting function, `showAtomicLines`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"showAtomicLines(betas, atoms=p38.ca);\n",
"ylabel('B-factor');\n",
"xlabel('Residue number');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also obtain the secondary structure information as an array:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38.ca.getSecstrs()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make it easier to read, we can convert the array into a string using Python's built-in string method, `join`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"''.join(p38.ca.getSecstrs())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`C` is for coil, `H` for alpha helix, `I` for pi helix, `G` for 3-10 helix, and `E` for beta strand (sheet). To get a complete list of the \"get\" methods, you can type `p38.get` and press the Tab key to trigger auto-completion."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Speaking of which, the [measure](http://prody.csb.pitt.edu/manual/reference/measure/index.html?highlight=measure#module-prody.measure) module provides various functions for calculating structural properties. For example, you can calculate the phi angle of the residue with residue number 10 in chain A:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"calcPhi(p38['A', 10])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"round(calcPhi(p38['A', 10]), 3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A [dihedral angle](https://en.wikipedia.org/wiki/Dihedral_angle) is the angle between two intersecting planes. In chemistry, it is the angle between the planes through two sets of three atoms that have two atoms in common. In proteins, the two dihedral angles of greatest interest are Phi (defined by the backbone atoms C of the previous residue, N, CA, and C) and Psi (defined by N, CA, C, and N of the next residue); they are illustrated as follows.\n",
"\n",
""
]
},
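{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the definition concrete, here is a minimal NumPy sketch that computes the dihedral angle defined by four points. It is an illustration rather than ProDy's own implementation: the function name `dihedral_sketch` and the toy points are hypothetical. The angle is returned in degrees between -180 and 180; a cis arrangement corresponds to 0 degrees and a trans arrangement to 180 degrees in magnitude."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def dihedral_sketch(p0, p1, p2, p3):\n",
"    # Dihedral angle (in degrees) defined by four points in 3D.\n",
"    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))\n",
"    b0 = p0 - p1\n",
"    b1 = p2 - p1  # central bond, shared by both planes\n",
"    b2 = p3 - p2\n",
"    b1 = b1 / np.linalg.norm(b1)\n",
"    # Project the outer bonds onto the plane perpendicular to the central bond.\n",
"    v = b0 - np.dot(b0, b1) * b1\n",
"    w = b2 - np.dot(b2, b1) * b1\n",
"    # Signed angle between the two projections.\n",
"    return np.degrees(np.arctan2(np.dot(np.cross(b1, v), w), np.dot(v, w)))\n",
"\n",
"# Toy points: the outer atoms lie on the same side (cis) or on opposite sides (trans).\n",
"print(dihedral_sketch([1, 0, 0], [0, 0, 0], [0, 0, 1], [1, 0, 1]))   # cis arrangement\n",
"print(dihedral_sketch([1, 0, 0], [0, 0, 0], [0, 0, 1], [-1, 0, 1]))  # trans arrangement"
]
},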
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the residues at the N-terminus and the C-terminus do not have a Phi and a Psi angle, respectively. If we calculate the Phi and Psi angles for every non-terminal residue, we obtain a [Ramachandran plot](https://en.wikipedia.org/wiki/Ramachandran_plot) for the protein. An example Ramachandran plot for human [PCNA](https://en.wikipedia.org/wiki/Proliferating_cell_nuclear_antigen) is shown as follows:\n",
"\n",
"\n",
"Three favored regions are shown in red: **upper left: beta sheet; center left: alpha helix; center right: left-handed helix**. Each blue data point corresponds to the two dihedral angles of a residue. We will reproduce this plot for p38 (we will only reproduce the points)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"chain = p38['A']\n",
"Phi = []; Psi = []; c = []\n",
"for res in chain.iterResidues():\n",
"    try:\n",
"        phi = calcPhi(res)\n",
"        psi = calcPsi(res)\n",
"    except Exception:\n",
"        # Skip residues for which an angle cannot be computed (e.g. the termini).\n",
"        continue\n",
"    else:\n",
"        Phi.append(phi)\n",
"        Psi.append(psi)\n",
"        # Color glycines black and the other residues by secondary structure.\n",
"        if res.getResname() == 'GLY':\n",
"            c.append('black')\n",
"        else:\n",
"            secstr = res.getSecstrs()[0]\n",
"            if secstr == 'H':\n",
"                c.append('red')\n",
"            elif secstr == 'G':\n",
"                c.append('darkred')\n",
"            elif secstr == 'E':\n",
"                c.append('blue')\n",
"            else:\n",
"                c.append('grey')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the above code, we use an exception handler to exclude the terminal residues from the calculation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"scatter(Phi, Psi, c=c, s=10);\n",
"xlabel('Phi (degree)');\n",
"ylabel('Psi (degree)');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Selection"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In theory you could retrieve any set of atoms by indexing the `AtomGroup`, but it would be cumbersome to do so. To make it more convenient, ProDy provides a VMD-like syntax for selecting atoms. Below are a few common selection strings; for a more complete tutorial on selection, please see [here](http://prody.csb.pitt.edu/tutorials/prody_tutorial/selection.html)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ca = p38.select('calpha')\n",
"ca"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"p38.ca"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"bb = p38.select('backbone')\n",
"p38.bb"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also perform some simple selections right when the structure is being parsed. For example, we can specify that we would like to obtain only the alpha-carbons of chain A of p38 as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"chainA_ca = parsePDB('1p38', chain='A', subset='ca')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also pick out the alpha-carbons of chain A with a selection string (as an alternative to the indexing method shown above):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"chA = p38.select('calpha and chain A')\n",
"chA"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Selection also works for finding a single residue or multiple residues:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"res = p38.ca.select('chain A and resnum 10')\n",
"res.getResnums()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"res = p38.ca.select('chain A and resnum 10 11 12')\n",
"res.getResnums()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"head = p38.ca.select('resnum < 50')\n",
"head.numAtoms()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also select a range of residues by:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fragment = p38.ca.select('resnum 50 to 100')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we have data associated with the full length of the protein, we can slice the data using `sliceAtomicData`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"subbetas = sliceAtomicData(betas, atoms=p38.ca, select=fragment)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can visualize the data of this range using `showAtomicLines`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"showAtomicLines(subbetas, atoms=fragment);\n",
"xlabel('Residue number');\n",
"ylabel('B factor');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or highlight the subset in the plot of the whole protein:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"showAtomicLines(betas, atoms=p38.ca, overlay=True);\n",
"showAtomicLines(subbetas, atoms=fragment, overlay=True);\n",
"xlabel('Residue number');\n",
"ylabel('B factor');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Selection also allows us to extract particular amino acid types:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"args = p38.ca.select('resname ARG')\n",
"args"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Again, combined with `sliceAtomicData` and `showAtomicLines`, we can highlight these residues in the plot of the whole protein:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"argbetas = sliceAtomicData(betas, atoms=p38.ca, select=args)\n",
"showAtomicLines(betas, atoms=p38.ca, overlay=True);\n",
"showAtomicLines(argbetas, atoms=args, linespec='r*', overlay=True);\n",
"xlabel('Residue number');\n",
"ylabel('B factor');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Compare and align structures"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also compare different structures using some of the functions in the `proteins` module. Let's parse another p38 MAP kinase structure."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"bound = parsePDB('1zz2')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can find similar chains in structures 1p38 and 1zz2 using the `matchChains` function:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"results = matchChains(p38, bound)\n",
"results[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In Python, a tuple (or any other iterable object) can be unpacked as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"apo_chA, bnd_chA, seqid, overlap = results[0]\n",
"apo_chA"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"bnd_chA"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first two items are the matched portions of the two chains, mapped onto each other. The third item is the sequence identity:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"seqid"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"and the fourth item is the sequence coverage:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"overlap"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If we calculate RMSD right now, we will obtain the value for the unsuperposed proteins:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"calcRMSD(bnd_chA, apo_chA)"
]
},
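{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the RMSD between two sets of N matched atoms is the square root of the mean squared distance between corresponding atoms. The following NumPy sketch illustrates the formula on toy coordinates; it is not ProDy's implementation, and `rmsd_sketch`, `toy_a`, and `toy_b` are hypothetical names introduced only for this example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def rmsd_sketch(coords_a, coords_b):\n",
"    # Root-mean-square deviation between two (N, 3) coordinate arrays.\n",
"    diff = np.asarray(coords_a, dtype=float) - np.asarray(coords_b, dtype=float)\n",
"    # Mean of the squared per-atom distances, then the square root.\n",
"    return np.sqrt((diff ** 2).sum(axis=1).mean())\n",
"\n",
"# Two toy structures of three atoms each; the second is shifted by 1 along x,\n",
"# so every atom is displaced by exactly 1 and the RMSD is 1.\n",
"toy_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])\n",
"toy_b = toy_a + np.array([1.0, 0.0, 0.0])\n",
"rmsd_sketch(toy_a, toy_b)"
]
},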
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After superposition, the RMSD will be much lower:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"bnd_chA, transformation = superpose(bnd_chA, apo_chA)\n",
"calcRMSD(bnd_chA, apo_chA)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"confProDy(auto_show=False)\n",
"showProtein(bnd_chA);\n",
"showProtein(apo_chA);\n",
"legend();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To visualize the superposition of the full proteins, we need to apply the transformation matrix to the entire structure:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"showProtein(p38);\n",
"showProtein(bound);\n",
"legend();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Advanced Visualization"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using `matplotlib`, we only obtained a very simple linear representation of proteins. ProDy also supports a more sophisticated way of visualizing proteins in 3D via [py3Dmol](http://3dmol.csb.pitt.edu/):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import py3Dmol\n",
"showProtein(p38)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The limitation is that `py3Dmol` only works in an IPython (Jupyter) notebook. Alternatively, you can always write the protein out to a PDB file and visualize it in an external program:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"writePDB('bound_aligned.pdb', bnd_chA)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 1
}