# Analysis¶

## Synopsis¶

This example is continued from Calculations. The aim of this part is to perform a quantitative comparison of experimental and theoretical data and to print/save the numerical data that were presented in [AB09].

We start by importing everything from the ProDy package:

```In [1]: from prody import *

In [2]: from pylab import *

In [3]: ion()
```

Then we load data saved in the previous part:

```In [4]: pca = loadModel('p38_xray.pca.npz')

```

## Variance along PCs¶

Of interest is the fraction of variance that is explained by principal components, which are the dominant modes of variability in the dataset. We can print this information to screen for top 6 PCs as follows:

```In [6]: for mode in pca[:3]:    # Print % variance explained by top PCs
...:     var = calcFractVariance(mode)*100
...:     print('{0:s}  % variance = {1:.2f}'.format(mode, var))
...:
Mode 1 from PCA p38 xray  % variance = 29.19
Mode 2 from PCA p38 xray  % variance = 16.51
Mode 3 from PCA p38 xray  % variance = 10.63
```

These data were included in Table 1 in [AB09].

## Collectivity of modes¶

Collectivity of a normal mode ([BR95]) can be obtained using `calcCollectivity()`:

```In [7]: for mode in pca[:3]:    # Print PCA mode collectivity
...:     coll = calcCollectivity(mode)
...:     print('{0:s}  collectivity = {1:.2f}'.format(mode, coll))
...:
Mode 1 from PCA p38 xray  collectivity = 0.50
Mode 2 from PCA p38 xray  collectivity = 0.50
Mode 3 from PCA p38 xray  collectivity = 0.32
```

Similarly, we can get collectivity of ANM modes:

```In [8]: for mode in anm[:3]:    # Print ANM mode collectivity
...:     coll = calcCollectivity(mode)
...:     print('{0:s}  collectivity = {1:.2f}'.format(mode, coll))
...:
Mode 1 from ANM 1p38  collectivity = 0.65
Mode 2 from ANM 1p38  collectivity = 0.55
Mode 3 from ANM 1p38  collectivity = 0.68
```

This shows that top PCA modes and slow ANM modes are highly collective.

## PCA - ANM overlap¶

We calculate the overlap between PCA and ANM modes in order to see whether structural changes observed upon inhibitor binding correlate with the intrinsic fluctuations of the p38 MAP kinase (Table 1 in [AB09]).

There are a number of functions to calculate or show overlaps between modes (see list of them in Dynamics Analysis). In an interactive session, most useful is `printOverlapTable()`:

```In [9]: printOverlapTable(pca[:3], anm[:3]) # Top 3 PCs vs slowest 3 ANM modes
Overlap Table
ANM 1p38
#1     #2     #3
PCA p38 xray #1   -0.39  +0.04  -0.71
PCA p38 xray #2   -0.78  -0.20  +0.22
PCA p38 xray #3   +0.05  -0.57  +0.06
```

This formatted table can also be written into a file using `writeOverlapTable()` function.

## Save numeric data¶

`ANM` and `PCA` instances store calculated numeric data. Their class documentation lists methods that return eigenvalue, eigenvector, covariance matrix etc. data to the user. Such data can easily be written into text files for analysis using external software. The function is to use is `writeArray()`:

```In [10]: writeArray('p38_PCA_eigvecs.txt', pca.getEigvecs() ) # PCA eigenvectors
Out[10]: 'p38_PCA_eigvecs.txt'

In [11]: writeModes('p38_ANM_modes.txt', anm) # ANM modes, same as using above func
Out[11]: 'p38_ANM_modes.txt'
```

It is also possible to write arbitrary arrays:

```In [12]: overlap = calcOverlap(pca[:3], anm[:3])

In [13]: writeArray('p38_PCA_ANM_overlap.txt', abs(overlap), format='%.2f')
Out[13]: 'p38_PCA_ANM_overlap.txt'
```