Comparison Functions

This module defines functions for comparing normal modes from different models.

calcOverlap(rows, cols, diag=False)

Returns the overlap (or correlation) between two sets of modes (rows and cols). The result is a matrix whose rows correspond to the modes passed as the rows argument and whose columns correspond to those passed as the cols argument. Both sets of modes are normalized before the overlap is calculated.

If diag is True, only the diagonal of the overlap matrix is returned.
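The normalization and dot-product steps described above can be sketched with plain NumPy. This is a minimal illustration rather than ProDy's implementation: overlap_matrix is a hypothetical helper, and modes are assumed to be stored as the rows of 2-D arrays.

```python
import numpy as np

def overlap_matrix(rows, cols, diag=False):
    """Overlap between two sets of mode vectors stored row-wise.

    Hypothetical helper for illustration; not part of ProDy.
    """
    rows = np.atleast_2d(np.asarray(rows, dtype=float))
    cols = np.atleast_2d(np.asarray(cols, dtype=float))
    # Normalize every mode to unit length before taking dot products.
    rows = rows / np.linalg.norm(rows, axis=1, keepdims=True)
    cols = cols / np.linalg.norm(cols, axis=1, keepdims=True)
    overlap = rows @ cols.T  # shape: (number of rows modes, number of cols modes)
    return np.diag(overlap) if diag else overlap

a = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
b = np.array([[3.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
print(overlap_matrix(a, b))             # full 2 x 2 matrix
print(overlap_matrix(a, b, diag=True))  # diagonal only
```

Because both sets are normalized first, the vector lengths in a and b do not affect the result; only their directions do.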

calcCumulOverlap(modes1, modes2, array=False)

Returns the cumulative overlap of modes in modes2 with those in modes1. Returns a number if modes1 contains a single Mode or a Vector instance. If modes1 contains multiple modes, returns an array whose elements are the cumulative overlaps of each mode in modes1 with the modes in modes2. If array is True, returns an array of cumulative overlaps with the shape (len(modes1), len(modes2)). Each row corresponds to a mode in modes1, and each value in a row is the cumulative overlap calculated using up to that many modes from modes2.
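The two return shapes can be illustrated with a short NumPy sketch. It assumes the usual definition of cumulative overlap, the square root of the summed squared overlaps; cumulative_overlap is a hypothetical helper that takes a precomputed overlap matrix, not a ProDy function.

```python
import numpy as np

def cumulative_overlap(overlap, array=False):
    """Cumulative overlap from an overlap matrix (rows: modes1, columns: modes2).

    Hypothetical helper; assumes CO(k) = sqrt(sum of the squared overlaps
    with the first k modes of modes2).
    """
    squared = np.atleast_2d(np.asarray(overlap, dtype=float)) ** 2
    if array:
        # Column k holds the cumulative overlap using the first k + 1 modes.
        return np.sqrt(np.cumsum(squared, axis=1))
    # One value per mode in modes1, using all modes in modes2.
    return np.sqrt(squared.sum(axis=1))

overlap = np.array([[0.6, 0.8, 0.0],
                    [0.0, 0.6, 0.0]])
print(cumulative_overlap(overlap))              # one value per row
print(cumulative_overlap(overlap, array=True))  # shape (2, 3)
```

With array=True, the last column of the result equals the values returned without it.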

calcSubspaceOverlap(modes1, modes2)

Returns the subspace overlap between two sets of modes (modes1 and modes2), also known as the root mean square inner product (RMSIP) of essential subspaces [AA99]. This function returns a single number.

[AA99] Amadei A, Ceruso MA, Di Nola A. On the convergence of the conformational coordinates basis set obtained by the essential dynamics analysis of proteins’ molecular dynamics simulations. Proteins 1999 36(4):419-424.
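A minimal sketch of the RMSIP, assuming the common definition in which squared inner products between all pairs of modes are summed and averaged over the number of modes in the first set. The helper name rmsip and the row-wise storage of orthonormal modes are assumptions made for this illustration.

```python
import numpy as np

def rmsip(modes1, modes2):
    """Root mean square inner product of two sets of orthonormal modes.

    Hypothetical helper; modes are rows, and each set is assumed orthonormal.
    """
    modes1 = np.atleast_2d(np.asarray(modes1, dtype=float))
    modes2 = np.atleast_2d(np.asarray(modes2, dtype=float))
    overlap = modes1 @ modes2.T
    # Average the squared overlaps over the modes of the first set.
    return float(np.sqrt((overlap ** 2).sum() / modes1.shape[0]))

basis = np.eye(4)
print(rmsip(basis[:2], basis[:2]))  # identical subspaces
print(rmsip(basis[:2], basis[2:]))  # orthogonal subspaces
```

Under this definition the value depends only on the subspaces spanned, so rotating the modes within one subspace leaves it unchanged.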

calcSpectralOverlap(modes1, modes2, weighted=False, turbo=False)

Returns the overlap between the covariances of modes1 and modes2. The overlap is calculated using normal modes (eigenvectors), hence modes in both models must have been calculated. This function implements equation 11 in [BH02].

[BH02] Hess B. Convergence of sampling in protein simulations. Phys Rev E 2002 65(3):031910.

Parameters: weighted (bool) – if True then covariances are weighted by the trace.

calcCovOverlap(modes1, modes2, turbo=False)

Returns the overlap between the covariances of modes1 and modes2. The overlap is calculated using normal modes (eigenvectors), hence modes in both models must have been calculated. This function implements equation 11 in [BH02].
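The sketch below shows how a covariance overlap can be computed from eigenvectors and variances alone. It assumes the overlap has the form 1 - d / sqrt(tr(A) + tr(B)), where d is the distance between the square roots of the two covariance matrices A and B; whether this matches the cited equation term by term should be checked against [BH02]. covariance_overlap is a hypothetical helper, and eigenvectors are assumed to be orthonormal rows.

```python
import numpy as np

def covariance_overlap(vecs1, vars1, vecs2, vars2):
    """Covariance overlap from eigenvectors (rows) and their variances.

    Hypothetical helper; assumes orthonormal eigenvectors and the form
    1 - d / sqrt(tr(A) + tr(B)) described in the lead-in.
    """
    vecs1, vecs2 = np.asarray(vecs1, dtype=float), np.asarray(vecs2, dtype=float)
    vars1, vars2 = np.asarray(vars1, dtype=float), np.asarray(vars2, dtype=float)
    overlap = vecs1 @ vecs2.T
    # Cross term: squared overlaps weighted by sqrt(variance_i * variance_j).
    cross = (overlap ** 2) * np.outer(np.sqrt(vars1), np.sqrt(vars2))
    total = vars1.sum() + vars2.sum()
    # Clamp at zero to absorb round-off before taking the square root.
    dist = np.sqrt(max(total - 2.0 * cross.sum(), 0.0))
    return float(1.0 - dist / np.sqrt(total))

basis = np.eye(4)
variances = [2.0, 1.0]
print(covariance_overlap(basis[:2], variances, basis[:2], variances))  # same model
print(covariance_overlap(basis[:2], variances, basis[2:], variances))  # orthogonal
```

Working from eigenvectors avoids building the full covariance matrices, which is why both models must already have their modes calculated.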

printOverlapTable(rows, cols)

Prints a table of overlaps (correlations) between two sets of modes. rows and cols are sets of normal modes, and correspond to the rows and columns of the printed table. This function may be used to take a quick look at mode correspondences between two models.

>>> # Compare top 3 PCs and slowest 3 ANM modes
>>> printOverlapTable(p38_pca[:3], p38_anm[:3]) 
Overlap Table
                        ANM 1p38
                    #1     #2     #3
PCA p38 xray #1   -0.39  +0.04  -0.71
PCA p38 xray #2   -0.78  -0.20  +0.22
PCA p38 xray #3   +0.05  -0.57  +0.06

writeOverlapTable(filename, rows, cols)

Writes a table of overlaps (correlations) between two sets of modes to a file. rows and cols are sets of normal modes, and correspond to the rows and columns of the overlap table. See also printOverlapTable().

calcSquareInnerProduct(modes1, modes2)

Returns the square inner product (SIP) of fluctuations [SK02]. This function returns a single number.

[SK02] Kundu S, Melton JS, Sorensen DC, Phillips GN. Dynamics of proteins in crystals: comparison of experiment with simple models. Biophys J 2002 83:723-732.
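As a rough illustration of comparing fluctuation profiles, the sketch below computes a squared, normalized inner product of two per-atom fluctuation vectors. The formula is an assumption made for this example and is not taken from ProDy or confirmed against [SK02]; square_inner_product is a hypothetical helper.

```python
import numpy as np

def square_inner_product(flucts1, flucts2):
    """Squared normalized inner product of two fluctuation profiles.

    Hypothetical helper; the formula is an assumption for illustration.
    """
    w1 = np.asarray(flucts1, dtype=float)
    w2 = np.asarray(flucts2, dtype=float)
    return float(np.dot(w1, w2) ** 2 / (np.dot(w1, w1) * np.dot(w2, w2)))

print(square_inner_product([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # proportional profiles
print(square_inner_product([1.0, 0.0], [1.0, 1.0]))            # partially similar
```

Under this assumed form the result is a single number that is insensitive to the overall scale of either profile.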

pairModes(modes1, modes2, **kwargs)

Returns the optimal matches between modes1 and modes2. modes1 and modes2 should have an equal number of modes. The function returns a nested list in which each item is a list containing a pair of modes.

Parameters: index (bool) – if True then indices of modes will be returned instead of Mode instances.
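One way to picture index-based pairing is as an assignment problem on the overlap matrix. The sketch below assumes that "optimal" means maximizing the total absolute overlap, which may differ from the criterion ProDy uses; pair_by_overlap is a hypothetical brute-force helper suitable only for small sets, and it returns zero-based index pairs.

```python
from itertools import permutations

import numpy as np

def pair_by_overlap(overlap):
    """Pair rows with columns so that the total absolute overlap is maximal.

    Hypothetical brute-force helper for small square overlap matrices;
    the matching criterion is an assumption made for this sketch.
    """
    score = np.abs(np.asarray(overlap, dtype=float))
    n = score.shape[0]
    if score.shape != (n, n):
        raise ValueError("overlap matrix must be square")
    rows = np.arange(n)
    # Try every assignment of columns to rows and keep the best one.
    best = max(permutations(range(n)),
               key=lambda cols: score[rows, list(cols)].sum())
    return [[i, j] for i, j in enumerate(best)]

# Overlap values taken from the printOverlapTable example in this section.
overlap = np.array([[-0.39, +0.04, -0.71],
                    [-0.78, -0.20, +0.22],
                    [+0.05, -0.57, +0.06]])
print(pair_by_overlap(overlap))
```

The number of permutations grows factorially, so for larger sets an assignment solver such as scipy.optimize.linear_sum_assignment would be the more practical choice.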

matchModes(*modesets, **kwargs)

Returns the matches of modes among modesets. Note that the first modeset will be treated as the reference so that only the matching of each modeset to the first modeset is guaranteed to be optimal.

Parameters:
  • index (bool) – if True then indices of modes will be returned instead of Mode instances
  • turbo (bool, int) – if True then the computation will be performed in parallel, with the number of threads set to the number of CPUs. Assign a number to specify the number of threads to be used. Note that in a script, an if __name__ == '__main__' guard is necessary to protect your code when multi-tasking. See https://docs.python.org/2/library/multiprocessing.html for details. Default is False