Supporting Functions

This module defines functions for handling conformational ensembles.

saveEnsemble(ensemble, filename=None, **kwargs)[source]

Save ensemble model data as filename.ens.npz. If filename is None, the title of the ensemble is used as the filename, after white spaces in the title are replaced with underscores. The extension is .ens.npz. On success, the filename is returned. This function uses numpy.savez().
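The naming rule described above can be sketched in a few lines. This is only an illustration of the stated rule, not the library's code: default_ensemble_filename is a hypothetical helper, and the sketch assumes the extension is appended unconditionally.

```python
def default_ensemble_filename(title, filename=None):
    """Hypothetical helper mirroring the naming rule described above."""
    if filename is None:
        # Fall back to the ensemble title, with spaces replaced by underscores.
        filename = title.replace(" ", "_")
    # Assumption: the .ens.npz extension is always appended.
    return filename + ".ens.npz"


print(default_ensemble_filename("p38 X-ray ensemble"))
print(default_ensemble_filename("unused title", filename="my_run"))
```

With these inputs the sketch yields p38_X-ray_ensemble.ens.npz and my_run.ens.npz.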

loadEnsemble(filename)[source]

Returns the ensemble instance loaded from filename. This function uses numpy.load(). See also saveEnsemble().

trimPDBEnsemble(pdb_ensemble, **kwargs)[source]

Returns a new PDB ensemble obtained by trimming the given pdb_ensemble. This function selects atoms in a PDB ensemble based on one of the following criteria and returns them in a new PDBEnsemble instance.

Occupancy

The resulting PDB ensemble will contain atoms whose occupancies are greater than or equal to the occupancy keyword argument. Occupancies for atoms are calculated using calcOccupancies(pdb_ensemble, normed=True).

Parameters:
  • occupancy (float) – occupancy for selecting atoms, must satisfy 0 < occupancy <= 1
  • selstr (str) – The function will trim residues that are NOT specified by the selection string.
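The occupancy criterion can be illustrated with plain lists. This is a minimal sketch, not the library's implementation: indices_passing_occupancy and trim_columns are hypothetical helpers, and the input is assumed to be a list of normalized per-atom occupancies in the range 0 to 1.

```python
def indices_passing_occupancy(occupancies, occupancy):
    """Return indices of atoms whose normalized occupancy is >= occupancy."""
    if not 0 < occupancy <= 1:
        raise ValueError("occupancy must satisfy 0 < occupancy <= 1")
    return [i for i, occ in enumerate(occupancies) if occ >= occupancy]


def trim_columns(rows, keep):
    """Keep only the selected atom columns in every conformation (row)."""
    return [[row[i] for i in keep] for row in rows]


# Hypothetical normalized occupancies for four atoms.
normed = [1.0, 0.75, 0.25, 1.0]
keep = indices_passing_occupancy(normed, 0.75)
print(keep)

# Hypothetical per-conformation data, one value per atom.
data = [["a0", "a1", "a2", "a3"], ["b0", "b1", "b2", "b3"]]
print(trim_columns(data, keep))
```

Atoms at the threshold are kept, matching the "greater than or equal to" wording above.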

calcOccupancies(pdb_ensemble, normed=False)[source]

Returns occupancies calculated from the weights of a PDBEnsemble. Any non-zero weight is considered equal to one, so the occupancy of an atom is the sum of its binary weights over the conformations in the ensemble. When normed is True, these totals are divided by the number of conformations, which gives values between 0 and 1. This function can be used to see how many times a residue is resolved when analyzing an ensemble of X-ray structures.
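The binary-weight counting can be sketched without any library. binary_occupancies is a hypothetical helper, the weights are assumed to be given as one row per conformation with one value per atom, and the normalization divides by the number of conformations so that the result lies between 0 and 1 (an assumption of this sketch).

```python
def binary_occupancies(weights, normed=False):
    """Count, per atom, the conformations in which its weight is non-zero."""
    n_confs = len(weights)
    n_atoms = len(weights[0])
    counts = [
        float(sum(1 for row in weights if row[j] != 0))
        for j in range(n_atoms)
    ]
    if normed:
        # Assumption: normalize by the number of conformations.
        return [count / n_confs for count in counts]
    return counts


# Hypothetical weights: 4 conformations (rows) by 4 atoms (columns).
weights = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 0.5],
]
print(binary_occupancies(weights))
print(binary_occupancies(weights, normed=True))
```

Note that the weight 0.5 in the last row counts as one, because any non-zero weight is treated as resolved.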

showOccupancies(pdbensemble, *args, **kwargs)[source]

Show occupancies for the PDB ensemble using plot(). Occupancies are calculated using calcOccupancies().

alignPDBEnsemble(ensemble, suffix='_aligned', outdir='.', gzip=False)[source]

Align PDB files using transformations from ensemble, which may be a PDBEnsemble or a PDBConformation instance. The label of each conformation (see getLabel()) is used to determine the PDB structure and model number: the first four characters of the label are expected to be the PDB identifier and the trailing digits the model number. For example, the transformation from the conformation with label 2k39_ca_selection_'resnum_<_71'_m116 will be applied to the 116th model of structure 2k39. After the applicable transformations are made, the structure is written into outdir as 2k39_aligned.pdb. If gzip is True, output files are compressed. The return value is the output filename or a list of filenames, in the order the files are processed. Note that if multiple models from one file are aligned, that filename appears in the list multiple times.
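The label convention can be sketched as follows. parse_conformation_label and aligned_basename are hypothetical helpers written only to illustrate the stated rule (first four characters as identifier, trailing digits as model number); they are not part of the library.

```python
import re


def parse_conformation_label(label):
    """Split a label into (pdb_id, model_number) following the stated rule."""
    pdb_id = label[:4]
    # Look for trailing digits only after the four-character identifier.
    match = re.search(r"(\d+)$", label[4:])
    model = int(match.group(1)) if match else None
    return pdb_id, model


def aligned_basename(pdb_id, suffix="_aligned"):
    """Build the output file name described above, without the directory."""
    return pdb_id + suffix + ".pdb"


label = "2k39_ca_selection_'resnum_<_71'_m116"
pdb_id, model = parse_conformation_label(label)
print(pdb_id, model)
print(aligned_basename(pdb_id))
```

A label with no trailing digits gives a model number of None in this sketch; how the real function treats such labels is not stated here.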

calcTree(ensemble, distance_matrix)[source]

Given a distance matrix for an ensemble, creates and returns a tree structure.

Parameters:
  • ensemble (prody.ensemble.Ensemble or prody.ensemble.PDBEnsemble) – an ensemble with labels
  • distance_matrix (numpy.ndarray) – a square matrix whose size equals the length of the ensemble. An error is raised if the sizes do not match.
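The size requirement on the distance matrix can be checked before building a tree. The sketch below uses nested lists instead of a numpy array and converts the matrix to a lower-triangular layout, which some tree-building libraries expect; to_lower_triangle is a hypothetical helper, and whether calcTree performs this conversion internally is not stated here.

```python
def to_lower_triangle(labels, distance_matrix):
    """Validate a square matrix against the labels and keep its lower triangle."""
    n = len(labels)
    if len(distance_matrix) != n or any(len(row) != n for row in distance_matrix):
        raise ValueError(
            "distance_matrix must be square with one row per conformation"
        )
    # Row i keeps columns 0..i, including the diagonal.
    return [list(distance_matrix[i][: i + 1]) for i in range(n)]


# Hypothetical labels and distances for three conformations.
labels = ["conf_a", "conf_b", "conf_c"]
matrix = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.5],
    [2.0, 1.5, 0.0],
]
print(to_lower_triangle(labels, matrix))
```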

buildPDBEnsemble(refpdb, PDBs, title='Unknown', labels=None, seqid=94, coverage=85, mapping_func=<function mapOntoChain>, occupancy=None, unmapped=None)[source]

Builds a PDB ensemble from a given reference structure and a list of PDB structures. Note that the reference structure should be included in the list as well.

Parameters:
  • refpdb (Chain, Selection, or AtomGroup) – Reference structure
  • PDBs (iterable) – A list of PDB structures
  • title (str) – The title of the ensemble
  • labels (list) – labels of the conformations
  • seqid (int) – Minimal sequence identity (percent)
  • coverage (int) – Minimal sequence overlap (percent)
  • occupancy (float) – Minimal occupancy of columns (range from 0 to 1). Columns whose occupancy is below this value will be trimmed.
  • unmapped (list) – A list of PDB IDs that cannot be included in the ensemble. This is an output argument.
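The "output argument" pattern used by unmapped can be illustrated with a small stand-in. Everything here is hypothetical: filter_mappings, the candidate tuples, and the PDB-style identifiers are made up for the example, and the use of >= for both thresholds is an assumption. The point is only that the caller passes in a list and the function appends rejected identifiers to it.

```python
def filter_mappings(candidates, seqid=94, coverage=85, unmapped=None):
    """Keep candidates meeting both thresholds; record the rest in unmapped.

    candidates is a list of (pdb_id, identity_percent, overlap_percent).
    """
    accepted = []
    for pdb_id, identity, overlap in candidates:
        if identity >= seqid and overlap >= coverage:
            accepted.append(pdb_id)
        elif unmapped is not None:
            # Output argument: the caller's list is modified in place.
            unmapped.append(pdb_id)
    return accepted


failed = []
kept = filter_mappings(
    [("1abc", 99, 97), ("2xyz", 80, 95), ("3def", 96, 60)],
    unmapped=failed,
)
print(kept)
print(failed)
```

After the call, failed holds the identifiers that did not meet the thresholds, so the caller can inspect them without a second return value.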

addPDBEnsemble(ensemble, PDBs, refpdb=None, labels=None, seqid=94, coverage=85, mapping_func=<function mapOntoChain>, occupancy=None, unmapped=None)[source]

Adds extra structures to a given PDB ensemble.

Parameters:
  • ensemble (PDBEnsemble) – The ensemble to which the PDBs are added.
  • refpdb (Chain, Selection, or AtomGroup) – Reference structure. If set to None, it will be set to ensemble.getAtoms() automatically.
  • PDBs (iterable) – A list of PDB structures
  • title (str) – The title of the ensemble
  • labels (list) – labels of the conformations
  • seqid (int) – Minimal sequence identity (percent)
  • coverage (int) – Minimal sequence overlap (percent)
  • occupancy (float) – Minimal occupancy of columns (range from 0 to 1). Columns whose occupancy is below this value will be trimmed.
  • unmapped (list) – A list of PDB IDs that cannot be included in the ensemble. This is an output argument.