PDB Structure Ensemble

This module defines a class for handling ensembles of PDB conformations.

class PDBEnsemble(title='Unknown')

This class enables handling coordinates for heterogeneous structural datasets and stores identifiers for individual conformations.

See usage in Heterogeneous X-ray Structures, Multimeric Structures, and Homologous Proteins.


This class is designed to handle conformations with missing coordinates, e.g. atoms that are not resolved in an X-ray structure. For unresolved atoms, the coordinates of the reference structure are assumed in RMSD calculations and superpositions.

addCoordset(coords, weights=None, label=None, **kwargs)

Add coordinate set(s) to the ensemble. coords must be a Numpy array with suitable shape and dimensionality, or an object with a getCoordsets() method. weights is an optional argument. If provided, its length must match the number of atoms. Weights of missing (not resolved) atoms must be 0, and weights of resolved atoms can be any value greater than 0. If not provided, the weights of all atoms for this coordinate set are set to 1. label, which may be a PDB identifier or a list of identifiers, is used to label conformations.
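The weight rule above can be illustrated with a minimal NumPy sketch. It assumes, purely for illustration, that unresolved atoms are marked by NaN coordinates; that convention and the helper name weights_from_coords are hypothetical and not part of this class.

```python
import numpy as np

def weights_from_coords(coords):
    """Return one weight per atom: 1.0 if resolved, 0.0 if unresolved.

    Assumption for this sketch: an atom is unresolved when any of its
    x, y, z values is NaN.
    """
    coords = np.asarray(coords, dtype=float)
    resolved = ~np.isnan(coords).any(axis=1)
    return resolved.astype(float)

# Three atoms, the second one is not resolved.
coords = np.array([[0.0, 0.0, 0.0],
                   [np.nan, np.nan, np.nan],
                   [1.5, 0.0, 0.0]])
weights = weights_from_coords(coords)
```

The resulting array has one entry per atom, which matches the length requirement stated for the weights argument.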


Delete a coordinate set from the ensemble.


Returns associated/selected atoms.


Returns conformation at given index.


Returns a copy of reference coordinates for selected atoms.

getCoordsets(indices=None, selected=True)

Returns a copy of coordinate set(s) at given indices for selected atoms. indices may be an integer, a list of integers, or None. Passing None returns all coordinate sets.


When there are atoms with weights equal to zero (0), their coordinates will be replaced with the coordinates of the ensemble reference coordinate set.
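As a rough illustration of this substitution, the sketch below replaces the coordinates of zero-weight atoms with the reference coordinates. The function name fill_from_reference is hypothetical, and the code shows the described behavior only; it is not the library's implementation.

```python
import numpy as np

def fill_from_reference(coords, weights, reference):
    """Return a copy of coords where zero-weight atoms take reference coordinates."""
    coords = np.asarray(coords, dtype=float)
    reference = np.asarray(reference, dtype=float)
    weights = np.asarray(weights, dtype=float).reshape(-1)
    filled = coords.copy()          # leave the input untouched
    missing = weights == 0          # boolean mask over atoms
    filled[missing] = reference[missing]
    return filled

reference = np.array([[0.1, 0.0, 0.0],
                      [1.0, 1.0, 1.0],
                      [1.4, 0.0, 0.0]])
coords = np.array([[0.0, 0.0, 0.0],
                   [9.0, 9.0, 9.0],   # placeholder values for an unresolved atom
                   [1.5, 0.0, 0.0]])
weights = np.array([1.0, 0.0, 1.0])
filled = fill_from_reference(coords, weights, reference)
```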


Returns deviations from reference coordinates for selected atoms. Conformations can be aligned using either the superpose() or the iterpose() method prior to calculating deviations.


Returns identifiers of the conformations in the ensemble.

getMSA(indices=None, selected=True)

Returns an MSA of selected atoms.


Calculate and return mean square fluctuations (MSFs). Note that you might need to align the conformations using superpose() or iterpose() before calculating MSFs.


Calculate and return root mean square deviations (RMSDs). Note that you might need to align the conformations using superpose() or iterpose() before calculating RMSDs.

Parameters: pairwise (bool) – if True, pairwise RMSDs are returned as an n-by-n matrix, where n is the number of conformations.
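The two kinds of RMSD output can be sketched with NumPy as follows. This is an illustrative calculation under stated assumptions, not the library's code: the function names are hypothetical, the weighted form sqrt(sum(w * d^2) / sum(w)) is one common convention, and the pairwise version ignores weights entirely.

```python
import numpy as np

def rmsd_to_reference(reference, coordsets, weights=None):
    """RMSD of each conformation from the reference, optionally weighted per atom."""
    reference = np.asarray(reference, dtype=float)    # (n_atoms, 3)
    coordsets = np.asarray(coordsets, dtype=float)    # (n_confs, n_atoms, 3)
    if weights is None:
        weights = np.ones(coordsets.shape[:2])
    weights = np.asarray(weights, dtype=float)        # (n_confs, n_atoms)
    sq_dist = ((coordsets - reference) ** 2).sum(axis=2)
    return np.sqrt((weights * sq_dist).sum(axis=1) / weights.sum(axis=1))

def pairwise_rmsd(coordsets):
    """Unweighted n-by-n matrix of RMSDs between all pairs of conformations."""
    coordsets = np.asarray(coordsets, dtype=float)
    diff = coordsets[:, None, :, :] - coordsets[None, :, :, :]
    return np.sqrt((diff ** 2).sum(axis=3).mean(axis=2))

reference = np.zeros((2, 3))
coordsets = np.array([reference + [1.0, 0.0, 0.0],
                      reference + [0.0, 2.0, 0.0]])
to_ref = rmsd_to_reference(reference, coordsets)   # one value per conformation
pairwise = pairwise_rmsd(coordsets)                # symmetric, zero diagonal
```

Neither function superposes the conformations first, so the values are only meaningful for coordinate sets that have already been aligned.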

Returns root mean square fluctuations (RMSFs) for selected atoms. Conformations can be aligned using either the superpose() or the iterpose() method prior to RMSF calculation.
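For reference, MSF and RMSF values can be computed from already-aligned coordinate sets as in the NumPy sketch below. It is unweighted and the function names are hypothetical; handling of missing atoms through weights is deliberately left out.

```python
import numpy as np

def mean_square_fluctuations(coordsets):
    """Per-atom mean squared distance from the atom's mean position."""
    coordsets = np.asarray(coordsets, dtype=float)   # (n_confs, n_atoms, 3)
    mean = coordsets.mean(axis=0)                    # (n_atoms, 3)
    return ((coordsets - mean) ** 2).sum(axis=2).mean(axis=0)

def root_mean_square_fluctuations(coordsets):
    """Per-atom RMSF, the square root of the MSF."""
    return np.sqrt(mean_square_fluctuations(coordsets))

# Two conformations of two atoms: the first atom moves, the second is fixed.
coordsets = np.array([[[0.0, 0.0, 0.0], [5.0, 5.0, 5.0]],
                      [[2.0, 0.0, 0.0], [5.0, 5.0, 5.0]]])
msf = mean_square_fluctuations(coordsets)
rmsf = root_mean_square_fluctuations(coordsets)
```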


Returns title of the ensemble.


Returns a copy of weights of selected atoms.


Returns True if a subset of atoms is selected.


Iterate over coordinate sets. A copy of each coordinate set for selected atoms is returned. Reference coordinates are not included.


Iteratively superpose the ensemble until convergence. Initially, all conformations are aligned with the reference coordinates. Then the mean coordinates are calculated and set as the new reference coordinates. This is repeated until the reference coordinates stop changing, as measured by the RMSD between the new and old reference coordinates. Note that at the end of the iterative procedure the reference coordinate set will be the average of the conformations in the ensemble.

Parameters: rmsd (float) – change in reference coordinates to determine convergence, default is 0.0001 Å RMSD
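The iterative procedure described above can be sketched in NumPy as follows. This is an illustration under assumptions rather than the library's implementation: it uses an unweighted Kabsch superposition, ignores missing atoms, and adds a max_steps safety limit that is not part of the documented interface. All function names are hypothetical.

```python
import numpy as np

def kabsch_superpose(mobile, target):
    """Return mobile after a least-squares rigid-body fit onto target."""
    mob_center = mobile.mean(axis=0)
    tar_center = target.mean(axis=0)
    mob = mobile - mob_center
    tar = target - tar_center
    u, _, vt = np.linalg.svd(mob.T @ tar)      # SVD of the 3x3 covariance matrix
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return mob @ rotation.T + tar_center

def iterative_superpose(coordsets, reference, rmsd=1e-4, max_steps=100):
    """Align to the reference, replace it with the mean, repeat until stable."""
    coordsets = np.array(coordsets, dtype=float)
    reference = np.array(reference, dtype=float)
    for step in range(1, max_steps + 1):
        coordsets = np.array([kabsch_superpose(c, reference) for c in coordsets])
        new_reference = coordsets.mean(axis=0)
        # RMSD between the old and new reference coordinates.
        change = np.sqrt(((new_reference - reference) ** 2).sum(axis=1).mean())
        reference = new_reference
        if change <= rmsd:
            break
    return coordsets, reference, step

# Two rigid copies of the same structure, one rotated about z and shifted.
base = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.2, 0.0],
                 [0.3, 1.0, 0.9], [2.0, 0.5, 1.7]])
c, s = np.cos(0.7), np.sin(0.7)
rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
moved = base @ rot_z.T + np.array([3.0, -2.0, 1.0])
aligned, final_reference, steps = iterative_superpose([base, moved], base)
```

In this sketch the returned reference is always the mean of the aligned conformations, which mirrors the note above about the final reference coordinate set.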

Returns number of atoms.


Returns number of conformations.


Returns number of conformations.


Returns number of selected atoms. The number of all atoms is returned if no selection has been made. A subset of atoms can be selected by passing a selection to setAtoms().


Set atoms or specify a selection of atoms to be considered in calculations and coordinate requests. When a selection is set, the corresponding subset of coordinates will be considered in, for example, alignments and RMSD calculations. Setting atoms also allows some functions to access atomic data when needed. For example, Ensemble and Conformation instances become suitable arguments for writePDB(). Passing None as the atoms argument will deselect atoms.


Set coords as the ensemble reference coordinate set. coords may be an array with suitable data type, shape, and dimensionality, or an object with a getCoords() method.


Set title of the ensemble.


Set atomic weights.


Superpose the ensemble onto the reference coordinates.