Release Notes

The v2.0 series comes with new and improved sequence, structure, and dynamics analysis features. See the release notes for details.

How to Cite

Bakan A, Meireles LM, Bahar I. ProDy: Protein Dynamics Inferred from Theory and Experiments.
Bioinformatics 2011 27(11):1575-1577.

Bakan A, Dutta A, Mao W, Liu Y, Chennubhotla C, Lezon TR, Bahar I. Evol and ProDy for Bridging Protein Sequence Evolution and Structural Dynamics.
Bioinformatics 2014 30(18):2681-2683.

Pfam Access Functions

This module defines functions for interfacing with the Pfam database.

searchPfam(query, **kwargs)

Returns Pfam search results in a dictionary. The keys are the matching Pfam accessions, and each maps to the e-value and the alignment start and end residue positions.

Parameters:
query (str) – UniProt ID, PDB identifier, protein sequence, or sequence file. Sequence queries must not contain gaps and must be at least 16 characters long.
timeout (int) – timeout for a blocking connection attempt in seconds, default is 60

query can also be a PDB identifier, e.g. '1mkp', or '1mkpA' with a chain identifier. The UniProt ID of the specified chain, or of the first protein chain if none is specified, will be used for searching the Pfam database.
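The sequence-query constraints above (no gaps, at least 16 characters) can be checked locally before a search is attempted. The following is a minimal sketch only: `is_valid_sequence_query` is a hypothetical helper, not part of ProDy, and it assumes that dashes and dots are the gap characters and that a sequence consists of letters only.

```python
GAP_CHARACTERS = "-."      # assumption: dashes and dots count as gaps
MIN_SEQUENCE_LENGTH = 16   # minimum length stated in the parameter description


def is_valid_sequence_query(sequence):
    """Return True if `sequence` is gap-free, alphabetic, and long enough.

    Hypothetical helper for illustration; it applies to raw sequence
    queries only, not to UniProt IDs, PDB identifiers, or file names.
    """
    sequence = sequence.strip()
    if len(sequence) < MIN_SEQUENCE_LENGTH:
        return False
    if any(ch in GAP_CHARACTERS for ch in sequence):
        return False
    # Reject digits, whitespace, and other non-letter characters.
    return sequence.isalpha()
```

A sequence that passes this check could then be given to `searchPfam` as the `query` argument, which requires ProDy and network access.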

fetchPfamMSA(acc, alignment='full', compressed=False, **kwargs)

Returns a path to the downloaded Pfam MSA file.

Parameters:
acc (str) – Pfam ID or accession code
alignment – alignment type, one of `'full'` (default), `'seed'`, `'ncbi'`, `'metagenomics'`, `'rp15'`, `'rp35'`, `'rp55'`, `'rp75'` or `'uniprot'`, where rp stands for representative proteomes
compressed – gzip the downloaded MSA file, default is False

Alignment Options

Parameters:
format – a Pfam-supported MSA file format, one of `'selex'` (default), `'stockholm'` or `'fasta'`
order – ordering of sequences, `'tree'` (default) or `'alphabetical'`
inserts – letter case for inserts, `'upper'` (default) or `'lower'`
gaps – gap character, one of `'dashes'` (default), `'dots'`, `'mixed'` or None for unaligned

Other Options

Parameters:
timeout – timeout for a blocking connection attempt in seconds, default is 60
outname – output filename, default is `'acc_alignment.format'`, built from the input values
folder – output folder, default is `'.'`
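Because the alignment options accept only a fixed set of values, it can help to check them before starting a download. The sketch below is illustrative and not ProDy code: `ALLOWED_MSA_OPTIONS` and `check_msa_options` are hypothetical names, and the allowed values are copied from the parameter descriptions above.

```python
# Allowed values as listed in the fetchPfamMSA parameter descriptions.
ALLOWED_MSA_OPTIONS = {
    "alignment": {"full", "seed", "ncbi", "metagenomics",
                  "rp15", "rp35", "rp55", "rp75", "uniprot"},
    "format": {"selex", "stockholm", "fasta"},
    "order": {"tree", "alphabetical"},
    "inserts": {"upper", "lower"},
    "gaps": {"dashes", "dots", "mixed", None},  # None means unaligned
}


def check_msa_options(**options):
    """Return `options` unchanged, or raise ValueError on an undocumented value.

    Hypothetical helper for illustration; it is not part of ProDy.
    """
    for name, value in options.items():
        if name not in ALLOWED_MSA_OPTIONS:
            raise ValueError(f"unknown option: {name!r}")
        if value not in ALLOWED_MSA_OPTIONS[name]:
            raise ValueError(f"invalid value {value!r} for option {name!r}")
    return options
```

With ProDy installed, the checked dictionary could presumably be forwarded as `fetchPfamMSA(acc, **options)`, since the signature accepts `alignment` by keyword and the remaining options through `**kwargs`.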

parsePfamPDBs(query, data=[], **kwargs)

Returns a list of AtomGroup objects containing sections of chains that correspond to a particular Pfam domain family. The sections are defined by alignment start and end residue numbers.

Parameters:

query (str) – UniProt ID or PDB ID. If a PDB ID is provided, the corresponding UniProt ID is used. If this returns multiple matches, then start or end must also be provided. This query is also used for label refinement of the Pfam domain MSA.
data (list) – If given, the data list from the Pfam mapping table will be output through this argument.
start (int) – Residue number for defining the start of the domain. The Pfam domain that starts closest to this will be selected. Default is 1.
end (int) – Residue number for defining the end of the domain. The Pfam domain that ends closest to this will be selected.
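The "closest start" and "closest end" rule described for start and end can be pictured with a small selection routine. This is only a sketch of the rule as documented, not the ProDy implementation: `closest_domain` is a hypothetical helper, the domain labels and residue numbers are made up, and the precedence (end is used only when start is not given) is an assumption.

```python
def closest_domain(domains, start=None, end=None):
    """Pick the domain whose start (or end) is nearest the requested residue.

    `domains` is a list of (label, domain_start, domain_end) tuples.
    Assumption: `end` is used only when `start` is not given; when
    neither is given, `start` falls back to the documented default of 1.
    """
    if start is None and end is not None:
        return min(domains, key=lambda d: abs(d[2] - end))
    if start is None:
        start = 1
    return min(domains, key=lambda d: abs(d[1] - start))


# Made-up domains for illustration; these are not real Pfam matches.
example_domains = [
    ("domain_A", 5, 120),
    ("domain_B", 150, 300),
    ("domain_C", 320, 410),
]
print(closest_domain(example_domains, start=140)[0])
print(closest_domain(example_domains, end=400)[0])
```

Here `start=140` is nearest to the domain beginning at residue 150, and `end=400` is nearest to the domain ending at residue 410.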