PDB Blast Search
This module defines functions for blast searching the Protein Data Bank.
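A typical workflow with the functions documented on this page might look like the sketch below. It uses only the signatures listed here (blastPDB, getBest, getHits). The wrapper name search_example, the output filename, and the threshold values are illustrative assumptions, not part of the library.

```python
def search_example(sequence, xml_path="blast_results.xml"):
    """Blast a sequence against the PDB and return (best_hit, filtered_hits).

    search_example and xml_path are hypothetical names chosen for this sketch.
    """
    # Imported inside the function so that defining it requires neither
    # ProDy nor network access; calling it requires both.
    from prody import blastPDB

    # Run the search and keep a copy of the raw XML results on disk.
    record = blastPDB(sequence, filename=xml_path, timeout=120)

    # Hit with the highest sequence identity.
    best = record.getBest()

    # Hits with at least 90% identity and 70% coverage of the query
    # (threshold values are arbitrary examples).
    hits = record.getHits(percent_identity=90.0, percent_overlap=70.0)
    return best, hits
```

Calling search_example with a protein sequence string contacts NCBI, so it can take up to the timeout to return.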
class PDBBlastRecord(xml=None, sequence=None, **kwargs)
A class to store results from blast searches.
Instantiate a PDBBlastRecord instance.
fetch(xml=None, sequence=None, **kwargs)
Get Blast record from URL or file.
Parameters:
- sequence (Atomic, Sequence, or str) – an object with an associated sequence string or a sequence string itself
- xml (str) – blast search results in XML format, an XML file that contains the results, a filename for saving the results, or None
- timeout (int) – time in seconds until the query times out, default is 120
getBest()
Returns a dictionary containing structure and alignment information for the hit with the highest sequence identity.
getHits(percent_identity=0.0, percent_overlap=0.0, chain=False)
Returns a dictionary in which PDB identifiers are mapped to structure and alignment information.
Parameters:
- percent_identity (float) – PDB hits with percent sequence identity equal to or higher than this value will be returned, default is 0.
- percent_overlap (float) – PDB hits with percent coverage of the query sequence equal to or higher than this value will be returned, default is 0.
- chain (bool) – if chain is True, individual chains in a PDB file will be considered as separate hits, default is False
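The selection rules described for getBest and getHits can be illustrated without running a search. The following is a minimal sketch on plain dictionaries, not the library's implementation. The key names (pdb_id, chain_id, percent_identity, percent_overlap), the helper names, the tuple keys used when chain is True, and the PDB identifiers are all assumptions made up for the example.

```python
def filter_hits(hits, percent_identity=0.0, percent_overlap=0.0, chain=False):
    """Return a dict of hits that meet both thresholds.

    hits is a list of dicts; the key names are assumptions of this sketch.
    """
    selected = {}
    # Visit hits from highest to lowest identity so the best hit per key wins.
    for hit in sorted(hits, key=lambda h: h["percent_identity"], reverse=True):
        if hit["percent_identity"] < percent_identity:
            continue
        if hit["percent_overlap"] < percent_overlap:
            continue
        # With chain=True every chain is its own entry; otherwise one per PDB.
        key = (hit["pdb_id"], hit["chain_id"]) if chain else hit["pdb_id"]
        selected.setdefault(key, hit)
    return selected


def best_hit(hits):
    """Return the hit with the highest percent sequence identity."""
    return max(hits, key=lambda h: h["percent_identity"])


# Made-up hits for illustration only.
example_hits = [
    {"pdb_id": "1abc", "chain_id": "A", "percent_identity": 98.0, "percent_overlap": 95.0},
    {"pdb_id": "1abc", "chain_id": "B", "percent_identity": 97.0, "percent_overlap": 95.0},
    {"pdb_id": "2xyz", "chain_id": "A", "percent_identity": 62.0, "percent_overlap": 80.0},
    {"pdb_id": "3def", "chain_id": "A", "percent_identity": 91.0, "percent_overlap": 40.0},
]

strict = filter_hits(example_hits, percent_identity=90.0, percent_overlap=70.0)
per_chain = filter_hits(example_hits, percent_identity=90.0,
                        percent_overlap=70.0, chain=True)
```

With the default thresholds of 0 every PDB entry is kept, which matches the documented defaults.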
writeSequences(filename, **kwargs)
Returns a plot that contains a dendrogram of the sequence similarities among the sequences in the given hit list.
Parameters:
- hits (dict) – a dictionary that contains hits obtained from a blast record object
Arguments of getHits can be passed as kwargs.
blastPDB(sequence, filename=None, **kwargs)
Returns a PDBBlastRecord instance that contains results from blast searching sequence against the PDB using NCBI blastp.
Parameters:
- sequence (Atomic, Sequence, or str) – an object with an associated sequence string or a sequence string itself
- filename (str) – a filename to save the results in XML format
The hitlist_size (default is 250) and expect (default is 1e-10) search parameters can be adjusted by the user. The sleep keyword argument (default is 2 seconds) determines how long to wait before reconnecting for results. Sleep time is multiplied by 1.5 each time the results are not ready. timeout (default is 120 s) determines when to give up waiting for the results.
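The waiting behaviour described above (start at sleep seconds, multiply by 1.5 while results are not ready, give up at timeout) can be sketched as a small polling loop. This is an illustration under stated assumptions, not the library's code: poll_with_backoff, is_ready, and wait are hypothetical names, and elapsed time is approximated by the sum of the sleep intervals rather than measured with a clock.

```python
import time


def poll_with_backoff(is_ready, sleep=2.0, timeout=120.0, factor=1.5,
                      wait=time.sleep):
    """Call is_ready() after each pause until it returns True.

    Returns the list of sleep intervals that were used. Raises TimeoutError
    once the summed intervals reach the timeout without a result.
    """
    intervals = []
    elapsed = 0.0
    while True:
        wait(sleep)                # pause before asking for results
        elapsed += sleep
        intervals.append(sleep)
        if is_ready():
            return intervals
        if elapsed >= timeout:
            raise TimeoutError("results not ready after %.1f s" % elapsed)
        sleep *= factor            # results not ready: wait longer next time
```

With the defaults the intervals grow as 2, 3, 4.5, 6.75, ... seconds, so a slow search is polled less and less often until the timeout is reached.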