Gene Ontology Annotation (GOA) Server Functions
This module defines functions for interfacing with the EBI’s Gene Ontology Annotation (GOA) database for analysing gene/protein functions through the Gene Ontology (GO).
This module is based on the tutorial notebook at https://nbviewer.jupyter.org/urls/dessimozlab.github.io/go-handbook/GO%20Tutorial%20in%20Python%20-%20Solutions.ipynb
-
class GOADictList(parsingList, title='unnamed', **kwargs)
A class for handling the list of GOA dictionaries returned by queryGOA().
-
parseOBO(**kwargs)
Parse a GO OBO file containing the GO itself. See OBO for more information on the file format.
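To give a feel for the OBO file format, here is a minimal sketch of a stanza reader written in plain Python. It is not how parseOBO() is implemented; the helper name parse_obo_terms and the inline sample text are assumptions made for illustration, and only the id, name, namespace and is_a tags of [Term] stanzas are collected.

```python
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: GO:0003674
name: molecular_function
namespace: molecular_function

[Term]
id: GO:0005488
name: binding
namespace: molecular_function
is_a: GO:0003674 ! molecular_function

[Typedef]
id: part_of
name: part of
"""


def parse_obo_terms(text):
    """Hypothetical helper: map each term ID to its name, namespace and parents."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts here; only [Term] stanzas are kept.
            current = {"is_a": []} if line == "[Term]" else None
            continue
        if current is None or ":" not in line:
            # Header lines, blank lines and non-Term stanzas are skipped.
            continue
        tag, _, value = line.partition(":")
        value = value.strip()
        if tag == "is_a":
            # Drop the trailing "! human-readable name" comment.
            current["is_a"].append(value.split("!")[0].strip())
        elif tag in ("id", "name", "namespace"):
            current[tag] = value
            if tag == "id":
                terms[value] = current
    return terms


terms = parse_obo_terms(SAMPLE_OBO)
print(terms["GO:0005488"]["name"], terms["GO:0005488"]["is_a"])
```

A full parser also has to handle relationship tags, obsolete terms and alternative IDs, which this sketch ignores.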
-
parseGAF(database='PDB', **kwargs)
Parse a GO Association File (GAF) corresponding to a particular database collection into a dictionary for ease of querying. See GAF for more information on the file format.
Parameters: database (str) – name of the database of interest; the default is PDB
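As a rough illustration of turning a GAF file into a dictionary, the sketch below groups annotation rows by their object identifier. It is an assumption-laden example rather than the code behind parseGAF(): the helper parse_gaf_lines, the shortened field names and the sample rows (which are invented, not real annotations) are all hypothetical. It relies only on two properties of the GAF format: lines starting with "!" are comments, and the remaining lines are tab-separated with the object ID in column 2 and the GO ID in column 5.

```python
from collections import defaultdict

# Names for the first nine GAF columns (shortened for this sketch).
GAF_FIELDS = ("DB", "DB_Object_ID", "DB_Object_Symbol", "Qualifier", "GO_ID",
              "DB_Reference", "Evidence", "With", "Aspect")

# Invented rows for illustration only; they are not real annotations.
SAMPLE_GAF = [
    "!gaf-version: 2.1",
    "PDB\t1abc_A\tEXAMPLE\t\tGO:0005488\tGO_REF:0000000\tIEA\t\tF",
    "PDB\t1abc_A\tEXAMPLE\t\tGO:0003674\tGO_REF:0000000\tIEA\t\tF",
    "PDB\t2xyz_B\tEXAMPLE2\t\tGO:0005488\tGO_REF:0000000\tIEA\t\tF",
]


def parse_gaf_lines(lines):
    """Hypothetical helper: group annotation records by DB_Object_ID."""
    by_id = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header comments
        columns = line.rstrip("\n").split("\t")
        record = dict(zip(GAF_FIELDS, columns))
        by_id[record["DB_Object_ID"]].append(record)
    return dict(by_id)


annotations = parse_gaf_lines(SAMPLE_GAF)
print([rec["GO_ID"] for rec in annotations["1abc_A"]])
```

Keying the dictionary by identifier makes a later lookup by ID a single dictionary access.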
-
queryGOA(*ids, **kwargs)
Query a GOA database by identifier.
Parameters: - ids (str, tuple, list, ndarray) – an identifier or a list-like of identifiers
- database (str) – name of the database of interest. The default is PDB; others include UNIPROT and the common names of many organisms.
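Because ids may be a single identifier or a list-like of identifiers, a caller-side normalisation step can be sketched as follows. The helper normalize_ids and the sample identifiers are hypothetical and are not part of this module; the sketch only shows one way to flatten variadic arguments into a flat list of strings.

```python
def normalize_ids(*ids):
    """Hypothetical helper: flatten strings and list-likes into one list of strings."""
    flat = []
    for item in ids:
        if isinstance(item, str):
            # A bare string is one identifier, not a sequence of characters.
            flat.append(item)
        else:
            flat.extend(str(entry) for entry in item)
    return flat


single = normalize_ids("1abc")
mixed = normalize_ids("1abc", ("2xyz", "3def"))
print(single, mixed)
```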
-
showGoLineage(go_term, **kwargs)
Use pygraphviz and IPython notebook to show the lineage of a GO term.
Parameters: - go (goatools.obo_parser.GODag) – object containing a gene ontology (GO) directed acyclic graph (DAG). The default is to parse it with parseOBO().
- out_format (str) – format for output. Currently the only supported output is a file, which is then displayed in the Jupyter Notebook.
- filename (str) – filename for output. The default is the GO term ID with ‘_lineage.png’ appended.
-
calcGoOverlap(*go_terms, **kwargs)
Calculate overlap between GO terms based on their distance in the graph. GO terms in different namespaces (molecular function, cellular component, and biological process) have undefined distances.
Parameters: - go_terms (list, tuple, ndarray) – a list of GO terms or GO IDs
- pairwise (bool) – whether to calculate a matrix of pairwise overlaps. The default is False.
- distance (bool) – whether to return distances rather than overlaps. The default is False.
- go (goatools.obo_parser.GODag) – GO graph. The default is to parse it with parseOBO().
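The pairwise mode with undefined cross-namespace distances can be pictured with the sketch below. It is not the module's implementation: pairwise_matrix, the term labels and the toy distance table are hypothetical, NaN is an assumed way of marking "undefined", and the conversion from distance to overlap is deliberately left out because it is not specified here.

```python
import math


def pairwise_matrix(terms, namespace_of, distance):
    """Hypothetical helper: symmetric matrix of distances between terms."""
    size = len(terms)
    matrix = [[0.0] * size for _ in range(size)]  # a term is at distance 0 from itself
    for i in range(size):
        for j in range(i + 1, size):
            first, second = terms[i], terms[j]
            if namespace_of[first] != namespace_of[second]:
                value = math.nan  # different namespaces: distance undefined
            else:
                value = distance(first, second)
            matrix[i][j] = matrix[j][i] = value
    return matrix


# Toy inputs, invented for illustration.
toy_terms = ["A", "B", "C"]
toy_namespaces = {"A": "molecular_function",
                  "B": "molecular_function",
                  "C": "biological_process"}
toy_distances = {frozenset(("A", "B")): 2}

matrix = pairwise_matrix(toy_terms, toy_namespaces,
                         lambda a, b: toy_distances[frozenset((a, b))])
print(matrix)
```

Passing the distance function in as an argument keeps the matrix bookkeeping separate from whichever graph metric is used.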
-
calcDeepFunctionOverlaps(*goa_data, **kwargs)
Calculate function overlaps between the deep (most detailed) molecular functions of two sets of GO terms.
Parameters: - goa1 (tuple, list, ndarray) – the first set of GO terms
- goa2 (tuple, list, ndarray) – the second set of GO terms
-
calcEnsembleFunctionOverlaps(ens, **kwargs)
Calculate function overlaps for an ensemble as the mean of the values from calcDeepFunctionOverlaps().
Parameters: ens (Ensemble) – an ensemble with labels
-
findDeepestFunctions(go_terms, **kwargs)
Find the deepest (most detailed) molecular functions in a list of GO terms.
Parameters: go_terms (GOADictList) – a list of GO terms
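One way to read "deepest molecular functions" is sketched below: keep only terms in the molecular_function namespace and return those at the greatest depth. This is an assumption about the selection rule, not a copy of findDeepestFunctions(); the helper name, the plain-dictionary records and their depth values are invented for illustration.

```python
def deepest_functions(terms):
    """Hypothetical helper: IDs of the molecular-function terms with maximal depth."""
    functions = [t for t in terms if t["namespace"] == "molecular_function"]
    if not functions:
        return []
    max_depth = max(t["depth"] for t in functions)
    # Several terms can share the maximal depth, so a list is returned.
    return [t["id"] for t in functions if t["depth"] == max_depth]


# Toy records with made-up depths.
toy_terms = [
    {"id": "F1", "namespace": "molecular_function", "depth": 2},
    {"id": "F2", "namespace": "molecular_function", "depth": 5},
    {"id": "F3", "namespace": "molecular_function", "depth": 5},
    {"id": "P1", "namespace": "biological_process", "depth": 9},
]

deepest = deepest_functions(toy_terms)
print(deepest)
```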
-
findDeepestCommonAncestor(terms, go)
Find the nearest common ancestor. Only the single most specific ancestor is returned, on the assumption that a unique one exists.
Parameters: - terms (tuple, list, ndarray) – a list of GO terms
- go (goatools.obo_parser.GODag) – object containing a gene ontology (GO) directed acyclic graph (DAG)
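The idea of a deepest common ancestor can be sketched on a toy graph without goatools: intersect the ancestor sets of all terms and pick the member with the greatest depth. The term names, the PARENTS mapping and the helper functions are hypothetical, and depth is taken here as the longest path to the root, which is an assumption of this sketch.

```python
# Toy ontology: each term maps to its direct parents (invented names).
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "ion_binding": ["binding"],
    "metal_binding": ["ion_binding"],
    "anion_binding": ["ion_binding"],
}


def ancestors(term, parents):
    """Return the term itself plus every term reachable through parent links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def depth(term, parents):
    """Length of the longest path from the term up to a root."""
    direct = parents[term]
    return 0 if not direct else 1 + max(depth(p, parents) for p in direct)


def deepest_common_ancestor(terms, parents):
    """Pick the deepest term shared by all ancestor sets (assumes it is unique)."""
    common = set.intersection(*(ancestors(t, parents) for t in terms))
    return max(common, key=lambda t: depth(t, parents))


print(deepest_common_ancestor(["metal_binding", "anion_binding"], PARENTS))
```

Including each term in its own ancestor set means that a term and one of its descendants resolve to the term itself.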
-
calcMinBranchLength(go_id1, go_id2, go)
Find the minimum branch length between two terms in the GO DAG.
Parameters: - go_id1 (str) – the first GO ID
- go_id2 (str) – the second GO ID
- go (goatools.obo_parser.GODag) – object containing a gene ontology (GO) directed acyclic graph (DAG)
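One plausible reading of "branch length", assumed here rather than taken from the implementation, is the number of steps from each term up to their deepest common ancestor, added together. The sketch below shows only that arithmetic on made-up depths; the DEPTH table, the term names and the helper are hypothetical, and the common ancestor is passed in rather than computed.

```python
# Made-up depths for a toy ontology (root at depth 0).
DEPTH = {
    "root": 0,
    "binding": 1,
    "catalysis": 1,
    "ion_binding": 2,
    "metal_binding": 3,
    "anion_binding": 3,
}


def min_branch_length(id1, id2, depth, common_ancestor):
    """Hypothetical helper: steps up from id1 plus steps up from id2."""
    up_from_first = depth[id1] - depth[common_ancestor]
    up_from_second = depth[id2] - depth[common_ancestor]
    return up_from_first + up_from_second


siblings = min_branch_length("metal_binding", "anion_binding", DEPTH, "ion_binding")
distant = min_branch_length("metal_binding", "catalysis", DEPTH, "root")
print(siblings, distant)
```

Depth differences equal path lengths only when the graph is tree-like along the relevant branches; in a DAG with several routes to the ancestor this is an approximation.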