MSA File
This module defines functions and classes for parsing, manipulating, and analyzing multiple sequence alignments.
class MSAFile(msa, mode='r', format=None, aligned=True, **kwargs)
Handle MSA files in FASTA, SELEX, CLUSTAL and Stockholm formats.
msa may be a filename or a stream. Multiple sequence alignments can be read from or written in FASTA (.fasta), Stockholm (.sth), CLUSTAL (.aln), or SELEX (.slx) format. For these extensions, the format argument is not needed. If aligned is True, unaligned sequences in the file or stream will raise an IOError exception. filter, a function that returns a boolean, can be used for filtering sequences; see MSAFile.setFilter() for details. slice can be used to slice sequences and is applied after filtering; see MSAFile.setSlice() for details.
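The extension rule above can be pictured as a small lookup table. The following is only a sketch: `guess_msa_format` and `EXTENSION_FORMATS` are hypothetical names, not part of the documented API, and looking past a trailing `.gz` is an assumption made here for illustration.

```python
import os

# Extension-to-format pairs taken from the description above.
EXTENSION_FORMATS = {
    '.fasta': 'FASTA',
    '.sth': 'Stockholm',
    '.aln': 'CLUSTAL',
    '.slx': 'SELEX',
}


def guess_msa_format(filename):
    """Return a format name for filename, or None if the extension is unknown.

    Hypothetical helper for illustration only.
    """
    root, ext = os.path.splitext(filename)
    if ext.lower() == '.gz':
        # Assumption: ignore a compression suffix and inspect the inner extension.
        root, ext = os.path.splitext(root)
    return EXTENSION_FORMATS.get(ext.lower())
```

When such a lookup returns None, the format would have to be passed explicitly, which is the case the format argument covers.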
setFilter(filter, filter_full=False)
Set the function used for filtering sequences. By default, filter is applied to the split sequence label. If filter_full is True, filter is applied to the full label.
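To make the difference between the split label and the full label concrete, here is a plain-Python sketch of the idea. It is not the library's code: `select_sequences` is a hypothetical helper, the `name/start-end` label style is assumed, and so is the convention that the filter receives a label and a sequence string.

```python
def select_sequences(records, keep, filter_full=False):
    """Yield (label, sequence) pairs for which keep(label, sequence) is true.

    records is an iterable of (full_label, sequence) pairs. With
    filter_full=False, keep only sees the part of the label before '/',
    e.g. 'KAPCA_HUMAN' for 'KAPCA_HUMAN/44-298' (assumed label style).
    """
    for full_label, seq in records:
        label = full_label if filter_full else full_label.split('/')[0]
        if keep(label, seq):
            yield full_label, seq


# Made-up records for illustration.
records = [
    ('KAPCA_HUMAN/44-298', 'GTGSFGRV'),
    ('CDK2_HUMAN/4-286', 'GEGTYGVV'),
    ('KAPCA_MOUSE/44-298', 'GTGSFGRV'),
]

# Filter on the split label: keep human sequences.
human_only = list(select_sequences(
    records, lambda label, seq: label.endswith('_HUMAN')))

# Filter on the full label: keep sequences covering residues 44-298.
by_range = list(select_sequences(
    records, lambda label, seq: label.endswith('/44-298'), filter_full=True))
```

The second predicate only works with filter_full=True, because the residue range is not part of the split label.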
setSlice(slice)
Set the object used to slice sequences, which may be a slice() or a list() of numbers.
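The two accepted kinds of object select alignment columns in slightly different ways. The sketch below shows this on a plain string; `apply_slice` is a hypothetical helper, and zero-based column positions are assumed.

```python
def apply_slice(sequence, index):
    """Return the columns of sequence selected by a slice or a list of ints."""
    if isinstance(index, slice):
        return sequence[index]
    # A list of numbers picks individual columns, in the order given.
    return ''.join(sequence[i] for i in index)


row = 'MK-TAYIAKQR'  # made-up aligned sequence, '-' marks a gap

first_five = apply_slice(row, slice(0, 5))   # contiguous block of columns
picked = apply_slice(row, [0, 3, 4, 10])     # arbitrary columns
```

A slice() suits contiguous regions such as a single domain, while a list of numbers suits scattered positions.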
closed
True if the file is closed.
format
Format of the MSA file.
splitSeqLabel(label)
Returns the label, starting residue number, and ending residue number parsed from a sequence label.
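As an illustration of what this parsing involves, the sketch below splits labels of the form `name/start-end`. It is a stand-in, not the library's implementation: `split_label` is a hypothetical name, the label style is assumed, and returning None for missing residue numbers is an assumption as well.

```python
import re

# Assumed label style: 'name/start-end', e.g. 'YKR5_CAEEL/25-84'.
_LABEL = re.compile(r'^(?P<name>.+)/(?P<start>\d+)-(?P<end>\d+)$')


def split_label(label):
    """Return (name, start, end); start and end are None when absent."""
    match = _LABEL.match(label.strip())
    if match is None:
        # Assumption: labels without a residue range are returned unchanged.
        return label, None, None
    return (match.group('name'),
            int(match.group('start')),
            int(match.group('end')))
```

For example, `split_label('YKR5_CAEEL/25-84')` yields the name together with the integers 25 and 84, while a bare label such as `'seq1'` comes back with two None values.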
parseMSA(filename, **kwargs)
Returns an MSA instance that stores the multiple sequence alignment and sequence labels parsed from filename, a file in Stockholm, SELEX, CLUSTAL, PIR, or FASTA format, which may be compressed. Uncompressed MSA files are parsed using C code in a fraction of the time it would take to parse compressed files in Python.
writeMSA(filename, msa, **kwargs)
Returns filename containing msa, an MSA or MSAFile instance, in the specified format, which can be SELEX, Stockholm, or FASTA. If compressed is True or filename ends with .gz, a compressed file will be written. MSA instances will be written into uncompressed files using a C function. CLUSTAL or PIR format files can also be written, using Python functions.
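The compression rule can be sketched with the standard library alone. The example below writes FASTA only; `write_fasta` is a hypothetical helper rather than the library's writer, and appending `.gz` when compressed is True is an assumption.

```python
import gzip


def write_fasta(filename, records, compressed=False):
    """Write (label, sequence) pairs to filename in FASTA format.

    Compresses when compressed is true or filename ends with '.gz',
    and returns the name of the file actually written.
    """
    if compressed and not filename.endswith('.gz'):
        filename += '.gz'  # assumption: add the suffix when compressing
    opener = gzip.open if filename.endswith('.gz') else open
    with opener(filename, 'wt') as out:
        for label, seq in records:
            out.write('>{}\n{}\n'.format(label, seq))
    return filename
```

Returning the final filename lets the caller see whether a `.gz` suffix was added.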