HiC

class HiC(title='Unknown', map=None, bin=None)[source]

This class is used to store and preprocess a Hi-C contact map. A GNM instance for analyzing the contact map can also be created with this class.

calcGNM(n_modes=None, **kwargs)[source]

Calculates GNM on the current Hi-C map. By default, n_modes is set to None and zeros to True.

getCompleteMap()[source]

Obtains the complete contact map with unmapped regions.

getDomainList()[source]

Returns a list of domain separations. The list has two columns: the first is for the domain starts and the second is for the domain ends.

getDomains()[source]

Returns a 1D numpy.ndarray whose length is the number of loci. Each element is the index of the domain to which the corresponding locus belongs.
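The per-locus labels returned by getDomains() and the two-column list returned by getDomainList() describe the same partition. The following is a minimal NumPy sketch of how one could be derived from the other; it is not ProDy's implementation. The helper name `labels_to_domain_list` is hypothetical, and treating the domain end as inclusive is an assumption.

```python
import numpy as np

def labels_to_domain_list(labels):
    """Collapse per-locus domain labels into rows of (start, end).

    Hypothetical helper for illustration; domain ends are inclusive
    here, which is an assumption and may differ from ProDy.
    """
    labels = np.asarray(labels)
    if labels.size == 0:
        return np.empty((0, 2), dtype=int)
    # Indices where the label changes relative to the previous locus
    breaks = np.flatnonzero(np.diff(labels) != 0) + 1
    starts = np.concatenate(([0], breaks))
    ends = np.concatenate((breaks - 1, [labels.size - 1]))
    return np.column_stack((starts, ends))

# Nine loci assigned to three consecutive domains
labels = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
domain_list = labels_to_domain_list(labels)
print(domain_list)
```

Each row of `domain_list` holds the first and last locus index of one domain.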

getKirchhoff()[source]

Builds a Kirchhoff matrix based on the contact map.
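One common way to build a Kirchhoff (graph Laplacian) matrix from contact data is to treat off-diagonal contacts as connection weights and place the weighted degree on the diagonal. The sketch below assumes that construction; the source does not state the exact formula, and `kirchhoff_from_contacts` is a hypothetical helper, not part of ProDy.

```python
import numpy as np

def kirchhoff_from_contacts(contact_map):
    """Return degree matrix minus off-diagonal contacts (assumed construction)."""
    contacts = np.asarray(contact_map, dtype=float)
    off_diag = contacts.copy()
    # Self-contacts do not contribute to connectivity between loci
    np.fill_diagonal(off_diag, 0.0)
    degree = np.diag(off_diag.sum(axis=0))
    return degree - off_diag

contact_map = np.array([[5.0, 2.0, 1.0],
                        [2.0, 4.0, 3.0],
                        [1.0, 3.0, 6.0]])
K = kirchhoff_from_contacts(contact_map)
print(K)
```

With this construction every row of `K` sums to zero, which is the defining property of a graph Laplacian.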

getTitle()[source]

Returns the title of the instance.

getTrimedMap()[source]

Obtains the contact map without unmapped regions.
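The relationship between the complete map and the trimmed map can be illustrated with a boolean mask. This sketch assumes that an unmapped locus is one whose row and column contain no contacts; the source does not define "unmapped", so the mask below is a hypothetical criterion, not ProDy's.

```python
import numpy as np

# Symmetric toy contact map; locus 1 has no contacts at all
complete = np.array([[4.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 0.0],
                     [1.0, 0.0, 3.0, 2.0],
                     [0.0, 0.0, 2.0, 5.0]])

# Assumed criterion: a locus is mapped if it has at least one contact
mapped = complete.sum(axis=0) > 0

# Trimmed map: keep only rows and columns of mapped loci
trimmed = complete[np.ix_(mapped, mapped)]

# Complete map: put the trimmed values back, leaving unmapped loci as zeros
restored = np.zeros_like(complete)
restored[np.ix_(mapped, mapped)] = trimmed

print(trimmed.shape, restored.shape)
```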

normalize(method=<function VCnorm>, **kwargs)[source]

Applies chosen normalization on the current Hi-C map.
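The default method is VCnorm. Assuming this refers to vanilla coverage normalization, in which each entry is divided by the sum of its row and the sum of its column, the idea can be sketched as follows. The helper `vc_normalize` is hypothetical, and any rescaling that ProDy may apply afterwards is not reproduced here.

```python
import numpy as np

def vc_normalize(contact_map):
    """Divide each entry by its row sum and column sum (assumed VC scheme)."""
    m = np.asarray(contact_map, dtype=float)
    row_sums = m.sum(axis=1)
    col_sums = m.sum(axis=0)
    # Loci with zero coverage get weight 0 to avoid division by zero
    inv_rows = np.divide(1.0, row_sums,
                         out=np.zeros_like(row_sums), where=row_sums != 0)
    inv_cols = np.divide(1.0, col_sums,
                         out=np.zeros_like(col_sums), where=col_sums != 0)
    return m * inv_rows[:, None] * inv_cols[None, :]

contact_map = np.array([[2.0, 2.0],
                        [2.0, 6.0]])
normalized = vc_normalize(contact_map)
print(normalized)
```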

setDomains(labels, **kwargs)[source]

Uses spectral clustering to identify structural domains on the chromosome.

Parameters:
  • labels (ndarray, list) – domain labels
  • method (func) – Label assignment algorithm used after Laplacian embedding.

setTitle(title)[source]

Sets the title of the instance.

view(spec='p', **kwargs)[source]

Visualizes the Hi-C map and domains (if present). The function makes use of showMatrix().

Parameters:
  • spec (str) – a string specifying how to preprocess the matrix: blank for no preprocessing; ‘p’ to show only data from the p-th to the (100-p)-th percentile; ‘_’ to suppress creating a new figure and paint to the current one instead. The specifications can be applied sequentially, e.g. ‘p_’.
  • p (double) – specifies the percentile threshold.
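The percentile preprocessing described for ‘p’ can be approximated with plain NumPy. This is a sketch under assumptions: whether out-of-range values are clipped or masked is not stated in the source, the helper `clip_to_percentiles` is hypothetical, and the default of 5 used here is chosen only for illustration.

```python
import numpy as np

def clip_to_percentiles(matrix, p=5):
    """Limit values to the [p, 100 - p] percentile range (assumed behavior)."""
    matrix = np.asarray(matrix, dtype=float)
    low, high = np.percentile(matrix, [p, 100 - p])
    # Values outside the range are replaced by the nearest bound
    return np.clip(matrix, low, high)

# 101 evenly spaced values make the percentiles easy to read off
values = np.arange(101, dtype=float).reshape(1, -1)
clipped = clip_to_percentiles(values, p=5)
print(clipped.min(), clipped.max())
```

Restricting the range this way keeps a few extreme contact counts from dominating the color scale.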

parseHiC(filename, **kwargs)[source]

Returns an HiC instance parsed from a Hi-C data file.

This function extends parseHiCStream().

Parameters:
  • filename (str) – the path to the Hi-C data file.

parseHiCStream(stream, **kwargs)[source]

Returns an HiC instance parsed from a stream of Hi-C data lines.

Parameters:
  • stream – anything that implements the read and seek methods (e.g. file, buffer, stdin).

saveHiC(hic, filename=None, map=True, **kwargs)[source]

Saves HiC model data as filename.hic.npz. If map is False, the Hi-C contact map will not be saved and can be loaded from the raw data file later. If filename is None, the name of the Hi-C instance will be used as the filename, after " " (white spaces) in the name are replaced with "_" (underscores). Upon successful completion of saving, filename is returned. This function makes use of the numpy.savez() function.

loadHiC(filename)[source]

Returns an HiC instance loaded from the file filename. This function makes use of the numpy.load() function. See also saveHiC().
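The naming rule used by saveHiC() and the numpy.savez()/numpy.load() round trip can be sketched without ProDy. The helper `default_hic_filename` and the archive keys (`title`, `bin`, `map`) are hypothetical; the attributes that ProDy actually stores in the archive are not documented here.

```python
import os
import tempfile
import numpy as np

def default_hic_filename(title):
    """Replace spaces with underscores and add the .hic.npz suffix."""
    return title.replace(' ', '_') + '.hic.npz'

title = 'chr 17 sample'
contact_map = np.eye(3)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, default_hic_filename(title))
    # Store a few named arrays in a single .npz archive (keys are assumed)
    np.savez(path, title=title, bin=5000, map=contact_map)
    # Load the archive back and read the stored entries
    with np.load(path) as data:
        loaded_title = str(data['title'])
        loaded_map = data['map']

print(default_hic_filename(title), loaded_title)
```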

writeMap(filename, map, bin=None, format='%f')[source]

Writes map to the file designated by filename.

Parameters:
  • filename (str) – the file to be written.
  • map (numpy.ndarray) – a Hi-C contact map.
  • bin (int) – bin size of the map. If bin is None, map will be written in full matrix format.
  • format (str) – output format for map elements.
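The format parameter is a printf-style specifier applied to each map element. Assuming that full matrix format means one whitespace-delimited row per line (the source does not specify the layout), numpy.savetxt() produces comparable output. This sketch writes to an in-memory buffer and does not call writeMap().

```python
import io
import numpy as np

contact_map = np.array([[1.0, 0.5],
                        [0.5, 2.0]])

# Write the full matrix, one row per line, using the '%f' element format
buffer = io.StringIO()
np.savetxt(buffer, contact_map, fmt='%f')
text = buffer.getvalue()
print(text)

# Reading the text back recovers the same matrix
reloaded = np.loadtxt(io.StringIO(text))
```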