GarNet 0.5.0

GarNet is an epigenomics data exploration tool. For more information about the scientific uses of GarNet, please see the Readme. To report an error, please refer to the Issues.

Some important notes:

  • In order to map peaks, a genome annotation file is required, processed into a “GarNet File”. There is such a file for hg19, mm9, and mm10.
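
As a rough sketch of how the three functions documented below fit together, the wrapper here chains them in order. The wrapper name `run_garnet_pipeline`, its argument names, and every file path passed to it are hypothetical; only the three `garnet.*` calls and their argument order come from the signatures on this page. It assumes the package imports as `garnet`, as those signatures suggest.

```python
def run_garnet_pipeline(reference_bed, motif_bed, garnet_out,
                        peaks_bed, expression_tsv, output_dir=None):
    """Chain the three documented GarNet calls (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without GarNet installed.
    import garnet

    # Step 1: build the GarNet file from a reference gene BED and a motifs BED.
    garnet.construct_garnet_file(reference_bed, motif_bed, garnet_out)

    # Step 2: find motifs and associated genes local to the peaks.
    motifs_and_genes = garnet.map_peaks(peaks_bed, garnet_out)

    # Step 3: regress gene expression on motif strength for each TF.
    return garnet.TF_regression(motifs_and_genes, expression_tsv,
                                output_dir=output_dir)
```

If a GarNet file already exists for your genome (hg19, mm9, or mm10), step 1 would presumably be skipped and the existing file path passed straight to `map_peaks`.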

garnet.construct_garnet_file(reference_file, motif_file_or_files, output_file, options={})

Construct the GarNet file by searching for motifs that lie within a certain window of a reference TSS (transcription start site). Generate a dataframe with these associations, together with the distance between each motif and its gene's TSS, and write the dataframe to the specified output file.

Parameters:
  • reference_file (str) – path to the reference gene BED file
  • motif_file_or_files (str or list) – path to the motifs BED file, or a list of such paths
  • output_file (str) – output GarNet file path
  • options (dict) – currently not used
Returns:

motif-gene associations and distance between the two.

Return type:

pd.DataFrame
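
To make the windowing idea concrete, here is a minimal pure-Python sketch of pairing motifs with nearby TSSs. It is not GarNet's implementation: the function name `motifs_near_tss`, the 2000 bp default window, the use of the motif midpoint, and the tuple layouts are all assumptions made for illustration, and strand is ignored.

```python
def motifs_near_tss(genes, motifs, window=2000):
    """Pair each motif with every gene whose TSS lies within `window` bp.

    genes:  iterable of (chrom, tss_position, gene_name)
    motifs: iterable of (chrom, start, end, motif_name)
    """
    rows = []
    for m_chrom, m_start, m_end, m_name in motifs:
        m_center = (m_start + m_end) // 2  # assumed reference point on the motif
        for g_chrom, tss, g_name in genes:
            if g_chrom != m_chrom:
                continue
            # Signed distance: negative when the motif sits at a lower coordinate.
            distance = m_center - tss
            if abs(distance) <= window:
                rows.append({"motif": m_name, "gene": g_name,
                             "distance": distance})
    return rows


genes = [("chr1", 1000, "GENE_A"), ("chr1", 50000, "GENE_B"),
         ("chr2", 1200, "GENE_C")]
motifs = [("chr1", 1400, 1420, "MOTIF_X"), ("chr2", 9000, 9010, "MOTIF_Y")]
associations = motifs_near_tss(genes, motifs)
```

With this toy input only MOTIF_X and GENE_A are paired, because every other motif-gene combination is either on a different chromosome or farther apart than the window.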

garnet.map_peaks(peaks_filepath_or_list_of_peaks_filepaths, garnet_filepath)

Find motifs and associated genes local to peaks.

This function intersects peaks from an epigenomics dataset with TF motifs.

Parameters:
  • peaks_filepath_or_list_of_peaks_filepaths (str or list) – filepath of the peaks file, or list of such paths
  • garnet_filepath (str) – filepath to the garnet file.
Returns:

a dataframe whose rows pair transcription factor binding motifs with nearby genes, restricted to motifs and genes that were found near a peak.

Return type:

pd.DataFrame
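
The intersection step can be pictured with a small pure-Python sketch. It assumes that "local to a peak" means the motif interval overlaps the peak interval, which this page does not confirm; the function name `motifs_in_peaks` and the dictionary keys are likewise hypothetical.

```python
def motifs_in_peaks(peaks, associations):
    """Keep motif-gene rows whose motif interval overlaps at least one peak.

    peaks:        iterable of (chrom, start, end), half-open as in BED
    associations: iterable of dicts with keys
                  chrom, motif_start, motif_end, motif, gene
    """
    # Group peaks by chromosome so each row is compared only to relevant peaks.
    peaks_by_chrom = {}
    for chrom, start, end in peaks:
        peaks_by_chrom.setdefault(chrom, []).append((start, end))

    kept = []
    for row in associations:
        for p_start, p_end in peaks_by_chrom.get(row["chrom"], []):
            # Two half-open intervals overlap when each starts before the other ends.
            if row["motif_start"] < p_end and p_start < row["motif_end"]:
                kept.append(row)
                break  # one overlapping peak is enough
    return kept


peaks = [("chr1", 1300, 1500), ("chr2", 100, 200)]
associations = [
    {"chrom": "chr1", "motif_start": 1400, "motif_end": 1420,
     "motif": "MOTIF_X", "gene": "GENE_A"},
    {"chrom": "chr1", "motif_start": 5000, "motif_end": 5010,
     "motif": "MOTIF_Y", "gene": "GENE_B"},
]
near_peaks = motifs_in_peaks(peaks, associations)
```

Only the MOTIF_X row survives here, since MOTIF_Y does not overlap any peak on chr1.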

garnet.TF_regression(motifs_and_genes_file_or_dataframe, expression_file, output_dir=None)

Do linear regression of the expression of genes versus the strength of the associated transcription factor binding motifs and report results.

This function parses an expression file of two columns: gene symbol and expression value, and merges the expression profile into the motifs and genes file, resulting in information about transcription factor binding motifs local to genes, and those genes’ expressions. We do linear regression, and if an output directory is provided, we output a plot for each TF and an html summary of the regressions.

Parameters:
  • motifs_and_genes_file_or_dataframe (str or dataframe) – the output of map_peaks, either as a dataframe or a file
  • expression_file (str or FILE) – a tsv file of expression data, with geneName, score columns
  • output_dir (str) – if you would like to output figures and a summary HTML page, supply an output directory
Returns:

slope, pval, and gene targets for each transcription factor.

Return type:

pd.DataFrame
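
For intuition about the per-TF regression, the following pure-Python sketch fits an ordinary least-squares slope of expression against motif score for each transcription factor. It illustrates the general technique only and is not GarNet's code: the function name `slope_per_tf`, the row layout, and the use of a single "motif score" as the strength measure are assumptions, and the p-value reported by `TF_regression` is not computed here.

```python
from collections import defaultdict


def slope_per_tf(rows):
    """Least-squares slope of expression on motif score, fitted per TF.

    rows: iterable of (tf_name, gene_name, motif_score, expression)
    """
    by_tf = defaultdict(list)
    for tf, gene, score, expr in rows:
        by_tf[tf].append((gene, score, expr))

    results = {}
    for tf, points in by_tf.items():
        n = len(points)
        mean_x = sum(p[1] for p in points) / n
        mean_y = sum(p[2] for p in points) / n
        sxx = sum((p[1] - mean_x) ** 2 for p in points)
        if n < 2 or sxx == 0:
            continue  # slope is undefined without variation in motif score
        sxy = sum((p[1] - mean_x) * (p[2] - mean_y) for p in points)
        results[tf] = {"slope": sxy / sxx,
                       "targets": [p[0] for p in points]}
    return results


rows = [("TF1", "G1", 1.0, 2.0), ("TF1", "G2", 2.0, 4.0),
        ("TF1", "G3", 3.0, 6.0), ("TF2", "G4", 1.0, 5.0)]
fits = slope_per_tf(rows)
```

In this toy data TF1's expression doubles with motif score, so its fitted slope is 2, while TF2 has a single target and is skipped because no line can be fitted to one point.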