Dataset Information

MetageneCluster: a Python package for filtering conflicting signal trends in metagene plots.


ABSTRACT:

Background

Metagene plots provide a visualization of biological signal trends over subsections of the genome and are used to perform high-level analysis of experimental data by aggregating genome-level data to create an average profile. The generation of metagene plots is useful for summarizing the results of many sequencing-based applications. Despite their prevalence and utility, the standard metagene plot is blind to conflicting signals within data. If multiple distinct trends occur, they can interact destructively, creating a plot that does not accurately represent any of the underlying trends.
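The destructive interaction described above can be illustrated with a toy example. This is a minimal sketch on made-up data, not output from MetageneCluster; the function name `metagene_profile` and the signal values are hypothetical and chosen only to show how two opposite trends cancel when averaged.

```python
# Hypothetical toy data: each row is one genomic region, each column one position bin.
rising = [[0.0, 1.0, 2.0, 3.0, 4.0]] * 3   # signal increases across the region
falling = [[4.0, 3.0, 2.0, 1.0, 0.0]] * 3  # signal decreases across the region


def metagene_profile(regions):
    """Average the signal per bin across all regions (a standard metagene profile)."""
    n = len(regions)
    return [sum(column) / n for column in zip(*regions)]


# Averaging both groups together yields a flat line that matches neither trend.
combined = metagene_profile(rising + falling)
print(combined)  # [2.0, 2.0, 2.0, 2.0, 2.0]
```

The combined profile is flat even though no individual region is flat, which is the failure mode the abstract attributes to the standard metagene plot.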

Results

We present MetageneCluster, a Python tool to generate a collection of representative metagene plots based on k-means clustering of genomic regions of interest. Clustering the data by similarity allows us to identify patterns within the features of interest. We are then able to summarize each pattern present in the data, rather than averaging across the entire feature space. We show that our method performs well when used to identify conflicting signals in real-world genome-level data.
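The idea of clustering regions before averaging can be sketched as follows. This is an illustrative, standard-library-only k-means on synthetic data; it is not the MetageneCluster API or the authors' implementation. All names (`kmeans`, `init_centroids`, `mean_profile`), the farthest-point seeding, the noise level, and the choice of k = 2 are assumptions made for the example.

```python
import random


def mean_profile(rows):
    """Average signal per position bin across a set of regions."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two signal profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(rows, k):
    """Farthest-point seeding (an assumption, used here only for determinism)."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(rows, k, max_iter=100):
    """Plain Lloyd-style k-means; returns one label per row and the cluster means."""
    centroids = init_centroids(rows, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(r, centroids[c])) for r in rows
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = mean_profile(members)
    return labels, centroids


# Synthetic regions: ten with a rising signal, ten with a falling signal, plus noise.
rng = random.Random(0)


def noisy(base):
    return [v + rng.gauss(0, 0.2) for v in base]


regions = [noisy([0, 1, 2, 3, 4]) for _ in range(10)]
regions += [noisy([4, 3, 2, 1, 0]) for _ in range(10)]

labels, cluster_profiles = kmeans(regions, k=2)
overall = mean_profile(regions)  # single metagene profile over all regions

print("overall:", [round(v, 2) for v in overall])
for i, profile in enumerate(cluster_profiles):
    print(f"cluster {i}:", [round(v, 2) for v in profile])
```

In this sketch the overall profile stays close to flat, while each per-cluster profile recovers one of the two underlying trends. In practice the number of clusters has to be chosen for the data at hand.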

Conclusions

Overall, MetageneCluster is a user-friendly tool for the creation of metagene plots that capture distinct patterns in underlying sequence data.

SUBMITTER: Carter C 

PROVIDER: S-EPMC10785526 | biostudies-literature | 2024 Jan

REPOSITORIES: biostudies-literature

Publications

MetageneCluster: a Python package for filtering conflicting signal trends in metagene plots.

Clayton Carter, Aaron Saporito, Stephen M Douglass

BMC Bioinformatics, 2024 Jan 12, issue 1



Similar Datasets

| S-EPMC9692103 | biostudies-literature
| S-EPMC7214040 | biostudies-literature
| S-EPMC10085746 | biostudies-literature
| S-EPMC10385924 | biostudies-literature
| S-EPMC9810194 | biostudies-literature
| S-EPMC3765848 | biostudies-literature
| S-EPMC7597035 | biostudies-literature
| S-EPMC8138882 | biostudies-literature
| S-EPMC8275978 | biostudies-literature
| S-EPMC10997433 | biostudies-literature