Dataset Information

An integrative method for accurate comparative genome mapping.


ABSTRACT: We present MAGIC, an integrative and accurate method for comparative genome mapping. Our method consists of two phases: preprocessing, which identifies "maximal similar segments," and mapping, which clusters and classifies these segments. MAGIC's main novelty lies in its biologically intuitive clustering approach, which aims both to compute reorder-free segments and to identify orthologous segments. In the process, MAGIC efficiently handles ambiguities resulting from duplications that occurred before the speciation of the considered organisms from their most recent common ancestor. We demonstrate MAGIC's robustness and scalability: the former with respect to its initial input and its parameter values, the latter by applying MAGIC to distantly related organisms and to large genomes. We compare MAGIC to other comparative mapping methods and provide a detailed analysis of the differences between them. Our improvements allow a comprehensive study of the diversity of genetic repertoires resulting from large-scale mutations, such as indels and duplications, explicitly including transposable and phagic elements. The strength of our method is demonstrated by detailed statistics computed for each type of large-scale mutation. MAGIC enabled us to conduct a comprehensive analysis of the different forces shaping prokaryotic genomes from different clades, and to quantify the importance of novel gene content introduced by horizontal gene transfer relative to gene duplication in bacterial genome evolution. We use these results to investigate the breakpoint distribution in several prokaryotic genomes.

PROVIDER: S-EPMC1526463 | BioStudies

REPOSITORIES: biostudies

Similar Datasets

| S-EPMC7025897 | BioStudies
| S-EPMC8175051 | BioStudies
| S-EPMC5714861 | BioStudies
| S-EPMC2791625 | BioStudies
| S-EPMC4174956 | BioStudies
| S-EPMC1434772 | BioStudies
| S-EPMC2897355 | BioStudies
| S-EPMC5903213 | BioStudies
| S-EPMC514399 | BioStudies
| S-EPMC3590777 | BioStudies