Dataset Information

A Novel Method for Alignment-free DNA Sequence Similarity Analysis Based on the Characterization of Complex Networks.


ABSTRACT: Determination of sequence similarity is one of the major steps in computational phylogenetic studies. One of the major tasks of computational biologists is to develop novel mathematical descriptors for similarity analysis. DNA clustering is an important technology that automatically identifies inherent relationships among large-scale DNA sequences. The comparison between the DNA sequences of different species helps determine phylogenetic relationships among species. Alignment-free approaches have continuously gained interest in various sequence analysis applications such as phylogenetic inference and metagenomic classification/clustering, particularly for large-scale sequence datasets. Here, we construct a novel and simple mathematical descriptor based on the characterization of cis sequence complex DNA networks. This new approach is based on a code of three cis nucleotides in a gene that could code for an amino acid. In particular, for each DNA sequence, we will set up a cis sequence complex network that will be used to develop a characterization vector for the analysis of mitochondrial DNA sequence phylogenetic relationships among nine species. The resulting phylogenetic relationships among the nine species were determined to be in agreement with the actual situation.
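The abstract does not spell out how the network or the characterization vector is built, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that the nodes are the 64 nucleotide triplets, that a weighted edge links each triplet to the overlapping triplet that follows it in the sequence, that the characterization vector is the normalized weighted degree of each node, and that sequences are compared by Euclidean distance. The names `triplet_network`, `characterization_vector`, and `distance` are hypothetical.

```python
from collections import Counter
from itertools import product
import math

NUCLEOTIDES = "ACGT"
# The 64 possible triplets serve as network nodes (assumed construction).
TRIPLETS = ["".join(p) for p in product(NUCLEOTIDES, repeat=3)]


def triplet_network(seq):
    """Count edges between consecutive overlapping triplets.

    Assumption: an edge (a, b) is added each time triplet b starts one
    position after triplet a. Windows with non-ACGT symbols are skipped.
    """
    seq = seq.upper()
    triplets = [seq[i:i + 3] for i in range(len(seq) - 2)]
    edges = Counter()
    for a, b in zip(triplets, triplets[1:]):
        if set(a + b) <= set(NUCLEOTIDES):
            edges[(a, b)] += 1
    return edges


def characterization_vector(edges):
    """Return a 64-dimensional vector of normalized weighted node degrees.

    Assumption: degree (in plus out, weighted) is used as the node
    descriptor; the paper may use different network measures.
    """
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    total = sum(degree.values())
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [degree[t] / total for t in TRIPLETS]


def distance(u, v):
    """Euclidean distance between two characterization vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


if __name__ == "__main__":
    # Toy sequences for illustration only, not real mitochondrial data.
    v1 = characterization_vector(triplet_network("ACGTACGTTAGC"))
    v2 = characterization_vector(triplet_network("ACGTACGATAGC"))
    print(distance(v1, v2))
```

Under these assumptions, the pairwise distances between species could be collected into a matrix and passed to a standard tree-building method; which method the authors used is not stated in the abstract.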

SUBMITTER: Zhou J 

PROVIDER: S-EPMC5054945 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature

Publications

A Novel Method for Alignment-free DNA Sequence Similarity Analysis Based on the Characterization of Complex Networks.

Zhou Jie, Zhong Pianyu, Zhang Tinghui

Evolutionary Bioinformatics Online, 2016-10-06


Similar Datasets

| S-EPMC6403383 | biostudies-literature
| S-EPMC4410667 | biostudies-literature
| S-EPMC3384675 | biostudies-literature
| S-EPMC6355110 | biostudies-literature
| S-EPMC4427953 | biostudies-literature
| S-EPMC9436379 | biostudies-literature
| S-EPMC3799466 | biostudies-literature
| S-EPMC8000461 | biostudies-literature
| S-EPMC5870879 | biostudies-literature
| S-EPMC2808352 | biostudies-literature