Dataset Information


Positional homology in bacterial genomes.

ABSTRACT: In comparative genomic studies, syntenic groups of homologous sequences in the same order have been used as supplementary information to help determine the orthology of the compared sequences. The assumption is that orthologous gene copies are more likely to share the same genome positions and the same gene neighbors. In this study we defined positional homologs as homologs that also have homologous neighboring genes, and we investigated the usefulness of this distinction for bacterial comparative genomics. We considered the identification of positionally homologous gene pairs in bacterial genomes using protein and DNA sequence level alignments and found that positional homologs had, on average, lower rates of substitution at the DNA level (synonymous substitutions) than duplicate homologs in different genomic locations, regardless of the level of protein sequence divergence (measured by the non-synonymous substitution rate). Since gene order conservation can indicate the accuracy of orthology assignments, we also considered the effect of imposing certain alignment quality requirements on the sensitivity and specificity of protein pair identification by BLAST and FASTA when neighboring information is not available and in comparisons where gene order is not conserved. We found that adding a stringency filter based on the second-best hits was an efficient way to remove dubious ortholog identifications in BLAST and FASTA analyses. Gene order conservation and DNA sequence homology are useful to consider in comparative genomic studies, as they may indicate different orthology assignments than protein sequence homology alone.
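
The definition of positional homologs given in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `positional_homologs`, the input layout (two ordered gene lists plus a set of homologous pairs), the choice of immediately adjacent genes as "neighbors", and the treatment of each genome as linear rather than circular are all assumptions made for illustration.

```python
from typing import List, Set, Tuple


def positional_homologs(
    order_a: List[str],
    order_b: List[str],
    homologs: Set[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return homologous pairs (a, b) that also have a homologous neighbor pair.

    Assumptions (hypothetical, not from the source): neighbors are the
    immediately adjacent genes, and gene order is treated as linear.
    """
    # Map each gene name to its index in the genome order.
    pos_a = {gene: i for i, gene in enumerate(order_a)}
    pos_b = {gene: i for i, gene in enumerate(order_b)}

    def neighbors(order: List[str], i: int) -> List[str]:
        # Genes directly before and after position i, if they exist.
        return [order[j] for j in (i - 1, i + 1) if 0 <= j < len(order)]

    result = []
    for a, b in sorted(homologs):
        if a not in pos_a or b not in pos_b:
            continue  # skip pairs that mention genes absent from the orders
        near_a = neighbors(order_a, pos_a[a])
        near_b = neighbors(order_b, pos_b[b])
        # Positional homolog: some neighbor of a is homologous to some neighbor of b.
        if any((x, y) in homologs for x in near_a for y in near_b):
            result.append((a, b))
    return result


# Toy data: a4/b3 are homologous but their neighbors are not.
order_a = ["a1", "a2", "a3", "a4"]
order_b = ["b1", "b2", "b3", "b4"]
homologs = {("a1", "b1"), ("a2", "b2"), ("a4", "b3")}
pairs = positional_homologs(order_a, order_b, homologs)
print(pairs)  # [('a1', 'b1'), ('a2', 'b2')]
```

In this toy example the pair a4/b3 is reported as homologous but not positionally homologous, which corresponds to the "duplicate homologs in different genomic locations" contrasted in the abstract. A real analysis would derive the homolog set from BLAST or FASTA hits and would need to handle circular chromosomes and larger neighborhoods.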


PROVIDER: S-EPMC2674667 | BioStudies | 2007-01-01

REPOSITORIES: biostudies

Similar Datasets

1000-01-01 | S-EPMC3334569 | BioStudies
1000-01-01 | S-EPMC2590594 | BioStudies
2014-01-01 | S-EPMC4138177 | BioStudies
2007-01-01 | S-EPMC1933189 | BioStudies
2007-01-01 | S-EPMC1810246 | BioStudies
2009-01-01 | S-EPMC2807883 | BioStudies
1000-01-01 | S-EPMC5013910 | BioStudies
2017-01-01 | S-EPMC5446766 | BioStudies
2000-01-01 | S-EPMC310950 | BioStudies
2013-01-01 | S-EPMC3834790 | BioStudies