Dataset Information

Limitations of the rhesus macaque draft genome assembly and annotation.


ABSTRACT: Finished genome sequences and assemblies are available for only a few vertebrates. Thus, investigators studying many species must rely on draft genomes. Using the rhesus macaque as an example, we document the effects of sequencing errors, gaps in sequence and misassemblies on one automated gene model pipeline, Gnomon. The combination of draft genome with automated gene finding software can result in spurious sequences. We estimate that approximately 50% of the rhesus gene models are missing, incomplete or incorrect. The problems identified in this work likely apply to all draft vertebrate genomes annotated with any automated gene model pipeline and thus represent a pervasive challenge to the analysis of draft genomes.

SUBMITTER: Zhang X 

PROVIDER: S-EPMC3426473 | biostudies-literature | 2012 May

REPOSITORIES: biostudies-literature

Publications

Limitations of the rhesus macaque draft genome assembly and annotation.

Zhang X, Goodsell J, Norgren RB

BMC Genomics, 2012-05-30


Similar Datasets

| S-EPMC4214606 | biostudies-literature
| S-EPMC8717446 | biostudies-literature
| S-EPMC4578894 | biostudies-literature
| S-EPMC311045 | biostudies-literature
| S-EPMC2739830 | biostudies-literature
| S-EPMC6813658 | biostudies-literature
2020-05-04 | PXD018943 | Pride
| S-EPMC6141548 | biostudies-literature