Dataset Information

Sequencing, analysis, and annotation of expressed sequence tags for Camelus dromedarius.

ABSTRACT: Despite its economic, cultural, and biological importance, there has not been a large-scale sequencing project to date for Camelus dromedarius. With the goal of sequencing the complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera checking, repeat masking, clustering, and assembly, we obtained 23,602 putative gene sequences, of which over 4,500 potentially novel or fast-evolving gene sequences show no homology to other available genomes. Sequences with similarities in nucleotide and protein databases were functionally annotated using Gene Ontology classification. Comparison to available full-length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show that more than 80% of the contigs have an ORF > 300 bp and that approximately 40% of hits extend to the start codons of full-length cDNAs, suggesting successful characterization of camel genes. Similarity analyses were performed separately for different organisms, including human, mouse, bovine, and rat. The accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools, with the possibility of adding sequences from the public domain. We anticipate that our results will provide a home base for genomic studies of camel and other comparative studies, enabling a starting point for whole genome sequencing of the organism.

SUBMITTER: Al-Swailem AM

PROVIDER: S-EPMC2873428 | biostudies-literature | 2010

REPOSITORIES: biostudies-literature

Publications

Sequencing, analysis, and annotation of expressed sequence tags for Camelus dromedarius.

Al-Swailem AM, Shehata MM, Abu-Duhier FM, Al-Yamani EJ, Al-Busadah KA, Al-Arawi MS, Al-Khider AY, Al-Muhaimeed AN, Al-Qahtani FH, Manee MM, Al-Shomrani BM, Al-Qhtani SM, Al-Harthi AS, Akdemir KC, Inan MS, Otu HH

PLoS ONE. 2010 May 19; 5

PMID: 20502665

Similar Datasets

Project description: Mice of the genus Peromyscus are found in nearly every habitat from Alaska to Central America and from the Atlantic to the Pacific. They provide an evolutionary outgroup to the Mus/Rattus lineage and serve as an intermediary between that lineage and humans. Although Peromyscus has been studied extensively under both field and laboratory conditions, research has been limited by the lack of molecular resources. Genes associated with reproduction typically evolve rapidly and thus are excellent sources of evolutionary information. In this study we describe the generation of two cDNA libraries, one from placenta and one from testis, characterize the resulting ESTs, and describe their utility for mapping the Peromyscus genome. The 5' ends of 1,510 placenta and 4,798 testis clones were sequenced. Low quality sequences were removed and after clustering and contig assembly, 904 unique placenta and 2,002 unique testis sequences remained. Average lengths of placenta and testis ESTs were 711 bp and 826 bp, respectively. Approximately 82% of all ESTs were identified using the BLASTX algorithm to Mus and Rattus, and 34-54% of all ESTs could be assigned to a biological process gene ontology category in either Mus or Rattus. Because the Peromyscus genome organization resembles the Rattus genome more closely than Mus, we examined the distribution of the Peromyscus ESTs across the rat genome, finding markers on all rat chromosomes except the Y. Approximately 40% of all ESTs were specific to only one location in the Mus genome and spanned introns of an appropriate size for sequencing and SNP detection. Of the primers that were tried, 54% provided useful assays for genotyping on interspecific backcross and whole-genome radiation hybrid cell panels. The 2,906 Peromyscus placenta and testis ESTs described here significantly expand the molecular resources available for the genus. These ESTs allow for specific PCR amplification and broad coverage across the genome, creating an excellent genetic marker resource for the generation of a medium-density genomic map. Thus, this resource will significantly aid research of a genus that is uniquely well-suited to both laboratory and field research.

Project description: Background: Sheep scab is caused by Psoroptes ovis and is arguably the most important ectoparasitic disease affecting sheep in the UK. The disease is highly contagious, causes considerable pruritus and irritation, and is therefore a major welfare concern. Current methods of treatment are unsustainable and in order to elucidate novel methods of disease control a more comprehensive understanding of the parasite is required. To date, no full genomic DNA sequence or large scale transcript datasets are available and prior to this study only 484 P. ovis expressed sequence tags (ESTs) were accessible in public databases. Results: In order to further expand upon the transcriptomic coverage of P. ovis, thus facilitating novel insights into the mite biology, we undertook a larger scale EST approach, incorporating newly generated and previously described P. ovis transcript data and representing the largest collection of P. ovis ESTs to date. We sequenced 1,574 ESTs and assembled these along with 484 previously generated P. ovis ESTs, which resulted in the identification of 1,545 unique P. ovis sequences. BLASTX searches identified 961 ESTs with significant hits (E-value < 1E-04) and 584 novel P. ovis ESTs. Gene Ontology (GO) analysis allowed the functional annotation of 880 ESTs and included predictions of signal peptide and transmembrane domains, allowing the identification of potential P. ovis excreted/secreted factors and the mapping of metabolic pathways. Conclusions: This dataset currently represents the largest collection of P. ovis ESTs, all of which are publicly available in the GenBank EST database (dbEST) (accession numbers FR748230 - FR749648). Functional analysis of this dataset identified important homologues, including house dust mite allergens and tick salivary factors. These findings offer new insights into the underlying biology of P. ovis, facilitating further investigations into mite biology and the identification of novel methods of intervention.

Project description: BACKGROUND: Invasive species pose a significant threat to global economies, agriculture and biodiversity. Despite progress towards understanding the ecological factors associated with plant invasions, limited genomic resources have made it difficult to elucidate the evolutionary and genetic factors responsible for invasiveness. This study presents the first expressed sequence tag (EST) collection for Senecio madagascariensis, a globally invasive plant species. METHODS: We used pyrosequencing of one normalized and two subtractive libraries, derived from one native and one invasive population, to generate an EST collection. ESTs were assembled into contigs, annotated by BLAST comparison with the NCBI non-redundant protein database and assigned gene ontology (GO) terms from the Plant GO Slim ontologies. KEY RESULTS: Assembly of the 221,746 sequence reads resulted in 12,442 contigs. Over 50% (6183) of 12,442 contigs showed significant homology to proteins in the NCBI database, representing approx. 4800 independent transcripts. The molecular transducer GO term was significantly over-represented in the native (South African) subtractive library compared with the invasive (Australian) library. Based on NCBI BLAST hits and literature searches, 40% of the molecular transducer genes identified in the South African subtractive library are likely to be involved in response to biotic stimuli, such as fungal, bacterial and viral pathogens. CONCLUSIONS: This EST collection is the first representation of the S. madagascariensis transcriptome and provides an important resource for the discovery of candidate genes associated with plant invasiveness. The over-representation of molecular transducer genes associated with defence responses in the native subtractive library provides preliminary support for aspects of the enemy release and evolution of increased competitive ability hypotheses in this successful invasive. This study highlights the contribution of next-generation sequencing to better understanding the molecular mechanisms underlying ecological hypotheses that are important in successful plant invasions.
