Project description:A metaproteomics analysis was conducted on the infant fecal microbiome to characterize global protein expression in 8 samples obtained from infants with a range of early-life experiences. Samples spanned feeding type (breast-, formula-, or mixed-fed), mode of delivery, and antibiotic treatment, and included one set of monozygotic twins. Although label-free mass spectrometry-based proteomics is routinely used for the identification and quantification of thousands of proteins in complex samples, the metaproteomic analysis of the gut microbiome presents particular technical challenges. Among them are the extreme complexity and dynamic range of member taxa/species, the need for matched, well-annotated metagenomics databases, and the high inter-protein sequence redundancy/similarity between related members. In this study, a metaproteomic approach was developed for assessment of the biological phenotype and functioning, as a complement to 16S rRNA sequencing analysis to identify constituent taxa. A sample preparation method was developed for recovery and lysis of bacterial cells, followed by trypsin digestion and pre-fractionation using strong cation exchange chromatography. Samples were then subjected to high-performance LC-MS/MS. Data were searched against the Human Microbiome Project database, and a homology-based meta-clustering strategy was used to combine peptides from multiple species into representative proteins. Bacterial taxonomies were also identified, based on species-specific protein sequences, and protein metaclusters were assigned to pathways and functional groups. The results obtained demonstrate the applicability of this approach for performing qualitative comparisons of human fecal microbiome composition, physiology and metabolism, and also provide a more detailed assessment of microbial composition in comparison to 16S rRNA.
Project description:In this study we developed metaproteomics-based methods for quantifying the taxonomic composition of microbiomes (microbial communities). We also compared metaproteomics-based quantification to other quantification methods, namely metagenomics and 16S rRNA gene amplicon sequencing. The metagenomic and 16S rRNA data can be found in the European Nucleotide Archive (Study number: PRJEB19901). For method development and comparison, we analyzed three types of mock communities with all three methods. The communities contain between 28 and 32 species and strains of bacteria, archaea, eukaryotes and bacteriophage. For each community type, four biological replicate communities were generated. All four replicates were analyzed by 16S rRNA sequencing and metaproteomics. Three replicates of each community type were analyzed with metagenomics. The "C" type communities have the same cell/phage particle number for all community members (C1 to C4). The "P" type communities have the same protein content for all community members (P1 to P4). The "U" (UNEVEN) type communities cover a large range of protein amounts and cell numbers (U1 to U4). We also generated proteomic data for four pure cultures to test the specificity of the protein inference method. These data are also included in this submission.
Project description:Purpose: To identify the genetic basis of posterior polymorphous corneal dystrophy 1 (PPCD1). Methods: Next-generation sequencing was performed on DNA samples from 4 affected and 4 unaffected members of a previously reported family with PPCD1 linked to chromosome 20 between D20S182 and D20S195. Custom capture probes were utilized for targeted region capture of the linked interval. Single nucleotide variants (SNVs) and insertions/deletions (indels) were identified using two bioinformatics pipelines and two annotation databases. Candidate variants met the following criteria: quality score ≥20, read depth ≥5X, heterozygous, novel or rare (minor allele frequency (MAF) ≤ 0.05), present in each affected individual and absent in each unaffected individual. Structural variants were detected with two different microarray platforms to identify indels of varying sizes. Results: Sequencing reads aligned to the linked region on chromosome 20, and high coverage was obtained across the sequenced region. The majority of identified variants were detected with both pipelines and annotation databases, although unique variants were identified. Twelve SNVs in 10 genes (2 synonymous variants and 10 noncoding variants) and 9 indels in 7 genes met the filtering criteria and were considered candidate variants for PPCD1. Conclusions: Next-generation sequencing of the PPCD1 interval has identified 17 genes containing novel or rare SNVs and indels that segregate with the affected phenotype in an affected family previously mapped to the PPCD1 locus. We anticipate that screening of these candidate genes in other families previously mapped to the PPCD1 locus will result in the identification of the genetic basis of PPCD1. Four affected and 4 unaffected individuals from a single family were analyzed for copy number variation within the PPCD1 disease locus. Array design and analysis is based on genome build hg19.
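The candidate-variant criteria listed in the PPCD1 description above can be read as a simple filter. The following is a minimal Python sketch of that reading, not the pipelines used in the study: the `Variant` record, its field names, the genotype labels ("het", "ref"), and the sample IDs are all hypothetical, and a missing genotype call is treated here as a failure to confirm presence or absence.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Variant:
    """Hypothetical variant record; field names are illustrative only."""
    gene: str
    quality: float
    depth: int
    maf: Optional[float]       # None means the variant is novel (no known frequency)
    genotypes: Dict[str, str]  # sample ID -> "het", "hom_alt", or "ref"


def is_candidate(v: Variant, affected: List[str], unaffected: List[str],
                 min_quality: float = 20.0, min_depth: int = 5,
                 max_maf: float = 0.05) -> bool:
    """Apply the stated criteria: quality >= 20, depth >= 5X, novel or rare,
    heterozygous in every affected sample, absent in every unaffected sample."""
    if v.quality < min_quality or v.depth < min_depth:
        return False
    if v.maf is not None and v.maf > max_maf:
        return False
    # Present (heterozygous) in each affected individual.
    if any(v.genotypes.get(s) != "het" for s in affected):
        return False
    # Absent in each unaffected individual; a missing call does not count as absent.
    if any(v.genotypes.get(s) != "ref" for s in unaffected):
        return False
    return True


def candidate_genes(variants: List[Variant], affected: List[str],
                    unaffected: List[str]) -> List[str]:
    """Return the sorted set of genes carrying at least one candidate variant."""
    return sorted({v.gene for v in variants if is_candidate(v, affected, unaffected)})
```

With four affected and four unaffected sample IDs, `candidate_genes(variants, affected, unaffected)` would return the genes that survive every filter. The thresholds are exposed as keyword arguments so that the cut-offs can be varied without changing the segregation logic.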
Project description:Understanding biological diversity and distribution patterns at multiple spatial scales is a central issue in ecology. Here, we investigated the biogeographical patterns of functional genes in soil microbes from 24 arctic heath sites using GeoChip-based metagenomics and principal coordinates of neighbour matrices (PCNM)-based analysis. Functional gene richness varied considerably among sites, while the proportions of each major functional gene category were evenly distributed. Functional gene composition varied significantly at most medium and broad spatial scales, and the PCNM analyses indicated that 14-20% of the variation in total and major functional gene categories could be attributed primarily to relatively broad-scale spatial effects that were consistent with broad-scale variation in soil pH and total nitrogen. The combination of variance partitioning and multi-scale analysis indicated that spatial distance effects contributed 12% to variation in functional gene composition, whereas environmental factors contributed only 3%. This relatively strong influence of spatial as compared to environmental variation in determining functional gene distributions contrasts sharply with typical microbial phylotype/species-based biogeographical patterns in the Arctic and elsewhere. Our results suggest that the distributions of soil functional genes cannot be predicted from phylogenetic distributions because spatial factors associated with historical contingencies are relatively important determinants of their biogeography. Overall design: 72 heath tundra surface soil samples across the Canadian, Alaskan and European Arctic, including three vegetation types (Empetrum spp., Cassiope spp., Dryas spp.).
Project description:Understanding microbial community diversity is thought to be crucial for improving process functioning and stabilities of wastewater treatment systems. However, current studies largely focus on taxonomic groups based on 16S rRNA, which are not necessarily linked to functioning, or a few selected functional genes. Here we launched a study to profile the overall functional genes of microbial communities in three full-scale wastewater treatment systems. Triplicate activated sludge samples from each system were analyzed using a high-throughput metagenomics tool named GeoChip 4.2, resulting in the detection of 38,507 to 40,647 functional genes. A high similarity of 75.5% to 79.7% shared genes was noted among the nine samples. Moreover, correlation analyses showed that the abundances of a wide array of functional genes were associated with system performances. For example, the abundances of overall nitrogen cycling genes had a strong correlation to total nitrogen (TN) removal rates (r = 0.7647, P < 0.01). The abundances of overall carbon cycling genes were moderately correlated with COD removal rates (r = 0.6515, P < 0.01). Lastly, we found that influent chemical oxygen demand (COD inf) and total phosphorus concentrations (TP inf), and dissolved oxygen (DO) concentrations were key environmental factors shaping the overall functional genes. Together, the results revealed vast functional gene diversity and some links between the functional gene compositions and microbe-mediated processes. Overall design: Three full-scale wastewater treatment systems located in the same city were investigated. Triplicate samples were collected in each site.
Project description:Understanding and quantifying the effects of environmental factors on the abundance and diversity of microbial communities is a key theme of ecology. For microbial communities, current theory proposes two factors to explain this variation: contemporary environmental heterogeneity and historical events. Here, we report a study to profile soil microbial structure, which infers functional roles of microbial communities, along the latitudinal gradient from north to south in mainland China, aiming to explore potential microbial responses to external conditions, especially global climate change, via a strategy of space-for-time substitution. Using a microarray-based metagenomics tool named GeoChip 5.0, we showed that microbial communities were distinct for most but not all of the sites. Using substantial statistical analyses, we explored the dominant factor influencing the soil microbial communities along the latitudinal gradient. Substantial variations were apparent in nutrient cycling genes, but they were in line with the functional roles of these genes. 300 samples were collected from 30 sites along the latitudinal gradient, with 10 replicates at every site.
Project description:Leaf-to-leaf, systemic immune signaling known as systemic acquired resistance (SAR) is poorly understood in monocotyledonous plants. Here, we characterize systemic immunity in barley (Hordeum vulgare) triggered after primary leaf infection with either Pseudomonas syringae pathovar japonica (Psj) or Xanthomonas translucens pathovar cerealis (Xtc). Both pathogens induced resistance in systemic, uninfected leaves against a subsequent challenge infection with Xtc. In contrast to SAR in Arabidopsis thaliana, systemic immunity in barley was not associated with NONEXPRESSOR OF PATHOGENESIS-RELATED GENES1 or the local or systemic accumulation of salicylic acid (SA). Instead, we documented a moderate local but not systemic induction of abscisic acid (ABA) after infection of leaves with Psj. In contrast to SA or its functional analog benzothiadiazole, local applications of the jasmonic acid methyl ester or ABA triggered systemic immunity to Xtc. RNA-seq analysis of local and systemic transcript accumulation revealed unique gene expression changes in response to both Psj and Xtc and a clear separation of local from systemic responses. The systemic response appeared relatively modest and quantitative RT-PCR associated systemic immunity with the local and systemic induction of two WRKY and two ETHYLENE RESPONSIVE FACTOR-like transcription factors. Systemic immunity against Xtc was further associated with transcriptional changes after a secondary/systemic Xtc challenge infection; these changes were dependent on the primary treatment. Taken together, bacteria-induced systemic immunity in barley may be mediated in part by WRKY and ERF-like transcription factors possibly facilitating transcriptional reprogramming to potentiate immunity.
Project description:The horse, like a majority of animal species, has a limited amount of species-specific expressed sequence data available in public databases. As a result, structural models for a majority of genes defined in the equine genome are predictions based on ab initio sequence analysis or the projection of gene structures from other mammalian species. The current study used Illumina-based sequencing of messenger RNA (RNA-seq) to help refine structural annotation of equine protein-coding genes and for a preliminary assessment of gene expression patterns. Sequencing of mRNA from eight equine tissues generated 293,758,105 35-base sequence tags, equaling 10.28 giga-basepairs of total sequence data. The tag alignments represent approximately 208X coverage of the equine mRNA transcriptome and confirmed transcriptional activity for roughly 90% of the protein-coding gene structures predicted by Ensembl and NCBI. Tag coverage was sufficient to define structural annotation for 11,356 genes, while also identifying an additional 456 transcripts with exon/intron features that are not listed by either Ensembl or NCBI. Genomic locus data and intervals for the protein-coding genes predicted by the Ensembl and NCBI annotation pipelines were combined with 75,116 RNA-seq derived transcriptional units to generate a consensus equine protein-coding gene set of 20,302 defined loci. Gene ontology annotation was used to compare the functional and structural categories of genes expressed in either a tissue-restricted pattern or broadly across all tissue samples. Examination of 8 equine RNA samples representing 6 distinct tissues
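The sequencing totals quoted in the equine RNA-seq description above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the implied transcriptome size is a back-of-envelope figure that assumes every sequenced base contributes to the stated 208X coverage, which the description does not say.

```python
def total_bases(n_tags: int, tag_length: int) -> int:
    """Total sequenced bases for n_tags reads of fixed length tag_length."""
    return n_tags * tag_length


N_TAGS = 293_758_105   # sequence tags reported in the description
TAG_LENGTH = 35        # bases per tag
COVERAGE = 208         # approximate fold coverage reported

bases = total_bases(N_TAGS, TAG_LENGTH)
gbp = bases / 1e9

# Assumption: coverage = total bases / transcriptome size, with all bases counted.
implied_transcriptome_mbp = bases / COVERAGE / 1e6

print(f"{bases:,} bases = {gbp:.2f} Gbp")
print(f"Implied transcriptome size at {COVERAGE}X: {implied_transcriptome_mbp:.1f} Mbp")
```

The product of tag count and tag length reproduces the 10.28 giga-basepair total given in the description. The implied transcriptome size should be read as an upper-level estimate, since tags that fail to align would lower the effective coverage.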