Project description:Integrating two chromatographies in a co-fractionation mass spectrometry approach improved the mapping of protein-metabolite interactions and revealed novel protein-ligand pairings in Escherichia coli.
Project description:Long non-coding RNAs (lncRNAs) are emerging as important regulatory molecules in developmental, physiological, and pathological processes. However, the precise mechanisms and functions of most lncRNAs remain largely unknown. Recent advances in high-throughput sequencing of immunoprecipitated RNAs after cross-linking (CLIP-Seq) provide powerful ways to identify biologically relevant protein-lncRNA interactions. In this study, by analyzing millions of RNA-binding protein (RBP) binding sites from 117 CLIP-Seq datasets generated by 50 independent studies, we identified 22,735 RBP-lncRNA regulatory relationships. We found that a single lncRNA is generally bound and regulated by one or more RBPs, the combination of which may coordinately regulate gene expression. We also revealed the expression correlation of these interaction networks by mining expression profiles of over 6000 normal and tumor samples from 14 cancer types. Our combined analysis of CLIP-Seq data and genome-wide association study data discovered hundreds of disease-related single nucleotide polymorphisms residing in the RBP binding sites of lncRNAs. Finally, we developed interactive web implementations to provide visualization, analysis, and downloading of the aforementioned large-scale datasets. Our study represents an important step in the identification and analysis of RBP-lncRNA interactions and shows that these interactions may play crucial roles in cancer and genetic diseases.
Project description:This is a collection of diverse publicly available macrophage datasets. The datasets come from three different platforms and were all normalized together using modified CDF files.
Project description:This project stores the MaxQuant 2.0.3.0 search results for 162 HCC cell line raw files and 48 raw files used for spectral library generation, as well as the DIA-NN version 1.8 and Spectronaut version 15.2 search results for HepG2, HCCLM3 and TGFB1-stimulated HCCLM3.
Project description:Estrogen receptor (ER) dimerization is prerequisite for its activation of target gene transcription. Because the two forms of ER, ERalpha and ERbeta, exhibit opposing functions in cell proliferation, the ability of ligands to induce ERalpha/beta heterodimers vs. their respective homodimers is expected to have profound impacts on transcriptional outcomes and cellular growth. However, there is a lack of direct methods to monitor the formation of ERalpha/beta heterodimers in vivo and to distinguish the ability of estrogenic ligands to promote ER homo- vs. heterodimerization. Here, we describe bioluminescence resonance energy transfer (BRET) assays for monitoring the formation of ERalpha/beta heterodimers and their respective homodimers in live cells. We demonstrate that although both partners contribute to heterodimerization, ligand-bound ERalpha plays a dominant role. Furthermore, a bioactive component was found to induce ERbeta/beta homodimers, and ERalpha/beta heterodimers but had minimal activity on ERalpha/alpha homodimers, posing a model that compounds promoting ERalpha/beta heterodimer formation might have therapeutic value. Thus, ER homodimer and heterodimer BRET assays are applicable to drug screening for dimer-selective selective ER modulators. Furthermore, this strategy can be used to study other nuclear receptor dimers.
Project description:Total RNA was extracted from tissues (liver, spleen, gill) of V. anguillarum susceptible and resistant rainbow trout at different time points using the GenElute™ mammalian RNA kit (RTN350, Sigma-Aldrich, Denmark). After measuring the quantity (NanoDrop 2000 spectrophotometer, Saveen & Werner, Denmark) and quality (gel electrophoresis) of the RNA, cDNA was synthesised in a T100 thermocycler (Biorad, Denmark) using Oligo d(T)16 primer and TaqMan® Reverse Transcription Reagents (cat.no. N8080234, Thermo Fisher Scientific, Denmark). Primers and probes for a total of 28 genes, including three housekeeping genes, were synthesized at TAG Copenhagen AS, Denmark. qPCR reactions were run with Brilliant III Ultra-Fast QPCR Master Mix (600881, AH Diagnostics AS, Denmark) for all samples. The fold changes were analysed by the simplified 2^-ΔΔCq method. Fingerlings of rainbow trout (mean body weight of 12 g) were exposed (2 h bathing, 18°C) to the pathogen V. anguillarum serotype O1 in a solution of 1.5 × 10^7 cfu/ml and observed for 14 d. Disease signs appeared three days post exposure (dpe), whereafter morbidity progressed exponentially until 6 dpe, reaching a total morbidity/mortality of 55% within 11 days. We sampled fish for immune gene expression analysis when they first showed clinical signs, fish without clinical signs at the same time point, and finally fish surviving the exposure to the pathogen. The different immune gene expression profiles in the different groups were addressed when discussing possible resistance mechanisms in rainbow trout.
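As a rough illustration of the simplified 2^-ΔΔCq calculation named in the description above, the following Python sketch computes a fold change from four Cq values. It assumes a single reference gene and roughly 100% amplification efficiency; the function name and the Cq values are hypothetical and are not taken from the study, which used three housekeeping genes.

```python
def fold_change(cq_target_exposed, cq_ref_exposed, cq_target_control, cq_ref_control):
    """Relative expression by the simplified 2^-ΔΔCq method.

    Assumption: one reference (housekeeping) gene and ~100% primer efficiency.
    """
    # ΔCq: target gene normalised to the reference gene within each group
    delta_exposed = cq_target_exposed - cq_ref_exposed
    delta_control = cq_target_control - cq_ref_control
    # ΔΔCq: exposed group relative to the control group
    delta_delta = delta_exposed - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Cq values, for illustration only
print(fold_change(24.0, 18.0, 26.0, 18.0))
```

A fold change above 1 indicates higher expression of the target gene in the exposed group than in the control group, and a value below 1 indicates lower expression.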
Project description:Background: With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, this is not the case with O2PLS. Results: We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard and a faster alternative cross-validation method are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data. Conclusions: We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS").
Project description:Gene-regulatory enhancers have been identified using various approaches, including evolutionary conservation, regulatory protein binding, chromatin modifications, and DNA sequence motifs. To integrate these different approaches, we developed EnhancerFinder, a two-step method for distinguishing developmental enhancers from the genomic background and then predicting their tissue specificity. EnhancerFinder uses a multiple kernel learning approach to integrate DNA sequence motifs, evolutionary patterns, and diverse functional genomics datasets from a variety of cell types. In contrast with prediction approaches that define enhancers based on histone marks or p300 sites from a single cell line, we trained EnhancerFinder on hundreds of experimentally verified human developmental enhancers from the VISTA Enhancer Browser. We comprehensively evaluated EnhancerFinder using cross validation and found that our integrative method improves the identification of enhancers over approaches that consider a single type of data, such as sequence motifs, evolutionary conservation, or the binding of enhancer-associated proteins. We find that VISTA enhancers active in embryonic heart are easier to identify than enhancers active in several other embryonic tissues, likely due to their uniquely high GC content. We applied EnhancerFinder to the entire human genome and predicted 84,301 developmental enhancers and their tissue specificity. These predictions provide specific functional annotations for large amounts of human non-coding DNA, and are significantly enriched near genes with annotated roles in their predicted tissues and lead SNPs from genome-wide association studies. We demonstrate the utility of EnhancerFinder predictions through in vivo validation of novel embryonic gene regulatory enhancers from three developmental transcription factor loci. 
Our genome-wide developmental enhancer predictions are freely available as a UCSC Genome Browser track, which we hope will enable researchers to further investigate questions in developmental biology.
Project description:Predicting protein-ligand interactions using artificial intelligence (AI) models has attracted great interest in recent years. However, data-driven AI models unequivocally suffer from a lack of sufficiently large and unbiased datasets. Here, we systematically investigated the data biases in the PDBbind and DUD-E datasets. We examined the performance of an atomic convolutional neural network (ACNN) on the PDBbind core set and achieved a Pearson R² of 0.73 between experimental and predicted binding affinities. Strikingly, the ACNN models did not require learning the essential protein-ligand interactions in complex structures and achieved similar performance even on datasets containing only ligand structures or only protein structures, while data splitting based on similarity clustering (protein sequence or ligand scaffold) significantly reduced the model performance. We also identified property and topology biases in the DUD-E dataset, which led to artificially increased enrichment performance in virtual screening. The property bias in DUD-E was reduced by enforcing more stringent ligand property matching rules, while the topology bias still exists due to the use of molecular fingerprint similarity as a decoy selection criterion. Therefore, we believe that sufficiently large and unbiased datasets are desirable for training robust AI models to accurately predict protein-ligand interactions.
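For readers unfamiliar with the metric quoted above, a minimal Python sketch of the squared Pearson correlation between experimental and predicted binding affinities follows. The function name and the example values are hypothetical; this only illustrates the statistic and is not the authors' evaluation code.

```python
def pearson_r2(experimental, predicted):
    """Squared Pearson correlation coefficient between two equal-length sequences."""
    n = len(experimental)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(experimental) / n
    mean_y = sum(predicted) / n
    # Sums of centred cross-products and squares (the 1/n factors cancel in the ratio)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(experimental, predicted))
    var_x = sum((x - mean_x) ** 2 for x in experimental)
    var_y = sum((y - mean_y) ** 2 for y in predicted)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (cov * cov) / (var_x * var_y)


# Hypothetical affinity values, for illustration only
print(pearson_r2([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]))
```

Because the value is a squared correlation, it does not by itself show whether a model has learned protein-ligand interactions, which is why the description also compares ligand-only, protein-only, and similarity-clustered splits.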