Project description:Background: Alpha1,3-galactosyltransferase (alpha1,3GT) is an enzyme that produces carbohydrate chains termed alphaGal epitopes found in most mammals, although some species of higher primates, including humans, are notable exceptions. The evolutionary origin of the lost alpha1,3GT enzyme activity is not yet known, although it has been suggested that the promoter activity of this gene in the ancestors of higher primates was inactivated. Methods: We used 5'- or 3'-RACE, GenomeWalking, reverse transcriptase polymerase chain reaction (RT-PCR) and dual Luciferase reporter assay for identification of the full-length cDNA, which includes the transcription initiation site and the promoter region of the porcine alpha1,3GT gene. Results: The region around exon 1 is guanine and cytosine (GC)-rich (about 70%), comprising a CpG island spanning more than 1.5 kbp. The 5'-flanking region of exon 1 contains multiple transcription factor consensus motifs, including GC-box, SP1, AP2, and GATA-box sites, in the absence of TATA or CAAT-box sequences. The entire gene consists of three 5' noncoding and six coding region exons spanning more than 52 kbp. Detailed analysis of alpha1,3GT transcripts revealed two major alternative splicing patterns in the 5'-untranslated region (5'-UTR) and evidence for minor splicing activity that occurs in a tissue-specific manner. Interspecies comparison of 5'-UTR shows minimal homology between porcine and murine sequences except for exon 2, which suggests that the regulatory regions differ among species. Conclusions: These observations have important implications for experiments involving genetic manipulation of the alpha1,3GT gene in transgenic animals in terms of promoter utilization, and particularly in genetically engineering cells for the animal cloning technology by nuclear transfer.
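The GC content and CpG island reported for the region around exon 1 can be illustrated with a minimal sketch. It computes the GC fraction and the observed/expected CpG ratio of a sequence, the two quantities commonly used to call CpG islands. The function names and the toy sequences are hypothetical and are not taken from the study, which does not state how its figures were computed.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the true dinucleotide count.
    cpg_count = seq.count("CG")
    return cpg_count * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    toy = "CGCGGGCCATCGCGTACGCG"  # hypothetical toy sequence, not real porcine DNA
    print(f"GC fraction: {gc_fraction(toy):.2f}")
    print(f"CpG obs/exp: {cpg_obs_exp(toy):.2f}")
```

A frequently cited rule of thumb (Gardiner-Garden and Frommer) treats a window of at least 200 bp with GC content above 50% and an observed/expected ratio above 0.6 as a CpG island. Whether the authors applied these thresholds is an assumption.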
Project description:Changes in chromatin structure, especially in histone modifications (HMs), linked with chromatin accessibility for transcription machinery, are considered to play significant roles in transcriptional regulation. Alveolar macrophages (AM) are important immune cells for protection against pulmonary pathogens, and must readily respond to bacteria and viruses that enter the airways. Mechanism(s) controlling AM innate response to different pathogen-associated molecular patterns (PAMPs) are not well defined in pigs. By combining RNA sequencing (RNA-seq) with chromatin immunoprecipitation and sequencing (ChIP-seq) for four histone marks (H3K4me3, H3K4me1, H3K27ac and H3K27me3), we established a chromatin state map for AM stimulated with two different PAMPs, lipopolysaccharide (LPS) and Poly(I:C), and investigated the potential effect of identified histone modifications on transcription factor binding motif (TFBM) prediction and RNA abundance changes in these AM. The integrative analysis suggests that the differential gene expression between non-stimulated and stimulated AM is significantly associated with changes in the H3K27ac level at active regulatory regions. Although global changes in chromatin states were minor after stimulation, we detected chromatin state changes for differentially expressed genes involved in the TLR4, TLR3 and RIG-I signaling pathways. We found that regions marked by H3K27ac genome-wide were enriched for TFBMs of TF that are involved in the inflammatory response. We further documented that TF whose expression was induced by these stimuli had TFBMs enriched within H3K27ac-marked regions whose chromatin state changed by these same stimuli. Given that the dramatic transcriptomic changes and minor chromatin state changes occurred in response to both stimuli, we conclude that regulatory elements (i.e. active promoters) that contain transcription factor binding motifs were already active/poised in AM for immediate inflammatory response to PAMPs. In summary, our data provides the first chromatin state map of porcine AM in response to bacterial and viral PAMPs, contributing to the Functional Annotation of Animal Genomes (FAANG) project, and demonstrates the role of HMs, especially H3K27ac, in regulating transcription in AM in response to LPS and Poly(I:C).
Project description:Gene regulation is a ubiquitous mechanism by which organisms respond to their environment. While organisms are often found to be adapted to the environments they experience, the role of gene regulation in environmental adaptation is not often known. In this study, we examine divergence in cis-regulatory effects between two Saccharomyces species, S. cerevisiae and S. uvarum, that have substantially diverged in their thermal growth profile. We measured allele specific expression (ASE) in the species' hybrid at three temperatures, the highest of which is lethal to S. uvarum but not the hybrid or S. cerevisiae. We find that S. uvarum alleles can be expressed at the same level as S. cerevisiae alleles at high temperature and most cis-acting differences in gene expression are not dependent on temperature. While a small set of 136 genes show temperature-dependent ASE, we find no indication that signatures of directional cis-regulatory evolution are associated with temperature. Within promoter regions we find binding sites enriched upstream of temperature responsive genes, but only weak correlations between binding site and expression divergence. Our results indicate that temperature divergence between S. cerevisiae and S. uvarum has not caused widespread divergence in cis-regulatory activity, but point to a small subset of genes where the species' alleles show differences in magnitude or opposite responses to temperature. The difficulty of explaining divergence in cis-regulatory sequences with models of transcription factor binding sites and nucleosome positioning highlights the importance of identifying mutations that underlie cis-regulatory divergence between species.
Project description:Conserved segments in DNA or protein sequences are strong candidates for functional elements and thus appropriate methods for computing them need to be developed and compared. We describe five methods and computer programs for finding highly conserved blocks within previously computed multiple alignments, primarily for DNA sequences. Two of the methods are already in common use; these are based on good column agreement and high information content. Three additional methods find blocks with minimal evolutionary change, blocks that differ in at most k positions per row from a known center sequence and blocks that differ in at most k positions per row from a center sequence that is unknown a priori. The center sequence in the latter two methods is a way to model potential binding sites for known or unknown proteins in DNA sequences. The efficacy of each method was evaluated by analysis of three extensively analyzed regulatory regions in mammalian beta-globin gene clusters and the control region of bacterial arabinose operons. Although all five methods have quite different theoretical underpinnings, they produce rather similar results on these data sets when their parameters are adjusted to best approximate the experimental data. The optimal parameters for the method based on information content varied little for different regulatory regions of the beta-globin gene cluster and hence may be extrapolated to many other regulatory regions. The programs based on maximum allowed mismatches per row have simple parameters whose values can be chosen a priori and thus they may be more useful than the other methods when calibration against known functional sites is not available.
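Two of the ideas in the description above can be sketched in a few lines: scoring an alignment column by its information content, and finding blocks in which every row differs from a known center sequence in at most k positions. This is an illustrative simplification, not the authors' programs. The function names, the uniform background over A, C, G and T, and the treatment of gap characters (ignored when scoring a column, counted as mismatches against the center) are all assumptions.

```python
import math
from collections import Counter


def column_information(column, alphabet="ACGT"):
    """Information content in bits: log2(alphabet size) minus Shannon entropy.

    Assumes a uniform background; characters outside the alphabet
    (such as gaps) are ignored.
    """
    counts = Counter(ch for ch in column if ch in alphabet)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(len(alphabet)) - entropy


def blocks_near_center(rows, center, k):
    """Start columns of blocks where each row has at most k mismatches to center.

    rows are equal-length aligned sequences; a gap counts as a mismatch.
    """
    width = len(center)
    hits = []
    for start in range(len(rows[0]) - width + 1):
        if all(
            sum(a != b for a, b in zip(row[start:start + width], center)) <= k
            for row in rows
        ):
            hits.append(start)
    return hits


if __name__ == "__main__":
    alignment = ["ACGTAC", "ACGAAC", "TCGTAC"]  # hypothetical toy alignment
    for i in range(len(alignment[0])):
        column = "".join(row[i] for row in alignment)
        print(i, column, round(column_information(column), 3))
    print(blocks_near_center(alignment, "CGT", 1))
```

As the description notes, the mismatch-based search needs only the center sequence and k, both of which can be chosen in advance, whereas a threshold on information content has to be calibrated against known functional sites.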
Project description:Small mammal dispersal is strongly affected by geographical barriers. However, commensal small mammals may be passively transported over large distances and strong barriers by humans (often with agricultural products). This pattern should be especially apparent in topographically complex landscapes, such as mountain ranges, where valleys and/or peaks can limit dispersal of less vagile species. We predict that commensal species would have lower genetic differentiation and higher migration rates than related non-commensals in such landscapes. We contrasted population genetic differentiation in two sympatric Rattus species (R. satarae and R. rattus) in the Western Ghats mountains in southern India. We sampled rats from villages and adjacent forests in seven locations (20-640 km apart). Capture-based statistics confirmed that R. rattus is abundant in human settlements in this region, whereas R. satarae is non-commensal and found mostly in forests. Population structure analyses using ~970-bp mitochondrial control region and 17 microsatellite loci revealed higher differentiation for the non-commensal species (R. satarae F-statistics=0.420, 0.065, R. rattus F-statistics=0.195, 0.034; mitochondrial DNA, microsatellites, respectively). Genetic clustering analyses confirm that clusters in R. satarae are more distinct and less admixed than those in R. rattus. R. satarae shows higher slope for isolation-by-distance compared with R. rattus. Although mode of migration estimates do not strongly suggest higher rates in R. rattus than in R. satarae, they indicate that migration over long distances could still be higher in R. rattus. We suggest that association with humans could drive the observed pattern of differentiation in the commensal R. rattus, consequently impacting not only their dispersal abilities, but also their evolutionary trajectories.
Project description:Background: Increasing evidence shows that whole genomes of eukaryotes are almost entirely transcribed into both protein coding genes and an enormous number of non-protein-coding RNAs (ncRNAs). Therefore, revealing the underlying regulatory mechanisms of transcripts becomes imperative. However, for a complete understanding of transcriptional regulatory mechanisms, we need to identify the regions in which they are found. We will call these transcriptional regulation regions, or TRRs, which can be considered functional regions containing a cluster of regulatory elements that cooperatively recruit transcriptional factors for binding and then regulating the expression of transcripts. Results: We constructed a hierarchical stochastic language (HSL) model for the identification of core TRRs in yeast based on regulatory cooperation among TRR elements. The HSL model trained based on yeast achieved comparable accuracy in predicting TRRs in other species, e.g., fruit fly, human, and rice, thus demonstrating the conservation of TRRs across species. The HSL model was also used to identify the TRRs of genes, such as p53 or OsALYL1, as well as microRNAs. In addition, the ENCODE regions were examined by HSL, and TRRs were found to pervasively locate in the genomes. Conclusion: Our findings indicate that 1) the HSL model can be used to accurately predict core TRRs of transcripts across species and 2) identified core TRRs by HSL are proper candidates for the further scrutiny of specific regulatory elements and mechanisms. Meanwhile, the regulatory activity taking place in the abundant numbers of ncRNAs might account for the ubiquitous presence of TRRs across the genome. In addition, we also found that the TRRs of protein coding genes and ncRNAs are similar in structure, with the latter being more conserved than the former.
Project description:Porcine cells express endogenous retroviruses, some of which are infectious for human cells. To better understand the replication of these porcine endogenous retroviruses (PERVs) in cells of different types and animal species, we have performed studies of the long terminal repeat (LTR) region of known gammaretroviral isolates of PERV. Nucleotide sequence determination of the LTRs of PERV-NIH, PERV-C, PERV-A, and PERV-B revealed that the PERV-A and PERV-B LTRs are identical, whereas the PERV-NIH and PERV-C LTRs have significant sequence differences in the U3 region between each other and with the LTRs of PERV-A and PERV-B. Sequence analysis revealed a similar organization of basal promoter elements compared with other gammaretroviruses, including the presence of enhancer-like repeat elements. The sequences of the PERV-NIH and PERV-C repeat element are similar to that of the PERV-A and PERV-B element with some differences in the organization of these repeats. The sequence of the PERV enhancer-like repeat elements differs significantly from those of other known gammaretroviral enhancers. The transcriptional activities of the PERV-A, PERV-B, and PERV-C LTRs relative to each other were similar in different cell types of different animal species as determined by transient expression assays. On the other hand, the PERV-NIH LTR was considerably weaker in these cell types. The transcriptional activity of all PERV LTRs was considerably lower in porcine ST-IOWA cells than in cell lines from other species. Deletion mutant analysis of the LTR of a PERV-NIH isolate identified regions that transactivate or repress transcription depending on the cell type.
Project description:Background: As a result of high-throughput genotyping methods, millions of human genetic variants have been reported in recent years. To efficiently identify those with significant biological functions, a practical strategy is to concentrate on variants located in important sequence regions such as gene regulatory regions. Results: Analysis of the most common type of variant, single nucleotide polymorphisms (SNPs), shows that in gene promoter regions more SNPs occur in close proximity to transcriptional start sites than in regions further upstream, and a disproportionate number of those SNPs represent nucleotide transversions. Additionally, the number of SNPs found in the predicted transcription factor binding sites is higher than in non-binding site sequences. Conclusion: Current information about transcription factor binding site sequence patterns may not be exhaustive, and SNPs may be actively involved in influencing gene expression by affecting the transcription factor binding sites.
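The distinction between transitions and transversions that the results above rely on can be made concrete with a short sketch. A transition swaps a purine for a purine (A and G) or a pyrimidine for a pyrimidine (C and T); any other single-base change is a transversion. The function name and the example allele pairs are hypothetical and do not come from the study's data.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_snp(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    # Both alleles in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    # Hypothetical allele pairs for illustration only.
    for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
        print(ref, alt, classify_snp(ref, alt))
```

Each base has one possible transition and two possible transversions, so judging whether transversions are over-represented requires comparison with an appropriate baseline rather than with equal counts.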
Project description:The comparison of gene regulatory networks between diseased versus healthy individuals or between two different treatments is an important scientific problem. Here, we propose sc-compReg as a method for the comparative analysis of gene expression regulatory networks between two conditions using single cell gene expression (scRNA-seq) and single cell chromatin accessibility data (scATAC-seq). Our software, sc-compReg, can be used as a stand-alone package that provides joint clustering and embedding of the cells from both scRNA-seq and scATAC-seq, and the construction of differential regulatory networks across two conditions. We apply the method to compare the gene regulatory networks of an individual with chronic lymphocytic leukemia (CLL) versus a healthy control. The analysis reveals a tumor-specific B cell subpopulation in the CLL patient and identifies TOX2 as a potential regulator of this subpopulation.
Project description:Dishevelled (DVL) critically regulates Wnt signaling and contributes to a wide spectrum of diseases and is important in normal and pathophysiological settings. However, how it mediates diverse cellular functions remains poorly understood. Recent discoveries have revealed that constitutive Wnt pathway activation contributes to breast cancer malignancy, but the mechanisms by which this occurs are unknown and very few studies have examined the nuclear role of DVL. Here, we have performed DVL3 ChIP-seq analyses and identify novel target genes bound by DVL3. We show that DVL3 depletion alters KMT2D binding to novel targets and changes their epigenetic marks and mRNA levels. We further demonstrate that DVL3 inhibition leads to decreased tumor growth in two different breast cancer models in vivo. Our data uncover new DVL3 functions through its regulation of multiple genes involved in developmental biology, antigen presentation, metabolism, chromatin remodeling, and tumorigenesis. Overall, our study provides unique insight into the function of nuclear DVL, which helps to define its role in mediating aberrant Wnt signaling.