Project description:Short interspersed nuclear elements (SINEs) are a widespread type of small transposable element (TE). With increasing evidence for their impact on gene function and genome evolution in plants, accurate genome-scale SINE annotation becomes a fundamental step for studying the regulatory roles of SINEs and their relationship with other components in the genomes. Despite the overall promising progress made in TE annotation, SINE annotation remains a major challenge. Unlike some other TEs, SINEs are short and heterogeneous, and they usually lack well-conserved sequence or structural features. Thus, current SINE annotation tools have either low sensitivity or high false discovery rates. Given the demand and challenges, we aimed to provide a more accurate and efficient SINE annotation tool for plant genomes. The pipeline starts with maximizing the pool of SINE candidates via profile hidden Markov model-based homology search and de novo SINE search using structural features. Then, it excludes the false positives by integrating all known features of SINEs and the features of other types of TEs that can often be misannotated as SINEs. As a result, the pipeline substantially improves the tradeoff between sensitivity and accuracy, with both values close to or over 90%. We tested our tool in Arabidopsis thaliana and rice (Oryza sativa), and the results show that our tool competes favorably against existing SINE annotation tools. The simplicity and effectiveness of this tool would potentially be useful for generating more accurate SINE annotations for other plant species. The pipeline is freely available at https://github.com/yangli557/AnnoSINE.
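The sensitivity and false-discovery trade-off described in the abstract above can be made concrete with a short sketch. It assumes that "accuracy" there means precision (one minus the false discovery rate), which the abstract does not state. The function names and the counts are hypothetical illustrations, not AnnoSINE code or results.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real SINEs that the annotation recovers."""
    return true_pos / (true_pos + false_neg)


def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of annotated SINEs that are real (1 - false discovery rate)."""
    return true_pos / (true_pos + false_pos)


def false_discovery_rate(true_pos: int, false_pos: int) -> float:
    """Fraction of annotated SINEs that are false positives."""
    return 1.0 - precision(true_pos, false_pos)


# Hypothetical counts, chosen only to illustrate the arithmetic.
tp, fp, fn = 90, 10, 10
print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"precision   = {precision(tp, fp):.2f}")
print(f"FDR         = {false_discovery_rate(tp, fp):.2f}")
```

Enlarging the candidate pool tends to raise sensitivity at the cost of precision, while the filtering step aims to restore precision without discarding true SINEs.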
Project description:BACKGROUND:Due to the current poaching crisis in Africa, increasing numbers of white rhinoceroses (Ceratotherium simum) require opioid immobilisation for medical interventions or management procedures. Alarmingly, the results of both blood gas analysis and pulse oximetry regularly indicate severe hypoxaemia. Yet, the recovery of the animals is uneventful. Thus, neither technique seems to represent the real oxygenation level. We hypothesized that unusual haemoglobin characteristics of this species interfere with the techniques, which were developed and calibrated for use in human patients. METHODS:Haemoglobin was isolated from blood samples of four adult white rhinoceroses. Oxygen dissociation curves at pH 7.2 and 7.4 (37°C) were determined based on the absorbance change of haemoglobin in the Soret region (around 420 nm). Absorbance spectra of oxy- and deoxyhaemoglobin extending into the infrared region were measured. RESULTS:Oxygen dissociation curves of rhinoceros haemoglobin showed the typical high oxygen affinity (p50 of 2.75 ± 0.07 and 2.00 ± 0.04 kPa for pH 7.2 and 7.4, respectively) under near-physiological conditions with respect to pH, temperature and DPG. The infrared absorbance spectra of oxy- and deoxyhaemoglobin showed only marginal deviations from standard human spectra, possibly due to the presence of a few percent of methaemoglobin in vitro. CONCLUSIONS:Our data enable the development of a rhinoceros-specific blood gas analysis algorithm, which allows for species-specific calculation of SaO2 levels in anaesthetized animals. The inconspicuous absorbance spectra do not contribute to the systematic underestimation of SpO2 by pulse oximetry.
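To illustrate how the reported p50 values could feed a species-specific saturation estimate, here is a minimal sketch based on the Hill equation, S = pO2^n / (p50^n + pO2^n). The abstract does not say which model the authors use, and the Hill coefficient below is an assumed, roughly human-like value, not a measured rhinoceros parameter. The function name and the example pO2 are hypothetical.

```python
def hill_saturation(po2_kpa: float, p50_kpa: float, hill_n: float = 2.7) -> float:
    """Fractional haemoglobin oxygen saturation from the Hill equation.

    hill_n is an ASSUMED cooperativity coefficient (roughly human-like);
    it is not taken from the abstract.
    """
    if po2_kpa < 0 or p50_kpa <= 0:
        raise ValueError("pO2 must be >= 0 and p50 must be > 0")
    ratio = (po2_kpa / p50_kpa) ** hill_n
    return ratio / (1.0 + ratio)


# p50 values (kPa) reported in the abstract for pH 7.2 and 7.4 at 37 °C.
P50_RHINO_KPA = {7.2: 2.75, 7.4: 2.00}

# Hypothetical arterial pO2 chosen only to illustrate the calculation.
example_po2_kpa = 6.0
for ph, p50 in P50_RHINO_KPA.items():
    sat = hill_saturation(example_po2_kpa, p50)
    print(f"pH {ph}: estimated saturation at {example_po2_kpa} kPa = {sat:.1%}")
```

Because a lower p50 shifts the curve to the left, the same pO2 gives a higher estimated saturation at pH 7.4 than at pH 7.2. This fits the abstract's point that human-calibrated algorithms would underestimate oxygenation in this species.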
Project description:A 65-bp "core" sequence is dispersed in hundreds of thousands of copies in the human genome. This sequence was found to constitute the central segment of a group of short interspersed elements (SINEs), referred to as mammalian-wide interspersed repeats, that proliferated before the radiation of placental mammals. Here, we propose that the core identifies an ancient tRNA-like SINE element, which survived in different lineages such as mammals, reptiles, birds, and fish, as well as mollusks, presumably for >550 million years. This element gave rise to a number of sequence families (CORE-SINEs), including mammalian-wide interspersed repeats, whose distinct 3' ends are shared with different families of long interspersed elements (LINEs). The evolutionary success of the generic CORE-SINE element can be related to the recruitment of the internal promoter from highly transcribed host RNA as well as to its capacity to adapt to changing retropositional opportunities by sequence exchange with actively amplifying LINEs. It reinforces the notion that the very existence of SINEs depends on the cohabitation with both LINEs and the host genome.
Project description:SINEs are retrotransposons that have enjoyed remarkable reproductive success during the course of mammalian evolution, and have played a major role in shaping mammalian genomes. Previously, an analysis of survey-sequence data from an individual dog (a poodle) indicated that canine genomes harbor a high frequency of alleles that differ only by the absence or presence of a SINEC_Cf repeat. Comparison of these survey-sequence data with a draft genome sequence of a distinct dog (a boxer) has confirmed this prediction, and revealed the chromosomal coordinates for >10,000 loci that are bimorphic for SINEC_Cf insertions. Analysis of SINE insertion sites from the genomes of nine additional dogs indicates that 3%-5% are absent from either the poodle or boxer genome sequences--suggesting that an additional 10,000 bimorphic loci could be readily identified in the general dog population. We describe a methodology that can be used to identify these loci, and could be adapted to exploit these bimorphic loci for genotyping purposes. Approximately half of all annotated canine genes contain SINEC_Cf repeats, and these elements are occasionally transcribed. When transcribed in the antisense orientation, they provide splice acceptor sites that can result in incorporation of novel exons. The high frequency of bimorphic SINE insertions in the dog population is predicted to provide numerous examples of allele-specific transcription patterns that will be valuable for the study of differential gene expression among multiple dog breeds.
Project description:DANA is the first SINE isolated from zebrafish (Danio rerio) exhibiting all the hallmarks of these tRNA-derived elements. DANA is unique in its clearly defined substructure of distinct cassettes. In contrast to generic SINE elements, DANA appears to have been assembled by insertions of short sequences into a progenitor, tRNA-derived element. Once associated with each other, these subunits were amplified as a new transposable element with such remarkable success that DANA-related sequences comprise approximately 10% of the modern zebrafish genome. At least some of the sequences comprised by the full-length element were capable of movement, forming a new group of mobile, composite transposons, one of which caused an insertional mutation in the zebrafish no tail gene. Being present only in the genus Danio, and estimated to be as old as the genus itself, DANA may have played a role in Danio speciation by massive amplification and genome-wide dispersion. There are extensive DNA polymorphisms between zebrafish populations and strains detected by PCR amplification using primers specific to DANA, suggesting that the DANA element will be useful as a molecular tool for genetic and phylogenetic analyses.
Project description:The genomes of chum salmon and pink salmon contain a family of short interspersed repetitive elements (SINEs), designated the salmon SmaI family. It is restricted to these two species, a distribution that suggests that this SINE family might have been generated in their common ancestor. When insertions of the SmaI SINEs at 10 orthologous loci of these species were analyzed, however, it was found that there were no shared insertion sites between chum and pink salmon. Furthermore, at six loci where SmaI SINEs have been species-specifically inserted in chum salmon, insertions of SINEs were polymorphic among populations of chum salmon. By contrast, at four loci where SmaI SINEs had been species-specifically inserted in pink salmon, the SINEs were fixed among all populations of pink salmon. The interspecific and intraspecific variation of the SmaI SINEs cannot be explained by the assumption that the SmaI family was amplified in a common ancestor of these two species. To interpret these observations, we propose several possible models, including introgression and the horizontal transfer of SINEs from pink salmon to chum salmon during evolution.
Project description:Instances of highly conserved plant short interspersed nuclear element (SINE) families and their enrichment near genes have been well documented, but little is known about the general patterns of such conservation and enrichment or the underlying mechanisms. Here, we perform a comprehensive investigation of the structure, distribution, and evolution of SINEs in the grass family by analyzing 14 grass and 5 other flowering plant genomes using comparative genomics methods. We identify 61 SINE families composed of 29,572 copies, of which 46 families are described for the first time. We find that, compared with other grass TEs, grass SINEs show a much higher level of conservation in terms of genomic retention: the origin of at least 26% of families can be traced to early grass diversification, and these families are among the most abundant SINE families in 86% of species. We find that these families show a much higher level of enrichment near protein-coding genes than families of relatively recent origin (51% vs. 28%), and that 40% of all grass SINEs are near genes, a higher percentage than for other types of grass TEs. The pattern of enrichment suggests that differential removal of SINE copies in gene-poor regions plays an important role in shaping the genomic distribution of these elements. We also identify a sequence motif located at the 3' end of SINEs that is shared by 17 families. In short, this study provides insights into the structure and evolution of SINEs in the grass family.
Project description:In recent decades, experimental data have accumulated indicating that short interspersed nuclear elements (SINEs) can play a significant functional role in the regulation of gene expression in the host genome. In addition, molecular markers based on SINE insertion polymorphisms have been developed and are widely used for genetic differentiation of populations of eukaryotic organisms. Using routine bioinformatics analysis and publicly available genomic DNA and small RNA-seq data, we described, for the first time, nine SINEs in the genome of the German cockroach, Blattella germanica. All described SINEs have tRNA promoters, and their transcription begins 11 bp upstream of the "A" box of these promoters. The copy number of the described SINEs in the B. germanica genome ranges from several copies to more than a thousand, depending on the SINE. Some of the described SINEs and their degenerate copies can be localized both in the introns of genes and in loci known as piRNA clusters. piRNAs originating from piRNA clusters are shown to map to seven of the nine types of SINEs described, including copies of SINEs localized in gene introns. We speculate that SINEs localized in the introns of certain genes may regulate the level of expression of these genes by a PIWI-related molecular mechanism.
Project description:To test whether regions undergoing genomic imprinting have unique genomic characteristics, imprinted and nonimprinted human loci were compared for nucleotide and retroelement composition. Maternally and paternally expressed subgroups of imprinted genes were found to differ in terms of guanine and cytosine, CpG, and retroelement content, indicating a segregation into distinct genomic compartments. Imprinted regions have been normally permissive to L1 long interspersed transposable element retroposition during mammalian evolution but universally and significantly lack short interspersed transposable elements (SINEs). The primate-specific Alu SINEs, as well as the more ancient mammalian-wide interspersed repeat SINEs, are found at significantly low densities in imprinted regions. The latter paleogenomic signature indicates that the sequence characteristics of currently imprinted regions existed before the mammalian radiation. Transitions from imprinted to nonimprinted genomic regions in cis are characterized by a sharp inflection in SINE content, demonstrating that this genomic characteristic can help predict the presence and extent of regions undergoing imprinting. During primate evolution, SINE accumulation in imprinted regions occurred at a decreased rate compared with control loci. The constraint on SINE accumulation in imprinted regions may be mediated by an active selection process. This selection could be because of SINEs attracting and spreading methylation, as has been found at other loci. Methylation-induced silencing could lead to deleterious consequences at imprinted loci, where inactivation of one allele is already established, and expression is often essential for embryonic growth and survival.
Project description:Short interspersed repetitive elements (SINEs) are a type of retroposon, being members of a class of informational molecules that are amplified via cDNA intermediates and flow back into the host genome. In contrast to retroviruses and retrotransposons, SINEs do not encode the enzymes required for their amplification, such as reverse transcriptases, so they are presumed to borrow these enzymes from other sources. In the present study, we isolated a family of long interspersed repetitive elements (LINEs) from the turtle genome. The sequence of this family was found to be very similar to those of the avian CR1 family. To our surprise, the sequence at the 3' end of the LINE in the turtle genome was nearly identical to that of a family of tortoise SINEs. Since CR1-like LINEs are widespread in birds and in many other reptiles, including the turtle, and since the tortoise SINEs are only found in vertical-necked turtles, it seems possible that the sequence at the 3' end of the tortoise SINEs might have been generated by recombination with the CR1-like LINE in a common ancestor of vertical-necked turtles, after the divergence of side-necked turtles. We extended our observations to show that the 3'-end sequences of families of several tRNA-derived SINEs, such as the salmonid HpaI family, the tobacco TS family, and the salmon SmaI family, might have originated from the respective LINEs. Since it appears reasonable that the recognition sites of LINEs for reverse transcriptase are located within their 3'-end sequences, these results provide the basis for a general scheme for the mechanism by which SINEs might acquire retropositional activity. We propose here that tRNA-derived SINEs might have been generated by a recombination event in which a strong-stop DNA with a primer tRNA, which is an intermediate in the replication of certain retroviruses and long terminal repeat retrotransposons, was directly integrated at the 3' end of a LINE.