Serial analysis of gene expression (SAGE) in normal human trabecular meshwork.
ABSTRACT: To identify the genes expressed in normal human trabecular meshwork tissue, a tissue critical to the pathogenesis of glaucoma. Total RNA was extracted from human trabecular meshwork (HTM) harvested from 3 different donors. Extracted RNA was used to synthesize individual SAGE (serial analysis of gene expression) libraries using the I-SAGE Long kit from Invitrogen. Libraries were analyzed using SAGE 2000 software to extract the 17 base pair sequence tags. The extracted sequence tags were mapped to the genome using SAGE Genie map. A total of 298,834 SAGE tags were identified from all HTM libraries (96,842, 88,126, and 113,866 tags, respectively). Collectively, there were 107,325 unique tags. There were 10,329 unique tags with a minimum of 2 counts from a single library. These tags were mapped to known unique Unigene clusters. Approximately 29% of the tags (orphan tags) did not map to a known Unigene cluster. Thirteen percent of the tags mapped to at least 2 Unigene clusters. Sequence tags from many glaucoma-related genes, including myocilin, optineurin, and WD repeat domain 36, were identified. This is the first time SAGE analysis has been used to characterize the gene expression profile in normal HTM. SAGE analysis provides an unbiased sampling of gene expression of the target tissue. These data will provide new and valuable information to improve understanding of the biology of human aqueous outflow.
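The filter mentioned in the abstract (unique tags with a minimum of 2 counts from a single library) can be sketched as follows. This is a minimal illustration under stated assumptions, not the SAGE 2000 workflow: the function name, the dictionary layout, and the toy tags are all hypothetical.

```python
def tags_passing_min_count(libraries, min_count=2):
    """Return tags seen at least `min_count` times in any single library.

    `libraries` is a list of dicts mapping tag sequence -> count
    (a hypothetical layout chosen for this sketch).
    """
    return {
        tag
        for counts in libraries
        for tag, n in counts.items()
        if n >= min_count
    }


# Toy per-library tag counts (made-up 17-bp tags, not real data).
libraries = [
    {"ACGTACGTACGTACGTA": 3, "TTTTTTTTTTTTTTTTT": 1},
    {"TTTTTTTTTTTTTTTTT": 1, "GGGGGGGGGGGGGGGGG": 2},
]
kept = tags_passing_min_count(libraries)
```

Note that a tag seen once in each of two libraries is not kept here, because the threshold applies within a single library rather than to the pooled count.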
Project description:Serial Analysis of Gene Expression (SAGE) is a powerful tool to determine gene expression profiles. Two types of SAGE libraries, ShortSAGE and LongSAGE, are classified based on the length of the SAGE tag (10 vs. 17 basepairs). LongSAGE libraries are thought to be more useful than ShortSAGE libraries, but their information content has not been widely compared. To dissect the differences between these two types of libraries, we utilized four libraries (two LongSAGE and two ShortSAGE libraries) generated from the hippocampus of Alzheimer and control samples. In addition, we generated two additional short SAGE libraries, the truncated long SAGE libraries (tSAGE), from LongSAGE libraries by deleting seven 5' basepairs from each LongSAGE tag. One problem that occurred in the SAGE study is that individual tags may have matched to multiple different genes - due to the short length of a tag. We found that the LongSAGE tag maps up to 15 UniGene clusters, while the ShortSAGE and tSAGE tags map up to 279 UniGene clusters. Both long and short SAGE libraries exhibit a large number of orphan tags (no gene information in UniGene), implying the limitation of the UniGene database. Among 100 orphan LongSAGE tags, the complete sequences (17 basepairs) of nine orphan tags match to 17 genomic sequences; four of the orphan tags match to a single genomic sequence. Our data show the potential to resolve 4-9% of orphan LongSAGE tags. Finally, among 400 tSAGE tags showing significant differential expression between AD and control, 79 tags (19.8%) were derived from multiple non-significant LongSAGE tags, implying the false positive results. Our data show that LongSAGE tags have high specificity in gene mapping compared to ShortSAGE tags. LongSAGE tags show an advantage over ShortSAGE in identifying novel genes by BLAST analysis. 
Most importantly, the chances of obtaining false positive results are higher for ShortSAGE than for LongSAGE libraries because ShortSAGE tags are less specific in gene mapping. Therefore, we recommend considering the number of UniGene clusters (genes or ESTs) that a tag maps to when prioritizing significant results.
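The tSAGE construction described above, and the way several distinct LongSAGE tags can collapse into one short tag, can be sketched as follows. This is a hedged illustration only: the function names and toy tags are hypothetical, and the truncation follows the text's description (dropping seven 5' bases from each 17-bp tag).

```python
from collections import defaultdict


def truncate_long_tag(tag, drop=7):
    """Drop the first `drop` (5') bases of a 17-bp LongSAGE tag."""
    if len(tag) != 17:
        raise ValueError(f"expected a 17-bp tag, got {len(tag)} bp")
    return tag[drop:]


def collapse_to_tsage(long_counts, drop=7):
    """Group LongSAGE tags by the truncated tag they collapse into."""
    merged = defaultdict(list)
    for tag, count in long_counts.items():
        merged[truncate_long_tag(tag, drop)].append((tag, count))
    return dict(merged)


def ambiguous_tsage_tags(merged):
    """Truncated tags that are derived from more than one LongSAGE tag."""
    return {short: src for short, src in merged.items() if len(src) > 1}


# Toy counts (made-up tags, not real data).
long_counts = {
    "AAAAAAACCCCCCCCCC": 5,
    "GGGGGGGCCCCCCCCCC": 3,
    "TTTTTTTACGTACGTAC": 8,
}
merged = collapse_to_tsage(long_counts)
ambiguous = ambiguous_tsage_tags(merged)
```

In this toy input, two different LongSAGE tags share the same 10-bp remainder, so their counts would be summed under one tSAGE tag. That pooling is the mechanism by which a short tag can appear significant even when none of its source LongSAGE tags is.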
Project description:In this study, we sequenced 691,390 SAGE tags from four libraries. Cervical L-SAGE libraries N1, N2, C1, and C2 were sequenced to 165,624, 181,224, 173,534, and 171,008 tags, respectively. Duplicate ditags were eliminated from analysis, resulting in 136,276, 139,656, 154,828, and 136,386 useful tags, respectively, and a total of 24,058 unique tags. 15,438 of the unique tags mapped to annotated UniGene identifiers. We characterized the transcriptome of normal cervical tissue and evaluated the highly expressed genes in terms of tissue specificity, conserved expression among the normal libraries and their altered expression in CIN III lesions. Keywords: Cervical Epithelium, Long SAGE. Four Long SAGE libraries were created from cervical epithelium biopsies. Two were CIN III and two were normal cervical tissue.
Project description:As a growing number of complementary transcripts with potential regulatory functions are being found in eukaryotes, high-throughput analytical methods are needed to investigate their expression in multiple biological samples. Serial Analysis of Gene Expression (SAGE), based on the enumeration of directionally reliable short cDNA sequences (tags), is capable of revealing antisense transcripts. We initially detected them by observing tags that mapped on to the reverse complement of known mRNAs. The presence of such tags in individual SAGE libraries suggested that SAGE datasets contain latent information on antisense transcripts. We built a collection of virtual tags for mining these data. Tag pairs were assembled by searching for complementarities between 24-nt long sequences centered on the potential SAGE-anchoring sites of well-annotated human expressed sequences. An analysis of their presence in a large collection of published SAGE libraries revealed transcripts expressed at high levels from both strands of two adjacent, oppositely oriented, transcription units. In other cases, the respective transcripts of such cis-oriented genes displayed a mutually exclusive expression pattern or were co-expressed in a small number of libraries. Other tag pairs revealed overlapping transcripts of trans-encoded unique genes. Finally, we isolated a group of tags shared by multiple transcripts. Most of them mapped on to retroelements, essentially represented in humans by Alu sequences inserted in opposite orientations in the 3'UTR of otherwise different mRNAs. Registering these tags in separate files makes possible computational searches focused on unique sense-antisense pairs. 
The method developed in the present work shows that SAGE datasets constitute a major resource for rapidly investigating the expression of antisense transcripts with high sensitivity: even a single tag can be detected in one library when screening a large number of biological samples.
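The idea of pairing a sense tag with a virtual antisense tag around an anchoring site can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes the anchoring site is CATG (the NlaIII site commonly used in SAGE, which the text does not name), a 10-nt tag on each side (so that 10 + 4 + 10 gives a 24-nt window), and hypothetical function names and a made-up sequence.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def extract_tag(transcript, anchor="CATG", tag_len=10):
    """Tag immediately downstream of the 3'-most anchor site, or None."""
    pos = transcript.rfind(anchor)
    if pos == -1:
        return None
    start = pos + len(anchor)
    tag = transcript[start:start + tag_len]
    return tag if len(tag) == tag_len else None


def sense_and_antisense_tags(mrna, tag_len=10):
    """Virtual tags expected from a transcript and from its opposite strand."""
    mrna = mrna.upper()
    sense = extract_tag(mrna, tag_len=tag_len)
    antisense = extract_tag(reverse_complement(mrna), tag_len=tag_len)
    return sense, antisense


# Made-up transcript with a single CATG site (assumption for illustration).
mrna = "AAAAACCCCC" + "CATG" + "ACGTACGTAC" + "GG"
sense, antisense = sense_and_antisense_tags(mrna)
```

Because CATG is its own reverse complement, the same site anchors both tags: the sense tag is read downstream of it, and the antisense tag is the reverse complement of the bases immediately upstream.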
Project description:Gene expression levels are regulated at many levels. Integration of genome-wide analyses for the study of DNA and RNA provides a unique tool to detect genetic alterations in the cancer genome. In this study, we generated and integrated DNA amplification data from comparative genomic hybridization (CGH) and serial analyses of gene expression (SAGE) in order to obtain a molecular profile of gastroesophageal junction (GEJ) carcinomas. DNA amplifications mapped to specific chromosomal regions and were frequently seen at 1q, 4q, 5q, 6p, 7p, 8q, 17q, and 20q. Using SAGE, we obtained over 156,432 tags from GEJ adenocarcinomas and normal gastric mucosa. These tags were assigned to UniGene clusters. Chromosomal positions for overexpressed genes were obtained to produce a GEJ carcinoma transcriptome map. A total of 123 genes was significantly overexpressed (more than fivefold; P <.01) in one or more SAGE libraries. This gene overexpression map was integrated and compared to the chromosomal CGH ideogram. Several chromosomal arms that had frequent DNA amplifications showed frequent gene expression alterations such as chromosomes 1 (15 genes), 2 (9 genes), 6 (6 genes), 11 (6 genes), 12 (8 genes), and 17 (13 genes). Despite the relatively large DNA amplification regions, overexpressed genes frequently mapped and clustered to small chromosomal regions at early-replicating (Giemsa light) bands such as 1q21.3 (nine genes), 6p21.3 (five genes), and 17q21 (eight genes). These results provide a comprehensive tool to search for DNA amplifications and overexpressed genes in GEJ carcinoma. The observed phenomenon of the presence of large amplification areas, yet clustering of overexpressed genes to relatively small loci, may suggest a high organization of chromatin and cancer-related genes in the nucleus.
Project description:Neural tube defects (NTDs) are common human birth defects with a complex etiology. To develop a comprehensive knowledge of the genes expressed during normal neurulation, we established transcriptomes from human neural tube fragments during and after neurulation using long Serial Analysis of Gene Expression (long-SAGE). Rostral and caudal neural tubes were dissected from normal human embryos aged between 26 and 32 days of gestation. Tissues from the same region and Carnegie stage were pooled (n ≥ 4) and total RNA extracted to construct four long-SAGE libraries. Tags were mapped using the UniGene Homo sapiens 17 bp tag-to-gene best mapping set. Differentially expressed genes were identified by chi-square or Fisher's exact test, and validation was performed for a subset of those transcripts using in situ hybridization. In silico analyses were performed with BinGO and EXPANDER. We observed most genes to be similarly regulated in rostral and caudal regions, but expression profiles differed during and after closure. In silico analysis found similar enrichments in both regions for biologic process terms, transcription factor binding and miRNA target motifs. Twelve genes potentially expressing alternate isoforms by region or developmental stage, and the microRNAs miR-339-5p, miR-141/200a, miR-23ab, and miR-129/129-5p are among several potential candidates identified here for future research. Time appears to influence gene expression in the developing central nervous system more than location. These data provide a novel complement to traditional strategies of identifying genes associated with human NTDs and offer unique insight into the genes associated with normal human neurulation.
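A Fisher's exact test on a single tag's counts in two libraries can be sketched as follows. This is a hedged, standard-library illustration and not the authors' analysis code: the function name, the 2x2 table layout, and the example counts are assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    For a SAGE tag: a = tag count in library 1, b = all other tags in
    library 1, c = tag count in library 2, d = all other tags in library 2.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x tag copies falling in library 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum probabilities of all tables at least as extreme as the observed one.
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1))
            if q <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Hypothetical counts: a tag seen 30 times among 50,000 tags in one library
# and 10 times among 50,000 tags in the other.
p_value = fisher_exact_two_sided(30, 50_000 - 30, 10, 50_000 - 10)
```

The exact test is the usual choice when a tag's counts are small; for larger counts a chi-square test gives similar answers at lower cost, which may be why the text lists both.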
Project description:The analysis of differentially expressed genes is a powerful approach to elucidate the genetic mechanisms underlying the morphological and evolutionary diversity among serially homologous structures, both within the same organism (e.g., hand vs. foot) and between different species (e.g., hand vs. wing). In the developing embryo, limb-specific expression of Pitx1, Tbx4, and Tbx5 regulates the determination of limb identity. However, numerous lines of evidence, including the fact that these three genes encode transcription factors, indicate that additional genes are involved in the Pitx1-Tbx hierarchy. To examine the molecular distinctions coded for by these factors, and to identify novel genes involved in the determination of limb identity, we have used Serial Analysis of Gene Expression (SAGE) to generate comprehensive gene expression profiles from intact, developing mouse forelimbs and hindlimbs. To minimize the extraction of erroneous SAGE tags from low-quality sequence data, we used a new algorithm to extract tags from -analyzed sequence data and obtained 68,406 and 68,450 SAGE tags from forelimb and hindlimb SAGE libraries, respectively. We also developed an improved method for determining the identity of SAGE tags that increases the specificity of and provides additional information about the confidence of the tag-UniGene cluster match. The most differentially expressed gene between our SAGE libraries was Pitx1. The differential expression of Tbx4, Tbx5, and several limb-specific Hox genes was also detected; however, their abundances in the SAGE libraries were low. Because numerous other tags were differentially expressed at this low level, we performed a 'virtual' subtraction with 362,344 tags from six additional nonlimb SAGE libraries to further refine this set of candidate genes. This subtraction reduced the number of candidate genes by 74%, yet preserved the previously identified regulators of limb identity. 
This study presents the gene expression complexity of the developing limb and identifies candidate genes involved in the regulation of limb identity. We propose that our computational tools and the overall strategy used here are broadly applicable to other SAGE-based studies in a variety of organisms. [SAGE data are all available at GEO (http://www.ncbi.nlm.nih.gov/geo/) under accession nos. GSM55 and GSM56, which correspond to the forelimb and hindlimb raw SAGE data.]
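The 'virtual' subtraction step can be sketched as follows. The text does not state the exact rule used, so this is only one plausible reading: candidate tags are dropped if their pooled count across the non-limb libraries exceeds a threshold. The function name, the `max_background` parameter, and the toy tags are hypothetical.

```python
from collections import Counter


def virtual_subtraction(candidates, background_libraries, max_background=0):
    """Keep candidate tags whose pooled background count is <= max_background.

    `candidates` is an iterable of tag sequences; `background_libraries` is a
    list of dicts mapping tag -> count (an assumed layout for this sketch).
    """
    pooled = Counter()
    for library in background_libraries:
        pooled.update(library)
    return {tag for tag in candidates if pooled[tag] <= max_background}


# Toy data (made-up tags, not real limb or nonlimb libraries).
candidates = {"AAAAAAAAAA", "CCCCCCCCCC", "GGGGGGGGGG"}
background = [
    {"CCCCCCCCCC": 4, "TTTTTTTTTT": 9},
    {"CCCCCCCCCC": 1, "GGGGGGGGGG": 1},
]
strict = virtual_subtraction(candidates, background)
lenient = virtual_subtraction(candidates, background, max_background=1)
```

A stricter threshold removes more candidates but risks discarding genes that are expressed at low levels outside the limb, so the cutoff is a trade-off rather than a fixed value.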
Project description:To investigate the role of miR-29b in changes in the expression of genes involved in the synthesis and deposition of extracellular matrix in human trabecular meshwork (HTM) cells. Overall design: One HTM cell line was transfected in triplicate with microRNA 29b or scramble control. After 3 days, total RNA was extracted.
Project description:More than half of the approximately 500,000 women diagnosed with cervical cancer worldwide each year will die from this disease. Investigation of genes expressed in precancer lesions compared to those expressed in normal cervical epithelium will yield insight into the early stages of disease. As such, establishing a baseline against which to compare is critical in elucidating the abnormal biology of disease. In this study we examine the normal cervical tissue transcriptome and investigate the similarities and differences in relation to CIN III by Long-SAGE (L-SAGE). We have sequenced 691,390 tags from four L-SAGE libraries, increasing the existing gene expression data on cervical tissue 20-fold. One hundred and eighteen unique tags were highly expressed in normal cervical tissue and 107 of them mapped to unique genes; most belong to the ribosomal, calcium-binding and keratinizing gene families. We assessed these genes for aberrant expression in CIN III, and five genes showed altered expression. In addition, we have identified twelve unique HPV 16 SAGE tags in the CIN III libraries that are absent in the normal libraries. Establishing a baseline of gene expression in normal cervical tissue is key for identifying changes in cancer. We demonstrate the utility of this baseline data by identifying genes with aberrant expression in CIN III when compared to normal tissue.
Project description:To investigate the role of miR-29b in changes in the expression of genes involved in the synthesis and deposition of extracellular matrix in human trabecular meshwork (HTM) cells. Experiment Overall Design: One HTM cell line was transfected in triplicate with microRNA 29b or scramble control. After 3 days, total RNA was extracted.
Project description:Serial analysis of gene expression (SAGE) provides quantitative and comprehensive expression profiling in a given cell population. In our efforts to define gene expression alterations in Barrett's-related adenocarcinomas (BA), we produced eight SAGE libraries and obtained a total of 457,894 expressed tags with 32,035 (6.9%) accounting for singleton tags. The tumor samples produced an average of 71,804 tags per library, whereas normal samples produced an average of 42,669 tags per library. Our libraries contained 67,200 unique tags representing 16,040 known gene symbols. Five hundred and sixty-eight unique tags were differentially expressed between BAs and normal tissue samples (at least twofold; P ≤ 0.05); 395 of these matched to known genes. Interestingly, the distribution of altered genes was not uniform across the human genome. Overexpressed genes tended to cluster in well-defined hot spots located in certain chromosomes. For example, chromosome 19 had 26 overexpressed genes, of which 18 mapped to 19q13. Using the gene ontology approach for functional classification of genes, we identified several groups that are relevant to carcinogenesis. We validated the SAGE results of five representative genes (ANPEP, ECGF1, PP1201, EIF5A1, and GKN1) using quantitative real-time reverse-transcription PCR on 31 BA samples and 26 normal samples. In addition, we performed an immunohistochemistry analysis for ANPEP, which demonstrated overexpression of ANPEP in 6/7 (86%) Barrett's dysplasias and 35/65 (54%) BAs. ANPEP is a secreted protein that may have diagnostic and/or prognostic significance for Barrett's progression. The use of genomic approaches in this study provided useful information about the molecular pathobiology of BAs.