
Dataset Information

Transcription profiling of human Wilms tumor samples to predict relapse in favorable histology cases

ABSTRACT: The gene expression patterns of favorable histology Wilms tumors (FHWT) that relapsed were compared with those that did not relapse using oligonucleotide arrays.

Description: 250 FHWT of all stages, enriched for relapses and treated on National Wilms Tumor Study 5 (NWTS-5), passed quality parameters and were suitable for analysis using oligonucleotide arrays. Relapse risk stratification utilized Support Vector Machine; two- and ten-fold cross-validation was applied. The number of genes associated with relapse was less than that predicted by chance alone for 106 patients (32 relapses) with stage I and II FHWT, and no further analyses were performed on that group. This number was greater than expected by chance for 76 local stage III patients. Cross-validation including an additional 68 local stage III patients (total 144 patients, 53 relapses) demonstrated that classifiers for relapse composed of 50 genes were associated with a median sensitivity of 47%, specificity of 70%, and total error rate of 38%. Analysis of genes differentially expressed in relapse patients revealed apoptosis, Wnt signaling, the IGF pathway, and epigenetic modification to be mechanisms important in relapse. Potential therapeutic targets include FRAP/MTOR and CD40.

Experiment Overall Design: 144 stage III FHWT, with fifty-three relapses (cases) and ninety-one non-relapses (controls) with a minimum of three years of follow-up, included all relapses and a 30% random selection of non-relapses for which frozen tumor tissue was available and which passed all quality control parameters. The NWTS-5 protocol was approved by the review boards of the institutions that registered patients. Histological diagnosis and local stage were confirmed by central review.

Quality control steps taken:
1. Samples were snap frozen immediately following surgery, mailed on dry ice to the Tumor Bank, and retained at -80 °C.
2. Frozen sections were evaluated histologically, and tumors with less than 80% viable tumor cellularity were excluded.
3. Array images were assessed by eye to confirm scanner alignment and the absence of significant bubbles or scratches.
4. Samples for which the 3'/5' ratios for GAPDH were greater than 3.2 were excluded.
5. The BioB spike controls were confirmed as present; BioC, BioD and cre were confirmed as increasing in intensity.
6. When scaled to a target intensity of 2500, scaling factors were between 12 and 53, background levels were 34 to 115, Q values were 1.3 to 3.7, and mean intensities were within acceptable limits.
7. The range of percent present calls was from 38% to 52%.
8. Verification of gene expression was performed utilizing quantitative RT-PCR for five genes.

Statistical Analysis: Positional-dependent-nearest-neighbor model (PDNN) software was used to translate the scanned images into expression analysis files and to normalize the data across all arrays (http://odin.mdacc.tmc.edu/~zhangli/PerfectMatch/). Genes with maximum expression less than a log scale of 6 across all tumors, and Affymetrix control genes, were excluded, resulting in 20,931 probe sets for analysis. Support Vector Machine (SVM), as developed by Chang and Lin (http://www.csie.ntu.edu.tw/~cjlin/libsvm) and implemented in the R software package e1071, was chosen for relapse risk stratification, with genes selected by the p-value of the t-test comparison between cases and controls. Using all 144 tumors, 109 genes were identified with p-value <0.001; they are provided in the data table labelled "t-test comparison between case and control" (i.e., Supplemental Table 2 in the related publication).
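The filtering step and the "expected by chance" comparison can be illustrated with a minimal Python sketch. It is not the submitters' code. The toy expression values and probe names are made up, the function names are hypothetical, and the `AFFX` prefix is assumed as the marker for Affymetrix control probe sets. The record does not say how the chance expectation was estimated; the sketch uses the simplest approximation, the number of tests multiplied by the p-value cutoff.

```python
# Hypothetical toy data: probe-set ID -> log-scale expression across tumors.
toy_expression = {
    "AFFX-BioB-5_at": [9.1, 9.4, 9.0],  # control probe set, dropped
    "probe_low": [3.2, 5.9, 4.1],       # maximum below 6, dropped
    "probe_high": [5.0, 7.3, 6.8],      # maximum reaches 6, kept
}


def keep_probe_sets(expression, min_log_max=6.0, control_prefix="AFFX"):
    """Return IDs that are not controls and whose maximum log expression reaches the cutoff."""
    return [
        probe_id
        for probe_id, values in expression.items()
        if not probe_id.startswith(control_prefix) and max(values) >= min_log_max
    ]


def expected_by_chance(n_tests, p_cutoff):
    """Expected number of tests passing p_cutoff if no gene were truly associated."""
    return n_tests * p_cutoff


kept = keep_probe_sets(toy_expression)
chance = expected_by_chance(20931, 0.001)  # figures taken from the record

print(kept)                    # ['probe_high']
print(round(chance, 1))        # 20.9
print(round(109 / chance, 1))  # 5.2
```

Under this approximation, roughly 21 of the 20,931 probe sets would pass p < 0.001 by chance, so the 109 reported genes are about five times that number.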
Two- and ten-fold cross-validation was used to investigate the ability of classifiers established in a randomly selected training set to predict relapse in an independent test set comprising the remaining tumors. For two-fold cross-validation, the dataset was randomly divided 500 times into training and corresponding test sets of equal size, each including half the patients who relapsed. A classifier for relapse was identified for each training set and used to assign tumors in the corresponding test set to low- and high-risk categories. The training and test sets were then swapped. The number of top K genes in each classifier ranged from 1 to 150. Therefore, a total of 150,000 different classifiers were developed, one for each value of K from 1 to 150, for each of the 1,000 (500*2) training sets. For ten-fold cross-validation, the dataset was randomly divided 500 times into ten groups of approximately equal size, each including approximately the same number of relapses. For each such group, a classifier was built with the remaining 9/10 of the samples and then used to categorize tumors in the group as low or high risk; the process was repeated until all tumor samples were categorized as low or high risk. For all the cross-validation procedures, to avoid gene-selection bias, classifiers were completely rebuilt in each cross-validation iteration.
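The two-fold scheme can be sketched in Python to show where the figure of 150,000 classifiers comes from. This is only an outline under stated assumptions. The function names are hypothetical, the split rule for odd counts is a guess, and the step that would select the top K genes and fit the SVM on each training set is deliberately omitted. The last function checks that the reported sensitivity, specificity, and error rate are roughly consistent with 53 cases and 91 controls. Because the reported values are medians over many classifiers, exact agreement is not expected.

```python
import random


def stratified_halves(labels, rng):
    """Split sample indices into two halves, each holding about half of the relapses."""
    cases = [i for i, y in enumerate(labels) if y == 1]
    controls = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(cases)
    rng.shuffle(controls)
    mid_cases, mid_controls = len(cases) // 2, len(controls) // 2
    half_a = cases[:mid_cases] + controls[:mid_controls]
    half_b = cases[mid_cases:] + controls[mid_controls:]
    return half_a, half_b


def two_fold_plan(labels, n_repeats=500, max_k=150, seed=0):
    """Yield (train, test, k) for every classifier the scheme would build."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        half_a, half_b = stratified_halves(labels, rng)
        for train, test in ((half_a, half_b), (half_b, half_a)):  # swap roles
            for k in range(1, max_k + 1):
                # Gene selection and model fitting would use `train` only,
                # so each classifier is rebuilt from scratch in every iteration.
                yield train, test, k


def total_error_rate(sensitivity, specificity, n_cases, n_controls):
    """Fraction of all samples misclassified at the given sensitivity and specificity."""
    missed_cases = n_cases * (1 - sensitivity)
    false_alarms = n_controls * (1 - specificity)
    return (missed_cases + false_alarms) / (n_cases + n_controls)


labels = [1] * 53 + [0] * 91  # 53 relapses, 91 non-relapses
n_classifiers = sum(1 for _ in two_fold_plan(labels))
error = total_error_rate(0.47, 0.70, 53, 91)

print(n_classifiers)    # 150000
print(round(error, 2))  # 0.38
```

Keeping gene selection inside the loop, on the training half only, is what the text means by avoiding gene-selection bias: genes chosen using all 144 tumors would already have seen the test samples.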

ORGANISM(S): Homo sapiens

SUBMITTER: Elizabeth Perlman

PROVIDER: E-GEOD-10320 | biostudies-arrayexpress

REPOSITORIES: biostudies-arrayexpress


Similar Datasets

Project description: Background: Histologic grade in breast cancer provides clinically important prognostic information. However, 30%-60% of tumors are classified as histologic grade 2. This grade is associated with an intermediate risk of recurrence and is thus not informative for clinical decision making. We examined whether histologic grade was associated with gene expression profiles of breast cancers and whether such profiles could be used to improve histologic grading. Methods: We analyzed microarray data from 189 invasive breast carcinomas and from three published gene expression datasets from breast carcinomas. We identified differentially expressed genes in a training set of 64 estrogen receptor (ER)-positive tumor samples by comparing expression profiles between histologic grade 3 tumors and histologic grade 1 tumors and used the expression of these genes to define the gene expression grade index. Data from 597 independent tumors were used to evaluate the association between relapse-free survival and the gene expression grade index in a Kaplan-Meier analysis. All statistical tests were two-sided. Results: We identified 97 genes in our training set that were associated with histologic grade; most of these genes were involved in cell cycle regulation and proliferation. In validation datasets, the gene expression grade index was strongly associated with histologic grade 1 and 3 status; however, among histologic grade 2 tumors, the index spanned the values for histologic grade 1-3 tumors. Among patients with histologic grade 2 tumors, a high gene expression grade index was associated with a higher risk of recurrence than a low gene expression grade index (hazard ratio = 3.61, 95% confidence interval = 2.25 to 5.78; P<.001, log-rank test). Conclusions: Gene expression grade index appeared to reclassify patients with histologic grade 2 tumors into two groups with high versus low risks of recurrence. This approach may improve the accuracy of tumor grading and thus its prognostic value.

NB: The patients from Uppsala Hospital have also been used in other studies, such as GSE3494. The common set of patients can be found by removing the abbreviation "UPP_" from the sample names and comparing the results with the "INDEX (ID)" from the GSE3494 series.

Experiment Overall Design: 64 microarray experiments from primary breast tumors were used as a training set to identify genes differentially expressed in grades 1 and 3. 129 microarray experiments from primary breast tumors of untreated patients were used as a validation set to validate the list of genes and its correlation with survival. No replicate, no reference sample.

NOTE: There are some inconsistencies between the sample annotation provided by GEO for this experiment in the GSE2990_family.soft.gz file and the supplementary data file GSE2990_suppl_info.txt.

Project description: Rationale: We previously generated genome-wide expression data in children with septic shock, based on whole blood-derived RNA, having the potential to lead the field into novel areas of investigation. Objective: Herein we seek to validate our data through a bioinformatic approach centered on a validation patient cohort. Methods: Microarray- and bioinformatics-centered analyses involving our previous data as a training data set (n = 42) and a new, validation cohort (n = 30) as the test data set. Measurements and Main Results: Class prediction modeling using the training data set and the previously reported genome-wide expression signature of pediatric septic shock correctly identified 93 to 100% of septic shock patients in the test data set, depending on the class prediction algorithm and the gene selection method. Subjecting the test data set to an identical filtering strategy as that used for the training data set demonstrated 72% concordance between the two gene lists. Subjecting the test data set to a purely statistical filtering strategy, with highly stringent correction for multiple comparisons, demonstrated less than 50% concordance with the previous gene filtering strategy. However, functional analysis of this statistics-based gene list demonstrated similar functional annotations and signaling pathways as those seen in the learning data set. In particular, we validated that pediatric septic shock is characterized by large-scale repression of genes related to zinc homeostasis and lymphocyte function. Conclusions: These data demonstrate that the previously reported genome-wide expression signature of pediatric septic shock is applicable to a validation cohort of patients.

Experiment Overall Design: Table 1: Clinical and demographic data for all subjects in test data set.

                             Controls                               Septic Shock
No. of individual subjects   15                                     30
Mean age (years) ± S.D.      3.1 ± 3.5                              3.2 ± 2.9
Mean PRISM Score ± S.D.      n/a                                    18.9 ± 12.3
Gender (Male/Female)         8/7                                    16/14
Race (no.)                   A.A./Black (6), Asian (4), White (5)   A.A./Black (2), White (26), Unreported (2)

Project description: Background: Published multi-gene classifiers suggested outcome prediction for patients with stage UICC II colon cancer based on different gene expression signatures. However, there is currently no translation of these classifiers for application in routine diagnostics. Therefore, we aimed at validating our own and published gene expression signatures employing methods which enable RNA and protein detection in routine diagnostic specimens. Results: Immunohistochemistry was applied to 68 stage UICC II colon cancers to determine the protein expression of five selected previously published classifier genes (CDH17, LAT, CA2, EMR3, and TNFRSF11A). Correlation of protein expression data with clinical outcome within a 5-year post-surgery course failed to separate patients with a disease-free follow-up (Group DF) from those with relapse (Group R). In addition, RNA from macrodissected tumor samples from 53 of these 68 patients was profiled on Affymetrix GeneChips (HG-U133 Plus 2.0). Prognostic signatures were generated by Nearest Shrunken Centroids with cross-validation. Although gene expression profiling allowed the identification of differentially expressed genes between the groups DF and R, a stable classification and prognosis signature was not discernible in our data. Furthermore, the application of previously published gene signatures consisting of 22 and 19 genes, respectively, to our gene expression data set using ‘global tests’ and leave-one-out cross-validation was unable to predict clinical outcome (prediction rate 75.5% and 64.2%; n.s.). T-stage was the only independent prognostic factor for relapse in multivariate analysis with established clinical and pathological parameters including microsatellite status. Conclusions: Our protein and gene expression analyses currently do not support application of molecular classifiers for prediction of clinical outcome in routine diagnostics as a basis for patient-orientated therapy in stage UICC II colon cancer. Further studies are needed to develop prognosis signatures applicable in patient care.

OmicsDI is part of the ELIXIR infrastructure
