Enhanced identification of significant regulators of gene expression.
ABSTRACT: BACKGROUND:Diseases like cancer lead to changes in gene expression, and it is relevant to identify key regulatory genes that can be linked directly to these changes. This can be done by computing a Regulatory Impact Factor (RIF) score for relevant regulators. However, this computation is based on estimating correlated patterns of gene expression, often with Pearson correlation, and on an assumption about a set of specific regulators, normally transcription factors. This study explores alternative measures of correlation, using the Fisher and Sobolev metrics, and an extended set of regulators, including epigenetic regulators and long non-coding RNAs (lncRNAs). Data on prostate cancer have been used to explore the effect of these modifications. RESULTS:A tool for computation of RIF scores with alternative correlation measures and extended sets of regulators was developed and tested on gene expression data for prostate cancer. The study showed that the Fisher and Sobolev metrics lead to improved identification of well-documented regulators of gene expression in prostate cancer, and the sets of identified key regulators showed improved overlap with previously defined gene sets of relevance to cancer. The extended set of regulators led to the identification of several interesting candidates for further studies, including lncRNAs. Several key processes were identified as important, including spindle assembly and the epithelial-mesenchymal transition (EMT). CONCLUSIONS:The study has shown that using alternative metrics of correlation can improve the performance of tools based on correlation of gene expression in genomic data. The Fisher and Sobolev metrics should also be considered in other correlation-based applications.
Project description:BACKGROUND:Almost 16,000 human long non-coding RNA (lncRNA) genes have been identified in the GENCODE project. However, the function of most of them remains to be discovered. The function of lncRNAs and other novel genes can be predicted by identifying significantly enriched annotation terms in already annotated genes that are co-expressed with the lncRNAs. However, such approaches are sensitive to the methods that are used to estimate the level of co-expression. RESULTS:We have tested and compared two well-known statistical metrics (Pearson and Spearman) and two geometrical metrics (Sobolev and Fisher) for identification of the co-expressed genes, using experimental expression data across 19 normal human tissues. We have also used a benchmarking approach based on semantic similarity to evaluate how well these methods are able to predict annotation terms, using a well-annotated set of protein-coding genes. CONCLUSION:This work shows that geometrical metrics, in particular in combination with the statistical metrics, will predict annotation terms more efficiently than traditional approaches. Tests on selected lncRNAs confirm that it is possible to predict the function of these genes given a reliable set of expression data. The software used for this investigation is freely available.
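The sensitivity of co-expression estimates to the chosen metric can be illustrated with the two statistical metrics named in the abstract above. The following is a minimal sketch, not the authors' software: the expression values are invented toy numbers, and the Sobolev and Fisher metrics are not shown because their definitions are not given here.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical expression of two genes across five tissues (toy data).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [1.0, 4.0, 9.0, 16.0, 25.0]  # monotone but non-linear in gene_a

r_pearson = pearson(gene_a, gene_b)
r_spearman = spearman(gene_a, gene_b)
print(f"Pearson: {r_pearson:.3f}, Spearman: {r_spearman:.3f}")
```

For a monotone but non-linear relationship such as this one, the rank-based Spearman value reaches 1 while the Pearson value stays below 1, so the same gene pair could fall on different sides of a co-expression cutoff depending on the metric.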
Project description:The k-Nearest Neighbor (kNN) classifier represents a simple and very general approach to classification. Still, the performance of kNN classifiers can often compete with more complex machine-learning algorithms. The core of kNN depends on a "guilt by association" principle where classification is performed by measuring the similarity between a query and a set of training patterns, often computed as distances. The relative performance of kNN classifiers is closely linked to the choice of distance or similarity measure, and it is therefore relevant to investigate the effect of using different distance measures when comparing biomedical data. In this study on classification of cancer data sets, we have used both common and novel distance measures, including the novel distance measures Sobolev and Fisher, and we have evaluated the performance of kNN with these distances on 4 cancer data sets of different type. We find that the performance when using the novel distance measures is comparable to the performance with more well-established measures, in particular for the Sobolev distance. We define a robust ranking of all the distance measures according to overall performance. Several distance measures show robust performance in kNN over several data sets, in particular the Hassanat, Sobolev, and Manhattan measures. Some of the other measures show good performance on selected data sets but seem to be more sensitive to the nature of the classification data. It is therefore important to benchmark distance measures on similar data prior to classification to identify the most suitable measure in each case.
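The dependence of kNN on the distance measure can be made concrete with a classifier that takes the distance as a parameter. This is a minimal sketch under stated assumptions: the samples and labels are invented toy data, the function names are hypothetical, and only the Manhattan and Euclidean distances are shown (the Hassanat, Sobolev, and Fisher measures are not defined in the abstract above).

```python
import math
from collections import Counter

def manhattan(a, b):
    """Manhattan (city-block) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_x, train_y, query, k=3, distance=manhattan):
    """Classify `query` by majority vote among its k nearest training samples."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature samples (toy data, not from the study).
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]

query = (1.1, 1.0)
label_manhattan = knn_predict(train_x, train_y, query, k=3, distance=manhattan)
label_euclidean = knn_predict(train_x, train_y, query, k=3, distance=euclidean)
print(label_manhattan, label_euclidean)
```

Passing the distance as a function makes it straightforward to benchmark several measures on the same data before settling on one, which is the practice the abstract recommends.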
Project description:The androgen-androgen receptor signaling pathway plays an important role in the pathogenesis of prostate cancer. Accordingly, androgen deprivation has been the most effective endocrine therapy for hormone-dependent prostate cancer. Here, we report a novel pregnane X receptor (PXR)-mediated and metabolism-based mechanism to reduce androgenic tone. PXR is a nuclear receptor previously known as a xenobiotic receptor regulating the expression of drug metabolizing enzymes and transporters. We showed that genetic (using a PXR transgene) or pharmacological (using a PXR agonist) activation of PXR lowered androgenic activity and inhibited androgen-dependent prostate regeneration in castrated male mice that received daily injections of testosterone propionate by inducing the expression of cytochrome P450 (CYP)3As and hydroxysteroid sulfotransferase (SULT)2A1, which are enzymes important for the metabolic deactivation of androgens. In human prostate cancer cells, treatment with the PXR agonist rifampicin (RIF) inhibited androgen-dependent proliferation of LAPC-4 cells but had little effect on the growth of the androgen-independent isogenic LA99 cells. Down-regulation of PXR or SULT2A1 in LAPC-4 cells by short hairpin RNA or small interfering RNA abolished the RIF effect, indicating that the inhibitory effect of RIF on androgens was PXR and SULT2A1 dependent. In summary, we have uncovered a novel function of PXR in androgen homeostasis. PXR may represent a novel therapeutic target to lower androgen activity and may aid in the treatment and prevention of hormone-dependent prostate cancer.
Project description:Assessment of dynamic functional brain connectivity based on functional magnetic resonance imaging (fMRI) data is an increasingly popular strategy to investigate temporal dynamics of the brain's large-scale network architecture. Current practice when deriving connectivity estimates over time is to use the Fisher transformation, which aims to stabilize the variance of correlation values that fluctuate around varying true correlation values. It is, however, unclear how well the stabilization of signal variance performed by the Fisher transformation works for each connectivity time series, when the true correlation is assumed to be fluctuating. This is of importance because many subsequent analyses either assume or perform better when the time series have stable variance or adhere to an approximate Gaussian distribution. In this article, using simulations and analysis of resting-state fMRI data, we analyze the effect of applying different variance stabilization strategies on connectivity time series. We focus our investigation on the Fisher transformation, the Box-Cox (BC) transformation and an approach that combines both transformations. Our results show that, if the intention of stabilizing the variance is to use metrics on the time series, where stable variance or a Gaussian distribution is desired (e.g., clustering), the Fisher transformation is not optimal and may even skew connectivity time series away from being Gaussian. Furthermore, we show that the suboptimal performance of the Fisher transformation can be substantially improved by including an additional BC transformation after the dynamic functional connectivity time series has been Fisher transformed.
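For reference, the Fisher transformation of a correlation value r is z = artanh(r). The sketch below applies it to sliding-window correlations between two signals. It is an illustration only: the signal values, the window length, and the function names are assumptions, and the additional Box-Cox step discussed in the abstract above is not implemented.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long signals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def fisher_z(r, limit=0.999999):
    """Fisher transformation z = artanh(r).

    r is clipped to (-limit, limit) because artanh diverges at +/-1.
    """
    clipped = max(-limit, min(limit, r))
    return math.atanh(clipped)

def sliding_window_connectivity(signal_a, signal_b, window):
    """Fisher-transformed correlation for each window position."""
    positions = range(len(signal_a) - window + 1)
    return [fisher_z(pearson(signal_a[i:i + window], signal_b[i:i + window]))
            for i in positions]

# Hypothetical signals from two brain regions (toy values).
region_a = [0.1, 0.5, 0.3, 0.9, 0.4, 0.8, 0.2, 0.7]
region_b = [0.2, 0.4, 0.5, 0.7, 0.3, 0.9, 0.1, 0.5]

z_series = sliding_window_connectivity(region_a, region_b, window=4)
print([round(z, 3) for z in z_series])
```

The transformation is inverted with tanh, so a z value can always be mapped back to the correlation scale for reporting.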
Project description:We develop a general method to identify gene networks from pair-wise correlations between genes in a microarray data set and apply it to a public prostate cancer gene expression data set from 69 primary prostate tumors. We define the degree of a node as the number of genes significantly associated with the node and identify hub genes as those with the highest degree. The correlation network was pruned using transcription factor binding information in VisANT (http://visant.bu.edu/) as a biological filter. The reliability of hub genes was determined using a strict permutation test. Separate networks for normal prostate samples, and prostate cancer samples from African Americans (AA) and European Americans (EA) were generated and compared. We found that the same hubs control disease progression in AA and EA networks. Combining AA and EA samples, we generated networks for low (<7) and high (≥7) Gleason grade tumors. A comparison of their major hubs with those of the network for normal samples identified two types of changes associated with disease: (i) Some hub genes increased their degree in the tumor network compared to their degree in the normal network, suggesting that these genes are associated with gain of regulatory control in cancer (e.g. possible turning on of oncogenes). (ii) Some hubs reduced their degree in the tumor network compared to their degree in the normal network, suggesting that these genes are associated with loss of regulatory control in cancer (e.g. possible loss of tumor suppressor genes). A striking result was that for both AA and EA tumor samples, STAT5a, CEBPB and EGR1 are major hubs that gain neighbors compared to the normal prostate network. Conversely, HIF-1α is a major hub that loses connections in the prostate cancer network compared to the normal prostate network.
We also find that the degree of these hubs changes progressively from normal to low grade to high grade disease, suggesting that these hubs are master regulators of prostate cancer and mark disease progression. STAT5a was identified as a central hub, with ~120 neighbors in the prostate cancer network and only 81 neighbors in the normal prostate network. Of the 120 neighbors of STAT5a, 57 are known cancer related genes, known to be involved in functional pathways associated with tumorigenesis. Our method is general and can easily be extended to identify and study networks associated with any two phenotypes.
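The degree-based definition of hub genes used in the abstract above can be sketched as follows. This is a simplified illustration, not the authors' method: a fixed absolute-correlation threshold stands in for the significance test, the permutation test and the VisANT pruning step are omitted, and the gene names and expression values are invented.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def node_degrees(expression, threshold=0.8):
    """Count, for each gene, the genes whose |correlation| with it meets the threshold.

    Assumption: the fixed threshold replaces the significance criterion
    described in the abstract.
    """
    degree = {gene: 0 for gene in expression}
    for gene_1, gene_2 in combinations(expression, 2):
        if abs(pearson(expression[gene_1], expression[gene_2])) >= threshold:
            degree[gene_1] += 1
            degree[gene_2] += 1
    return degree

# Hypothetical expression profiles across five samples (toy data).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 3.0, 1.0],
}

degrees = node_degrees(expression)
hubs = sorted(degrees, key=degrees.get, reverse=True)
print(degrees, hubs[0])
```

Computing such degrees separately for a normal and a tumor network and comparing them gene by gene gives the gain or loss of neighbors that the abstract uses to characterize hubs.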
Project description:There is preclinical evidence that the oral administration of d,l-sulforaphane (SFN) can decrease the incidence or burden of early-stage prostate cancer [prostatic intraepithelial neoplasia (PIN)] and well-differentiated cancer (WDC) but not late-stage poorly differentiated cancer (PDC). Because SFN treatment induces cytoprotective autophagy in cultured human prostate cancer cells, the present study tested the hypothesis that chemopreventive efficacy of SFN could be augmented by the pharmacologic inhibition of autophagy using chloroquine (CQ). Incidence of PDC characterized by prostate weight of more than 1 g was significantly lower in the SFN + CQ group than in control (P = 0.004), CQ group (P = 0.026), or SFN group (P = 0.002 by Fisher exact test). Average size of the metastatic lymph node was lower by about 42% in the SFN + CQ group than in control (P = 0.043 by Wilcoxon test). On the other hand, the SFN + CQ combination was not superior to SFN alone with respect to inhibition of incidence or burden of microscopic PIN or WDC. SFN treatment caused in vivo autophagy as evidenced by transmission electron microscopy. Mechanistic studies showed that prevention of prostate cancer and metastasis by the SFN + CQ combination was associated with decreased cell proliferation, increased apoptosis, alterations in protein levels of autophagy regulators Atg5 and phospho-mTOR, and suppression of biochemical features of epithelial-mesenchymal transition. Plasma proteomics identified a protein expression signature that may serve as a biomarker of SFN + CQ exposure/response. This study offers a novel combination regimen for future clinical investigations for prevention of prostate cancer in humans.
Project description:Caveolin-1 (CAV1) is over-expressed in prostate cancer (PCa) and is associated with adverse prognosis, but the molecular mechanisms linking CAV1 expression to disease progression are poorly understood. Extensive gene expression correlation analysis, quantitative multiplex imaging of clinical samples, and analysis of the CAV1-dependent transcriptome, supported that CAV1 re-programmes TGFβ signalling from tumour suppressive to oncogenic (i.e. induction of SLUG, PAI-1 and suppression of CDH1, DSP, CDKN1A). Supporting such a role, CAV1 knockdown led to growth arrest and inhibition of cell invasion in prostate cancer cell lines. Rationalized RNAi screening and high-content microscopy in search for CAV1 upstream regulators revealed integrin beta1 (ITGB1) and integrin associated proteins as CAV1 regulators. Our work suggests TGFβ signalling and beta1 integrins as potential therapeutic targets in PCa over-expressing CAV1, and contributes to a better understanding of the paradoxical dual role of TGFβ in tumour biology.
Project description:The dynamic and never exactly repeatable tumor transcriptomic profile of people affected by the same form of cancer requires a personalized and time-sensitive approach to gene therapy. The Gene Master Regulators (GMRs) were defined as genes whose highly controlled expression by the homeostatic mechanisms commands the cell phenotype by modulating major functional pathways through expression correlation with their genes. The Gene Commanding Height (GCH), a measure that combines the expression control and expression correlation with all other genes, is used to establish the gene hierarchy in each cell phenotype. We developed the experimental protocol, the mathematical algorithm and the computer software to identify the GMRs from transcriptomic data in surgically removed tumors, biopsies or blood from cancer patients. The GMR approach is illustrated with applications to our microarray data on human kidney, thyroid and prostate cancer samples, and on thyroid, prostate and blood cancer cell lines. We proved experimentally that each patient has his/her own GMRs, that cancer nuclei and surrounding normal tissue are governed by different GMRs, and that manipulating the expression has larger consequences for genes with higher GCH. Therefore, we propose the hypothesis that silencing the GMR may selectively kill the cancer cells in a tissue.
Project description:Recent studies indicate that microRNAs (miRNAs) are mechanistically involved in the development of various human malignancies, suggesting that they represent a promising new class of cancer biomarkers. However, previously reported methods for measuring miRNA expression consume large amounts of tissue, prohibiting high-throughput miRNA profiling from typically small clinical samples such as excision or core needle biopsies of breast or prostate cancer. Here we describe a novel combination of linear amplification and labeling of miRNA for highly sensitive expression microarray profiling requiring only picogram quantities of purified microRNA. Comparison of microarray and qRT-PCR measured miRNA levels from two different prostate cancer cell lines showed concordance between the two platforms (Pearson correlation R² = 0.81); and extension of the amplification, labeling and microarray platform was successfully demonstrated using clinical core and excision biopsy samples from breast and prostate cancer patients. Unsupervised clustering analysis of the prostate biopsy microarrays separated advanced and metastatic prostate cancers from pooled normal prostatic samples and from a non-malignant precursor lesion. Unsupervised clustering of the breast cancer microarrays significantly distinguished ErbB2-positive/ER-negative, ErbB2-positive/ER-positive, and ErbB2-negative/ER-positive breast cancer phenotypes (Fisher exact test, p = 0.03); in addition, supervised analysis of these microarray profiles identified distinct miRNA subsets distinguishing ErbB2-positive from ErbB2-negative and ER-positive from ER-negative breast cancers, independent of other clinically important parameters (patient age; tumor size, node status and proliferation index). In sum, these findings demonstrate that optimized high-throughput microRNA expression profiling offers novel biomarker identification from typically small clinical samples such as breast and prostate cancer biopsies.
Project description:Rifampin (RIF) upregulates CYP 450 isoenzymes, potentially lowering efavirenz (EFV) exposure. The US EFV package insert recommends an EFV dose increase for patients on RIF weighing ≥50 kg. We conducted a pharmacokinetic study to evaluate EFV trough concentrations (Cmin) and human immunodeficiency virus (HIV) virologic suppression in patients on EFV (600 mg) and RIF-based tuberculosis treatment in the multicenter randomized trial (ACTG A5221). EFV Cmin was measured 20-28 hours post-EFV dose at weeks 4, 8, 16, 24 on-RIF and weeks 4, 8 off-RIF. Results were evaluated with 2-sided Wilcoxon rank-sum, χ², and Fisher exact tests and logistic regression (5% type I error rate). Seven hundred eighty patients received EFV; 543 provided ≥1 EFV Cmin. Median weight was 52.8 kg (interquartile range [IQR], 48.0-59.5), body mass index 19.4 kg/m² (IQR, 17.5-21.6), and age 34 years (IQR, 29-41); 63% were male, 74% black. Median Cmin was 1.96 µg/mL on-RIF versus 1.80 off-RIF (P = .067). Cmin were significantly higher on-RIF versus off-RIF in blacks (2.08 vs 1.75, P = .005). Weight ≥60 kg on-RIF, compared to <60 kg, was associated with lower EFV Cmin (1.68 vs 2.02, P = .021). However, weight ≥60 kg was associated with more frequent HIV RNA < 400 copies/mL at week 48, compared to weight <60 kg (81.9% vs 73.8%, P = .023). EFV and RIF-based tuberculosis therapy coadministration was associated with a trend toward higher, not lower, EFV Cmin compared to EFV alone. Patients weighing ≥60 kg had lower median EFV Cmin versus those <60 kg, but there was no association of higher weight with reduced virologic suppression. These data do not support weight-based dosing of EFV with RIF.