Systems biology approach to studying proliferation-dependent prognostic subnetworks in breast cancer.
ABSTRACT: Tumor proliferative capacity is a major biological correlate of breast tumor metastatic potential. In this paper, we developed a systems approach to investigate associations among gene expression patterns, representative protein-protein interactions, and the potential for clinical metastases, to uncover novel survival-related subnetwork signatures as a function of tumor proliferative potential. Based on the statistical associations between gene expression patterns and patient outcomes, we identified three groups of survival prognostic subnetwork signatures (SPNs) corresponding to three proliferation levels. We discovered 8 SPNs in the high proliferation group, 8 SPNs in the intermediate proliferation group, and 6 SPNs in the low proliferation group. We observed little overlap of SPNs between the three proliferation groups. The enrichment analysis revealed that most SPNs were enriched in distinct signaling pathways and biological processes. The SPNs were validated on other cohorts of patients, and delivered high accuracy in the classification of metastatic vs non-metastatic breast tumors. Our findings indicate that certain biological networks underlying breast cancer metastasis differ in a proliferation-dependent manner. These networks, in combination, may form the basis of highly accurate prognostic classification models and may have clinical utility in guiding therapeutic options for patients.
Project description:We have identified nine highly connected and differentially expressed gene subnetworks between aggressive primary tumors and metastatic lesions in endometrial carcinomas. We implemented a novel pipeline combining gene set and network approaches, which here allows integration of protein-protein interactions and gene expression data. The resulting subnetworks are significantly associated with disease progression across tumor stages, from complex atypical hyperplasia through primary tumors to metastatic lesions. The nine subnetworks include genes related to metastasizing features such as epithelial-mesenchymal transition (EMT), hypoxia and cell proliferation. TCF4 and TWIST2 were found as central genes in the subnetwork related to EMT. Two of the identified subnetworks display a statistically significant association with patient survival, which was further supported by independent validation in data from The Cancer Genome Atlas. The first subnetwork contains genes related to cell proliferation and cell cycle, while the second contains genes involved in hypoxia such as HIF1A and EGLN3. Our findings provide a promising context to elucidate the biological mechanisms of metastasis, suggest potential prognostic markers and further identify therapeutic targets. The pipeline R source code is freely available, including permutation tests to assess statistical significance of the identified subnetworks.
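The permutation testing mentioned above can be illustrated with a minimal Python sketch. This is not the authors' R pipeline: it assumes, hypothetically, one activity score per sample for a given subnetwork and uses the absolute difference in group means as the test statistic. The function names and the toy scores are illustrative only.

```python
from itertools import combinations


def mean_difference(scores, group_idx):
    """Absolute difference between the mean score inside and outside group_idx."""
    inside = set(group_idx)
    a = [s for i, s in enumerate(scores) if i in inside]
    b = [s for i, s in enumerate(scores) if i not in inside]
    return abs(sum(a) / len(a) - sum(b) / len(b))


def exact_permutation_p(scores, group_idx):
    """Exact permutation p-value: the fraction of all equally sized
    relabelings whose statistic is at least as large as the observed one."""
    observed = mean_difference(scores, group_idx)
    k = len(group_idx)
    hits = total = 0
    for relabel in combinations(range(len(scores)), k):
        total += 1
        # Small tolerance guards against floating-point ties.
        if mean_difference(scores, relabel) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical subnetwork activity scores: 4 metastatic, then 4 primary samples.
scores = [2.1, 1.9, 2.4, 2.0, 0.3, -0.1, 0.2, 0.0]
p_value = exact_permutation_p(scores, group_idx=(0, 1, 2, 3))
print(round(p_value, 4))
```

Exhaustive enumeration is only practical for small cohorts; with more samples, a random subset of relabelings is the usual substitute.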
Project description:We used a systems biology approach to identify and score protein interaction subnetworks whose activity patterns are discriminative of late stage human colorectal cancer (CRC) versus control in colonic tissue. We conducted two gel-based proteomics experiments to identify significantly changing proteins between normal and late stage tumor tissues obtained from an adequately sized cohort of human patients. A total of 67 proteins identified by these experiments was used to seed a search for protein-protein interaction subnetworks. A scoring scheme based on mutual information, calculated using gene expression data as a proxy for subnetwork activity, was developed to score the targets in the subnetworks. Based on this scoring, the subnetwork was pruned to identify the specific protein combinations that were significantly discriminative of late stage cancer versus control. These combinations could not be discovered using only proteomics data or by merely clustering the gene expression data. We then analyzed the resultant pruned subnetwork for biological relevance to human CRC. A number of the proteins in these smaller subnetworks have been associated with the progression (CSNK2A2, PLK1, and IGFBP3) or metastatic potential (PDGFRB) of CRC. Others have been recently identified as potential markers of CRC (IFITM1), and the role of others is largely unknown in this disease (CCT3, CCT5, CCT7, and GNA12). The functional interactions represented by these signatures provide new experimental hypotheses that merit follow-on validation for biological significance in this disease. Overall the method outlines a quantitative approach for integrating proteomics data, gene expression data, and the wealth of accumulated legacy experimental data to discover significant protein subnetworks specific to disease.
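The mutual-information scoring described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scoring scheme: expression values are assumed to be z-scored per gene, subnetwork activity is taken as the summed z-scores divided by the square root of the gene count, and activity is binarized at the median before computing mutual information with the class label. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from statistics import median


def subnetwork_activity(expr, genes):
    """Per-sample activity: summed z-scores of member genes over sqrt(#genes)."""
    n_samples = len(expr[genes[0]])
    scale = math.sqrt(len(genes))
    return [sum(expr[g][i] for g in genes) / scale for i in range(n_samples)]


def binarize(values):
    """Split activity at the median (1 = above median, 0 = otherwise)."""
    cut = median(values)
    return [1 if v > cut else 0 for v in values]


def mutual_information(xs, ys):
    """Mutual information in bits between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


# Toy z-scored expression for three hypothetical genes across four samples.
expr = {
    "A": [-1.0, -0.8, 0.9, 1.1],
    "B": [-0.5, -1.2, 1.0, 0.7],
    "C": [0.3, -0.2, 0.1, -0.1],
}
labels = [0, 0, 1, 1]  # 0 = control, 1 = late stage tumor

score_ab = mutual_information(binarize(subnetwork_activity(expr, ["A", "B"])), labels)
score_c = mutual_information(binarize(subnetwork_activity(expr, ["C"])), labels)
print(score_ab, score_c)
```

In this toy setting the two-gene subnetwork separates the classes perfectly while the single uninformative gene does not, which is the kind of contrast a pruning step could exploit.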
Project description:Breast cancer has a long natural history. Established and emerging biologic markers address overall risk but not necessarily timing of recurrence. 346 adjuvant naïve breast cancer cases from Guy's Hospital with 23 years minimum follow-up and archival blocks were recut and reassessed for hormone-receptors (HR), HER2-receptor and grade. Disease-specific survival (DSS) was analyzed by recursive partitioning. To validate insights from this analysis, gene-signatures (proliferative and HR-negative) were evaluated for their ability to predict early versus late metastatic risk in 683 node-negative, adjuvant naïve breast cancers annotated with expression microarray data. Risk partitioning showed that adjuvant naïve node-negative outcome risk was primarily partitioned by tumor receptor status and grade but not tumor size. HR-positive and HER2-negative (HRpos) risk was partitioned by tumor grade; low grade cases have very low early risk but a 20% fall-off in DSS 10 or more years after diagnosis. Higher grade HRpos cases have risk extending over more than 20 years. For triple-negative (Tneg) and HER2-positive (HER2pos) cases, DSS events occurred primarily within the first 5 years. Among node-positive cases, only low grade conferred late risk, suggesting that gene signatures that identify proliferation would be important for predicting early but not late recurrence. Using pooled data from four publicly available data sets for node-negative tumors annotated with gene expression and outcome data, we evaluated four prognostic gene signatures: two proliferation-based and two immune function-based. Tumor proliferative capacity predicted early but not late metastatic risk for HRpos cases. The immune function or HRneg specific signatures predicted only early metastatic risk in Tneg and HER2pos cases. Breast cancer prognostic signatures need to inform both risk and timing of metastatic events and may best be applied within subsets. 
Current signatures predict for outcome risk within 5 years of diagnosis. Predictors of late risk for HR positive disease are needed.
Project description:Biomarkers lie at the heart of precision medicine. Surprisingly, while rapid genomic profiling is becoming ubiquitous, the development of biomarkers usually involves the application of bespoke techniques that cannot be directly applied to other datasets. There is an urgent need for a systematic methodology to create biologically-interpretable molecular models that robustly predict key phenotypes. Here we present SIMMS (Subnetwork Integration for Multi-Modal Signatures): an algorithm that fragments pathways into functional modules and uses these to predict phenotypes. We apply SIMMS to multiple data types across five diseases, and in each it reproducibly identifies known and novel subtypes, and makes superior predictions to the best bespoke approaches. To demonstrate its ability on a new dataset, we profile 33 genes/nodes of the PI3K pathway in 1734 FFPE breast tumors and create a four-subnetwork prediction model. This model out-performs a clinically-validated molecular test in an independent cohort of 1742 patients. SIMMS is generic and enables systematic data integration for robust biomarker discovery.
Project description:It is unclear if earlier onset (<40 years) and greater proliferative capacity confer an equally poor prognosis to endocrine-dependent and endocrine-independent breast cancers. Available outcome (distant metastasis-free survival, DMFS) and expression microarray data from 621 adjuvant treatment-naïve, node-negative primary breast cancers were pooled for prognostic evaluation of age-at-diagnosis (< 40 years vs. ≥ 40 years) and tumor proliferative capacity relative to estrogen receptor status (n = 400 ER-positive, n = 221 ER-negative). Transcriptome measures of proliferative capacity included a proliferation score (PS) based on a 61-gene proliferation signature and the single gene surrogate, FOXM1. Kaplan-Meier analyses revealed no significant difference in DMFS between ER-positive and ER-negative cases >5 years after diagnosis. In contrast, younger age and higher proliferative capacity resulted in significantly more metastatic events cumulated over 15 years, but only in ER-positive breast cancers where positive correlations between age and proliferation were observed. While strongly correlated, FOXM1 and PS did not appear equivalent in relation to age and prognosis. The poor prognosis associated with breast cancer arising before age 40 or with higher proliferative capacity pertains only to endocrine-dependent (ER-positive) breast cancer, indicating that different biological processes drive the metastatic potential of ER-negative breast cancer.
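For readers unfamiliar with the Kaplan-Meier analysis mentioned above, a minimal product-limit estimator might look like the following sketch. The follow-up times and event flags are made-up illustrative values, not data from the study, and the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time per patient (e.g. years to metastasis or censoring)
    events -- 1 if the event (distant metastasis) was observed, 0 if censored
    Returns a list of (time, survival probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= n_leaving
    return curve


# Made-up follow-up data for five patients (years, event flag).
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Comparing two such curves (for example ER-positive versus ER-negative) would additionally require a test such as the log-rank test, which is not shown here.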
Project description:Recent studies showed that somatic cancer mutations target genes that are in specific signaling and cellular pathways. However, in each patient only a few of the pathway genes are mutated. Current approaches consider only existing pathways and ignore the topology of the pathways. For this reason, new efforts have been focused on identifying significantly mutated subnetworks and associating them with cancer characteristics. We applied two well-established network analysis approaches to identify significantly mutated subnetworks in the breast cancer genome. We took network topology into account for measuring the mutation similarity of a gene-pair to allow us to infer the significantly mutated subnetworks. Our goals are to evaluate whether the identified subnetworks can be used as biomarkers for predicting breast cancer patient survival and provide the potential mechanisms of the pathways enriched in the subnetworks, with the aim of improving breast cancer treatment. Using the copy number alteration (CNA) datasets from the METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) study, we identified a significantly mutated yet clinically and functionally relevant subnetwork using two graph-based clustering algorithms. The mutational pattern of the subnetwork is significantly associated with breast cancer survival. The genes in the subnetwork are significantly enriched in retinol metabolism KEGG pathway. Our results show that breast cancer treatment with retinoids may be a potential personalized therapy for breast cancer patients since the CNA patterns of the breast cancer patients can imply whether the retinoids pathway is altered. We also showed that applying multiple bioinformatics algorithms at the same time has the potential to identify new network-based biomarkers, which may be useful for stratifying cancer patients for choosing optimal treatments.
Project description:BACKGROUND:The main cause of death of breast cancer patients is not the primary tumor itself but the metastatic disease. Identifying breast cancer-specific signatures for metastasis and learning more about the nature of the genes involved in the metastatic process would 1) improve our understanding of the mechanisms of cancer progression and 2) reveal new therapeutic targets. Previous studies showed that the transcriptional regulator megakaryoblastic leukemia-1 (Mkl1) induces tenascin-C expression in normal and transformed mammary epithelial cells. Tenascin-C is known to be expressed in metastatic niches, is highly induced in cancer stroma and promotes breast cancer metastasis to the lung. METHODS:Using HC11 mammary epithelial cells overexpressing different Mkl1 constructs, we devised a subtractive transcript profiling screen to identify the mechanism by which Mkl1 induces a gene set co-regulated with tenascin-C. We performed computational analysis of the Mkl1 target genes and used cell biological experiments to confirm the effect of these gene products on cell behavior. To analyze whether this gene set is prognostic of accelerated cancer progression in human patients, we used the bioinformatics tool GOBO that allowed us to investigate a large breast tumor data set linked to patient data. RESULTS:We discovered a breast cancer-specific set of genes including tenascin-C, which is regulated by Mkl1 in a SAP domain-dependent, serum response factor-independent manner and is strongly implicated in cell proliferation, cell motility and cancer. Downregulation of this set of transcripts by overexpression of Mkl1 lacking the SAP domain inhibited cell growth and cell migration. Many of these genes are direct Mkl1 targets since their promoter-reporter constructs were induced by Mkl1 in a SAP domain-dependent manner. The transcripts most strongly reduced in the absence of the SAP domain were mechanoresponsive. 
Finally, expression of this gene set is associated with high-proliferative poor-outcome classes in human breast cancer and a strongly reduced survival rate for patients independent of tumor grade. CONCLUSIONS:This study highlights a crucial role for the transcriptional regulator Mkl1 and its SAP domain during breast cancer progression. We identified a novel gene set that correlates with bad prognosis and thus may help in deciding the rigor of therapy.
Project description:BACKGROUND: The study of biological networks is an essential first step to understand the complex functions they govern in different organisms. The topology of interactions that define how biological networks operate is often determined through high-throughput experiments. The noisy nature of high-throughput experiments, however, can result in multiple alternative network topologies that explain the data equally well. One key step to resolve the differences is to identify the subnetworks which appear significantly more frequently in a biological network data set than expected. METHOD: We present a method named SiS (Significant Subnetworks) to find subnetworks with the largest probability to appear in a collection of biological networks. We define these subnetworks as the most probable subnetworks. SiS summarizes the interactions in the given collection of networks in a special template network. It uses the template network to guide the search for most probable subnetworks. It computes the lower and upper bound scores on how good the potential solutions are (i.e., the number of input networks that contain the subnetwork). As the search continues, it tightens the bounds dynamically and prunes a massive number of unpromising solutions in that process. RESULTS AND CONCLUSIONS: Experiments on comprehensive data sets show that the most probable subnetworks found by SiS in a large collection of networks are also very frequent. In a metabolic network data set, we found that subnetworks in eukaryotes are more conserved than those in prokaryotes. SiS also scales well to large data sets and subnetworks and runs orders of magnitude faster than an existing method, MULE. Depending on the size of the subnetwork in the same data set, the running time of SiS ranges from a few seconds to minutes; MULE, on the other hand, runs either for hours or does not even finish in days. 
In a human transcription regulatory network data set, SiS finds a large backbone subnetwork that appears frequently regardless of diverse cell types.
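The idea of bounding a subnetwork's frequency and pruning unpromising candidates, as described in the SiS abstract above, can be illustrated with a deliberately simplified Python sketch. It is not the SiS algorithm: it assumes networks are plain sets of undirected edges, uses per-edge counts as a stand-in for the template network, searches only two-edge connected subnetworks, and uses only an upper bound. All names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def edge(u, v):
    """Undirected edge between two node labels."""
    return frozenset((u, v))


def template_counts(networks):
    """Summarize a collection: in how many networks does each edge occur?"""
    counts = Counter()
    for net in networks:
        counts.update(net)
    return counts


def support(subnetwork, networks):
    """Number of input networks that contain every edge of the subnetwork."""
    return sum(1 for net in networks if subnetwork <= net)


def upper_bound(subnetwork, counts):
    """A subnetwork cannot be more frequent than its rarest edge."""
    return min(counts[e] for e in subnetwork)


def best_two_edge_subnetwork(networks):
    """Most frequent connected two-edge subnetwork, skipping candidates
    whose upper bound cannot beat the best support found so far."""
    counts = template_counts(networks)
    best, best_support = None, 0
    for e1, e2 in combinations(sorted(counts, key=sorted), 2):
        if not (e1 & e2):
            continue  # the two edges must share a node to be connected
        candidate = frozenset((e1, e2))
        if upper_bound(candidate, counts) <= best_support:
            continue  # pruned without scanning the networks
        s = support(candidate, networks)
        if s > best_support:
            best, best_support = candidate, s
    return best, best_support


# Three toy networks over nodes A-D.
networks = [
    {edge("A", "B"), edge("B", "C"), edge("C", "D")},
    {edge("A", "B"), edge("B", "C")},
    {edge("A", "B"), edge("C", "D")},
]
best, best_support = best_two_edge_subnetwork(networks)
print(sorted(sorted(e) for e in best), best_support)
```

The bound is cheap because it needs only the summary counts, whereas computing the true support requires a pass over every input network.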
Project description:Aminobisphosphonates, such as zoledronic acid (ZA), have shown potential in the treatment of different malignancies, including colorectal carcinoma (CRC). Yet, their clinical exploitation is limited by their high bone affinity and modest bioavailability. Here, ZA is encapsulated into the aqueous core of spherical polymeric nanoparticles (SPNs), whose size and architecture resemble that of biological vesicles. On Vδ2 T cells, derived from the peripheral blood of healthy donors and CRC patients, ZA-SPNs induce proliferation and trigger activation up to three orders of magnitude more efficiently than soluble ZA. These activated Vδ2 T cells kill CRC cells and tumor spheroids, and are able to migrate toward CRC cells in a microfluidic system. Notably, ZA-SPNs can also stimulate the proliferation of Vδ2 T cells from the tumor-infiltrating lymphocytes of CRC patients and boost their cytotoxic activity against patients' autologous tumor organoids. These data represent a first step toward the use of nanoformulated ZA for immunotherapy in CRC patients.
Project description:Breast cancer mortality predominantly results from dormant micrometastases that emerge as fatal outgrowths years after initial diagnosis. In order to gain insights concerning factors associated with emergence of liver metastases, we recreated spontaneous dormancy in an all-human ex vivo hepatic microphysiological system (MPS). Seeding this MPS with small numbers (<0.05% by cell count) of the aggressive MDA-MB-231 breast cancer cell line, two populations formed: actively proliferating ("growing"; EdU+), and spontaneously quiescent ("dormant"; EdU-). Following treatment with a clinically standard chemotherapeutic, the proliferating cells were eliminated and only quiescent cells remained; this residual dormant population could then be induced to a proliferative state ("emergent"; EdU+) by physiologically-relevant inflammatory stimuli, lipopolysaccharide (LPS) and epidermal growth factor (EGF). Multiplexed proteomic analysis of the MPS effluent enabled elucidation of key factors and processes that correlated with the various tumor cell states, and candidate biomarkers for actively proliferating (either primary or secondary emergence) versus dormant metastatic cells in liver tissue. Dormancy was found to be associated with signaling reflective of cellular quiescence even more strongly than the original tumor-free liver tissue, whereas proliferative nodules presented inflammatory signatures. Given the minimal tumor burden, these markers likely represent changes in the tumor microenvironment rather than in the tumor cells. A computational decision tree algorithm applied to these signatures indicated the potential of this MPS for clinical discernment of each metastatic stage from blood protein analysis.