Project description:An ever‑increasing number of long noncoding (lnc)RNAs has been identified in breast cancer. The present study aimed to establish an lncRNA signature for predicting survival in breast cancer. RNA expression profiling was performed using microarray gene expression data from the National Center for Biotechnology Information Gene Expression Omnibus, followed by the identification of breast cancer‑related preserved modules using weighted gene co‑expression network analysis (WGCNA). From the lncRNAs identified in these preserved modules, prognostic lncRNAs were selected using univariate Cox regression analysis in combination with the L1‑penalized (LASSO) Cox proportional hazards (Cox‑PH) model. A risk score based on these prognostic lncRNAs was calculated and used for risk stratification. Differentially expressed RNAs (DERs) in breast cancer were identified using MetaDE. Gene Set Enrichment Analysis (GSEA) pathway enrichment was conducted for these prognostic lncRNAs and for the DERs related to the lncRNAs in the preserved modules. A total of five preserved modules comprising 73 lncRNAs were identified. An eight‑lncRNA signature (IGHA1, IGHGP, IGKV2‑28, IGLL3P, IGLV3‑10, AZGP1P1, LINC00472 and SLC16A6P1) was then derived using the LASSO Cox‑PH model. A risk score based on these eight lncRNAs classified breast cancer patients into two groups with significantly different survival times, and the signature was validated in three independent cohorts. These prognostic lncRNAs were significantly associated with the cell adhesion molecules pathway, the JAK‑signal transducer and activator of transcription 5A pathway, and the ErbB pathway, and are potentially involved in regulating angiotensin II receptor type 1, neuropeptide Y receptor Y1, KISS1 receptor, and C‑C motif chemokine ligand 5. The developed eight‑lncRNA signature may have clinical implications for predicting prognosis in breast cancer.
Overall, this study provided possible molecular targets for the development of novel therapies against breast cancer.
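The risk-score construction described above reduces to a weighted sum of signature-lncRNA expression values, with high/low risk assigned by a cutoff such as the median. A minimal sketch, using hypothetical coefficients and expression values (the study's fitted Cox coefficients are not reproduced here):

```python
# Illustrative risk-score stratification. Coefficients and expression values
# are hypothetical, not the study's fitted LASSO Cox-PH estimates.

def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature."""
    return sum(coefficients[g] * expression[g] for g in coefficients)

def stratify(scores):
    """Split patients into high/low risk groups at the median score."""
    ordered = sorted(scores.values())
    median = ordered[len(ordered) // 2]
    return {pid: ("high" if s >= median else "low") for pid, s in scores.items()}
```

In practice the cutoff is usually the median (or an optimized threshold) of the training-set risk scores, then applied unchanged to validation cohorts.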
Project description:Clustered current status data are frequently encountered in biomedical research and other areas that require survival analysis. This paper proposes graphical and formal model assessment procedures to evaluate the goodness of fit of the additive hazards model to clustered current status data. The proposed test statistics are based on sums of martingale-based residuals. Relevant asymptotic properties are established, and empirical distributions of the test statistics can be simulated using Gaussian multipliers. Extensive simulation studies confirmed that the proposed test procedures work well in practical scenarios. The proposed method applies when failure times within the same cluster are correlated and, in particular, when cluster sizes can be informative about intra-cluster correlations. The method is applied to analyze clustered current status data from a lung tumorigenicity study.
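The Gaussian-multiplier idea mentioned above can be illustrated generically: perturb each residual with an independent standard normal multiplier and recompute a sup-type statistic to build an empirical null distribution. A minimal sketch, using toy residual values rather than martingale-based residuals from a fitted additive hazards model:

```python
import random

# Generic Gaussian-multiplier resampling sketch: each residual is multiplied
# by an independent N(0, 1) draw and a sup-type statistic is recomputed.
# The residuals passed in here are toy numbers, not model-based residuals.

def sup_cumsum(values):
    """Supremum of |partial sums|, a common sup-type test statistic."""
    s, best = 0.0, 0.0
    for v in values:
        s += v
        best = max(best, abs(s))
    return best

def multiplier_null(residuals, n_sim=500, seed=1):
    """Simulate the statistic's null distribution by multiplier resampling."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sim):
        perturbed = [r * rng.gauss(0.0, 1.0) for r in residuals]
        sims.append(sup_cumsum(perturbed))
    return sims
```

A p-value is then the fraction of simulated statistics exceeding the observed one.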
Project description:Background: The additive hazards model can be easier to interpret and in some cases fits better than the proportional hazards model. However, sample size formulas for clinical trials with time-to-event outcomes are currently based on either the proportional hazards assumption or an assumption of constant hazards. Aims: The goal is to provide sample size formulas for superiority and non-inferiority trials assuming an additive hazards model but no specific distribution, along with evaluations of the performance of the formulas. Methods: Formulas are presented that determine the required sample size for a given scenario under the additive hazards model. Simulations are conducted to ensure that the formulas attain the desired power. For illustration, the non-inferiority sample size formula is applied to the calculations in the SPORTIF III trial of stroke prevention in atrial fibrillation. Conclusion: Simulation results show that the sample size calculations lead to the correct power. Sample size is easily calculated using a tool that is available on the web at http://leemcdaniel.github.io/samplesize.html.
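The paper's additive-hazards formulas are not reproduced here, but sample size formulas of this kind share a common normal-approximation template: n = ((z_alpha + z_beta) * sigma / delta)^2, where delta is the targeted effect size (here a hazard difference) and sigma its standardized variability. A sketch of that generic template only, with illustrative critical values; the actual formulas are available in the paper and the linked web tool:

```python
import math

# Generic normal-approximation sample size template (an illustration, not
# the paper's additive-hazards formulas). delta is the effect size (e.g., a
# hazard difference) and sigma its standardized variability.

def sample_size(delta, sigma, z_alpha=1.96, z_beta=0.84):
    """Per-arm sample size; defaults give two-sided 5% alpha and 80% power."""
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)
```

Halving the detectable effect size quadruples the required sample size, as the template makes explicit.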
Project description:Background: The successful identification of breast cancer (BRCA) prognostic biomarkers is essential for the strategic intervention of BRCA patients. Recently, various methods have been proposed for exploring a small prognostic gene set that can distinguish the high-risk group from the low-risk group. Methods: Regularized Cox proportional hazards (RCPH) models were proposed to discover prognostic biomarkers of BRCA from gene expression data. First, the maximum connected network of 1142 genes was constructed by mapping 956 differentially expressed genes (DEGs) and 677 previously reported BRCA-related genes onto the gene regulatory network (GRN). Then, the 72 union genes of the four feature gene sets identified by the Lasso-RCPH, Enet-RCPH, [Formula: see text]-RCPH and SCAD-RCPH models were recognized as robust prognostic biomarkers. These biomarkers were validated by literature checks, the BRCA-specific GRN and functional enrichment analysis. Finally, an index of prognostic risk score (PRS) for BRCA was established based on univariate and multivariate Cox regression analysis. Survival analysis was performed to investigate the PRS in 1080 BRCA patients from the internal validation. In particular, a nomogram was constructed to express the relationship between the PRS and other clinical information in the discovery dataset. The PRS was also verified in 1848 BRCA patients across ten external validation datasets or collected cohorts. Results: The nomogram highlighted the importance of the PRS in guiding the prognosis of BRCA patients. In addition, the PRS of 301 normal samples and 306 tumor samples from five independent datasets showed that it is significantly higher in tumors than in normal tissues ([Formula: see text]). The protein expression profiles of the three genes involved in the PRS model, i.e., ADRB1, SAV1 and TSPAN14, demonstrated that the latter two are more strongly stained in tumor specimens.
More importantly, the high-risk group had worse survival than the low-risk group ([Formula: see text]) in both the internal and external validations. Conclusions: The proposed pipeline for detecting and validating prognostic biomarker genes for BRCA is effective and efficient. Moreover, the proposed PRS is promising as an important indicator for judging the prognosis of BRCA patients.
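The survival comparison underlying "the high-risk group has worse survival" is typically a log-rank test: observed versus expected events in one group accumulated over event times, with a hypergeometric variance. A minimal sketch assuming a simple (time, event, group) record layout; ties-handling subtleties are omitted:

```python
# Minimal log-rank statistic sketch for comparing two risk groups.
# Input format (time, event_indicator, group) is an assumed layout.

def logrank_statistic(data):
    """data: list of (time, event_indicator, group) with group in {0, 1}."""
    times = sorted({t for t, e, _ in data if e == 1})
    o_minus_e, var = 0.0, 0.0
    for t in times:
        at_risk = [(e, g) for (tt, e, g) in data if tt >= t]
        n = len(at_risk)
        n1 = sum(1 for _, g in at_risk if g == 1)
        d = sum(e for (tt, e, _) in data if tt == t and e == 1)   # deaths at t
        o1 = sum(e for (tt, e, g) in data if tt == t and e == 1 and g == 1)
        o_minus_e += o1 - d * n1 / n                              # observed - expected
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e ** 2 / var if var > 0 else 0.0
```

The statistic is compared against a chi-squared distribution with one degree of freedom (critical value 3.84 at the 5% level).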
Project description:Introduction: Machine learning algorithms such as elastic net regression and backward selection provide a unique and powerful approach to model building given a set of psychosocial predictors of smoking lapse measured repeatedly via ecological momentary assessment (EMA). Understanding these predictors may aid in developing interventions for smoking lapse prevention. Methods: In a randomized controlled smoking cessation trial, smartphone-based EMAs were collected from 92 participants following a scheduled quit date. This secondary analysis utilized elastic net-penalized Cox proportional hazards regression and model approximation via backward elimination to (1) optimize a predictive model of time to first lapse and (2) simplify that model to its core constituent predictors to maximize parsimony and generalizability. Results: Elastic net proportional hazards regression selected 17 of 26 possible predictors from 2065 EMAs to model time to first lapse. The predictors with the highest-magnitude regression coefficients were having consumed alcohol in the past hour, being around and interacting with a smoker, and having cigarettes easily available. This model was reduced using backward elimination, retaining five predictors and approximating 93.9% of the full model's fit. The retained predictors included those mentioned above as well as feeling irritable and being in areas where smoking is either discouraged or allowed (as opposed to not permitted). Conclusions: The strongest predictors of smoking lapse were environmental in nature (e.g., being in smoking-permitted areas) as opposed to internal factors such as psychological affect. Interventions may be improved by a renewed focus on these predictors. Implications: The present study demonstrated the utility of machine learning algorithms to optimize the prediction of time to smoking lapse using EMA data.
The two models generated by the present analysis found that environmental factors were most strongly related to smoking lapse. The results support the use of machine learning algorithms to investigate intensive longitudinal data, and provide a foundation for the development of highly tailored, just-in-time interventions that can target multiple antecedents of smoking lapse.
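The "model approximation via backward elimination" step described above can be sketched on a linear surrogate: starting from all predictors, repeatedly drop the one whose removal costs the least fit, stopping before the reduced model falls below a chosen fraction of the full model's fit (the study reports retaining five predictors at 93.9% of fit; the threshold and data here are toy, and a linear R-squared stands in for the Cox model's fit measure):

```python
import numpy as np

# Greedy backward elimination on a linear surrogate model. The keep_fraction
# threshold and the linear fit measure are illustrative assumptions.

def r2(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def backward_eliminate(X, y, keep_fraction=0.9):
    full = r2(X, y)
    cols = list(range(X.shape[1]))
    while len(cols) > 1:
        # score each candidate removal by the fit of the reduced model
        scores = {c: r2(X[:, [k for k in cols if k != c]], y) for c in cols}
        best = max(scores, key=scores.get)
        if scores[best] < keep_fraction * full:
            break                      # dropping anything else costs too much fit
        cols.remove(best)
    return cols
```

Predictors whose removal barely changes the fit are pruned first, leaving the core constituent predictors.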
Project description:Background: In survival analysis, data can be modeled using either a multiplicative hazards regression model (such as the Cox model) or an additive hazards regression model (such as Lin's or Aalen's model). While several diagnostic tools are available to check the assumptions underpinning each type of model, there is no defined procedure for fitting these models optimally. Moreover, the two types of model are rarely combined in survival analysis. Here, we propose a strategy for the optimal fitting of multiplicative and additive hazards regression models in survival analysis. Methods: We detail our proposed strategy for the optimal fitting of multiplicative and additive hazards regression models, with a focus on the assumptions underpinning each type of model, the diagnostic tools used to check these assumptions, and the steps followed to fit the data. The proposed strategy draws on classical diagnostic tools (Schoenfeld and martingale residuals) and less common tools (pseudo-observations, martingale residual processes, and Arjas plots). Results: The proposed strategy is applied to a dataset of patients with myocardial infarction (the TRACE data frame). The effects of five covariates (age, sex, diabetes, ventricular fibrillation, and clinical heart failure) on the hazard of death are analyzed using multiplicative and additive hazards regression models. The proposed strategy is shown to fit the data optimally. Conclusions: Survival analysis is improved by using multiplicative and additive hazards regression models together, but specific steps must be followed to fit the data optimally. By providing different measures of the same effect, our proposed strategy allows for better interpretation of the data.
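The additive side of this pairing can be made concrete with Aalen's nonparametric estimator: at each event time, the increment of the cumulative regression function B(t) is the least-squares regression of the event indicators on the covariates of the at-risk set, and the increments are summed over time. A minimal sketch assuming numpy arrays and a design matrix whose first column of ones carries the baseline hazard:

```python
import numpy as np

# Minimal Aalen additive hazards sketch: dB(t) is the least-squares solution
# at each event time, restricted to the risk set; B(t) is the running sum.
# Data layout (times, events, design matrix X with intercept) is assumed.

def aalen_increments(times, events, X):
    """Return event times and cumulative B(t) rows (one per event time)."""
    order = np.argsort(times)
    times, events, X = times[order], events[order], X[order]
    event_times = np.unique(times[events == 1])
    B = np.zeros((len(event_times), X.shape[1]))
    cum = np.zeros(X.shape[1])
    for i, t in enumerate(event_times):
        at_risk = times >= t
        dN = ((times == t) & (events == 1)).astype(float)
        # least-squares solution of X * dB = dN on the risk set
        dB, *_ = np.linalg.lstsq(X[at_risk], dN[at_risk], rcond=None)
        cum = cum + dB
        B[i] = cum
    return event_times, B
```

With an intercept-only design this reduces to the Nelson-Aalen estimator (increments d/n at each event time), which makes a convenient sanity check.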
Project description:In survival data analysis, a competing risk is an event whose occurrence precludes or alters the chance of the occurrence of the primary event of interest. In large cohort studies with long-term follow-up, there are often competing risks. Further, if the event of interest is rare in such large studies, the case-cohort study design is widely used to reduce the cost and achieve the same efficiency as a cohort study. The conventional additive hazards modeling for competing risks data in case-cohort studies involves the cause-specific hazard function, under which direct assessment of covariate effects on the cumulative incidence function, or the subdistribution, is not possible. In this paper, we consider an additive hazard model for the subdistribution of a competing risk in case-cohort studies. We propose estimating equations based on inverse probability weighting methods for the estimation of the model parameters. Consistency and asymptotic normality of the proposed estimators are established. The performance of the proposed methods in finite samples is examined through simulation studies and the proposed approach is applied to a case-cohort dataset from the Sister Study.
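The inverse probability weighting scheme typical of case-cohort designs is simple to state: all cases are fully observed and get weight 1, non-case subcohort members are up-weighted by the inverse of the subcohort sampling probability, and subjects outside the case-cohort sample contribute nothing. The exact weights in the paper's estimating equations may differ; this sketch illustrates the weighting principle only:

```python
# Illustrative case-cohort IPW weights (the weighting principle only; the
# paper's estimating-equation weights may differ in detail).

def case_cohort_weights(is_case, in_subcohort, sampling_prob):
    weights = []
    for case, sub in zip(is_case, in_subcohort):
        if case:
            weights.append(1.0)                   # cases are fully observed
        elif sub:
            weights.append(1.0 / sampling_prob)   # reweighted subcohort controls
        else:
            weights.append(0.0)                   # outside the case-cohort sample
    return weights
```

These weights then multiply each subject's contribution to the estimating equations, recovering (in expectation) the full-cohort contributions.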
Project description:Identifying exceptional responders or nonresponders is an area of increased research interest in precision medicine as these patients may have different biological or molecular features and therefore may respond differently to therapies. Our motivation stems from a real example from a clinical trial where we are interested in characterizing exceptional prostate cancer responders. We investigate the outlier detection and robust regression problem in the sparse proportional hazards model for censored survival outcomes. The main idea is to model the irregularity of each observation by assigning an individual weight to the hazard function. By applying a LASSO-type penalty on both the model parameters and the log transformation of the weight vector, our proposed method is able to perform variable selection and outlier detection simultaneously. The optimization problem can be transformed to a typical penalized maximum partial likelihood problem and thus it is easy to implement. We further extend the proposed method to deal with the potential outlier masking problem caused by censored outcomes. The performance of the proposed estimator is demonstrated with extensive simulation studies and real data analyses in low-dimensional and high-dimensional settings.
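The idea of assigning each observation its own irregularity parameter and shrinking those parameters with an L1 penalty has a much simpler linear-model analogue, the mean-shift formulation: each observation gets its own shift, most shifts are forced to zero, and the surviving nonzero shifts flag candidate outliers. A sketch of that analogue only, solved by simple alternating minimization; it is not the paper's censored-data Cox method:

```python
import numpy as np

# Mean-shift outlier detection sketch for a linear model (an analogue of the
# per-observation-weight idea, not the paper's penalized Cox method).

def soft(x, t):
    """Soft-thresholding operator, the proximal map of the L1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def mean_shift_outliers(X, y, lam, n_iter=200):
    """Alternate between refitting beta and L1-shrinking per-point shifts."""
    gamma = np.zeros(len(y))
    for _ in range(n_iter):
        beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)  # refit without shifts
        gamma = soft(y - X @ beta, lam)                        # shrink the shifts
    return beta, gamma
```

Observations with nonzero gamma after convergence are the flagged outliers; the jointly refitted beta is robust to them.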
Project description:We develop fast fitting methods for generalized functional linear models. The functional predictor is projected onto a large number of smooth eigenvectors and the coefficient function is estimated using penalized spline regression; confidence intervals based on the mixed model framework are obtained. Our method can be applied to many functional data designs including functions measured with and without error, sparsely or densely sampled. The methods also extend to the case of multiple functional predictors or functional predictors with a natural multilevel structure. The approach can be implemented using standard mixed effects software and is computationally fast. The methodology is motivated by a study of white-matter demyelination via diffusion tensor imaging (DTI). The aim of this study is to analyze differences between various cerebral white-matter tract property measurements of multiple sclerosis (MS) patients and controls. While the statistical developments proposed here were motivated by the DTI study, the methodology is designed and presented in generality and is applicable to many other areas of scientific research. An online appendix provides R implementations of all simulations.
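The projection step described above can be sketched in isolation: each observed curve is projected onto the leading eigenvectors of the sample covariance, and the resulting low-dimensional scores enter a penalized linear fit. The spline basis, mixed-model representation, and confidence intervals are omitted; a simple ridge fit stands in for the penalized spline regression:

```python
import numpy as np

# Projection + penalized fit sketch for functional predictors. Ridge here is
# a stand-in for the paper's penalized spline / mixed-model machinery.

def eigen_scores(curves, n_components):
    """Project curves (subjects x grid points) onto leading eigenvectors."""
    centered = curves - curves.mean(axis=0)
    # eigenvectors of the sample covariance via SVD of the centered matrix
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components].T
    return centered @ basis, basis

def ridge_fit(scores, y, lam=1.0):
    """Penalized least squares on the projection scores."""
    p = scores.shape[1]
    return np.linalg.solve(scores.T @ scores + lam * np.eye(p), scores.T @ y)
```

Because the scores are orthogonal by construction, the penalized fit is cheap regardless of how densely the curves are sampled, which is what makes this approach fast.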
Project description:Background: Identifying genes and pathways associated with diseases such as cancer has been a subject of considerable research in recent years in the areas of bioinformatics and computational biology. It has been demonstrated that the magnitude of differential expression does not necessarily indicate biological significance: even a very small change in the expression of a particular gene may have dramatic physiological consequences if the protein encoded by that gene plays a catalytic role in a specific cell function. Moreover, highly correlated genes may function together in the same biological pathway. Finally, in sparse logistic regression with an Lp (p < 1) penalty, the degree of sparsity obtained is determined by the value of the regularization parameter, which must usually be carefully tuned through cross-validation, a time-consuming process. Results: In this paper, we propose a simple Bayesian approach that integrates the regularization parameter out analytically using a new prior. There is therefore no longer a need for parameter selection, as it is eliminated entirely from the model. The proposed algorithm (BLpLog) is typically two or three orders of magnitude faster than the original algorithm and free from bias in performance estimation. We also define a novel similarity measure and develop an integrated algorithm to hunt for regulatory genes with low expression changes but high correlation with the selected genes. Pathways of those correlated genes were identified with DAVID (http://david.abcc.ncifcrf.gov/). Conclusion: Experimental results with gene expression data demonstrate that the proposed methods can be utilized to identify important genes and pathways that are related to cancer and to build a parsimonious model for future patient predictions.
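The companion "hunting" step described above admits a simple sketch: after a sparse model selects a gene set, scan for additional genes whose expression changes are small but which correlate strongly with the selected genes. Plain Pearson correlation with a fixed cutoff stands in for the paper's novel similarity measure, which is not reproduced here:

```python
import numpy as np

# Correlation-based hunt for regulatory partner genes. Pearson correlation
# with a fixed cutoff is a stand-in for the paper's similarity measure.

def correlated_partners(expr, selected, cutoff=0.8):
    """expr: genes x samples matrix; selected: row indices of chosen genes."""
    corr = np.corrcoef(expr)
    partners = set()
    for s in selected:
        for g in range(expr.shape[0]):
            if g not in selected and abs(corr[s, g]) >= cutoff:
                partners.add(g)
    return sorted(partners)
```

Genes recovered this way can have tiny fold changes yet track a selected gene almost perfectly, which is exactly the case that magnitude-based filters miss.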