Construction of a specific SVM classifier and identification of molecular markers for lung adenocarcinoma based on lncRNA-miRNA-mRNA network.
ABSTRACT: Background: Novel diagnostic predictors and drug targets are needed for LUAD (lung adenocarcinoma). We aimed to build a specific SVM (support vector machine) classifier for the diagnosis of LUAD and to identify molecular markers with prognostic value for LUAD. Methods: The expression of miRNAs, lncRNAs and mRNAs was compared between LUAD and normal samples using data from the TCGA (The Cancer Genome Atlas) database. A LUAD-related miRNA-lncRNA-mRNA network was constructed, from which feature genes were selected for the construction of a LUAD-specific SVM classifier. The robustness and transferability of the SVM classifier were validated using the gene expression profile datasets GSE43458 and GSE10072. Prognostic markers were identified from the network. Results: A set of LUAD-related differentially expressed miRNAs, lncRNAs and mRNAs was identified and a LUAD-related miRNA-lncRNA-mRNA network was obtained. The LUAD-specific SVM classifier constructed on the basis of the network was robust and efficient for classification of samples from the TCGA dataset and two independent validation datasets. Eight RNAs with prognostic value were identified, including hsa-miR-96, hsa-miR-204, PGM5P2 (phosphoglucomutase 5 pseudogene 2), SFTA1P (surfactant associated 1), RGS20 (regulator of G protein signaling 20), RGS9BP (RGS9-binding protein), FGB (fibrinogen beta chain) and INA (alpha-internexin). Among them, RGS20 and INA were regulated by hsa-miR-96. RGS20 was also regulated by hsa-miR-204, which was a potential target of SFTA1P. Conclusion: The LUAD-specific SVM classifier may serve as a novel diagnostic predictor. hsa-miR-96, hsa-miR-204, PGM5P2, SFTA1P, RGS20, RGS9BP, FGB and INA may serve as prognostic markers in clinical practice.
Project description:Cervical cancer is the second most commonly diagnosed cancer in women. Novel prognostic biomarkers are required to predict the progression of cervical cancer. Cervical cancer expression data were obtained from The Cancer Genome Atlas (TCGA) database. MicroRNAs (miRNAs) significantly differentially expressed between early‑ and advanced‑stage samples were identified by expression analysis. An optimal subset of signature miRNAs for pathologic stage prediction was delineated using the random forest algorithm and was used for the construction of a cervical cancer‑specific support vector machine (SVM) classifier. The roles of signature miRNAs in cervical cancer were analyzed by functional annotation. In total, 44 significantly differentially expressed miRNAs were identified. An optimal subset of 7 signature miRNAs was identified, including hsa‑miR‑144, hsa‑miR‑147b, hsa‑miR‑218‑2, hsa‑miR‑425, hsa‑miR‑451, hsa‑miR‑483 and hsa‑miR‑486. The signature miRNAs were used to construct an SVM classifier and exhibited a good performance in predicting pathologic stages of samples. SVM classification was found to be an independent prognostic factor. Functional enrichment analysis indicated that these signature miRNAs are involved in tumorigenesis. In conclusion, the subset of signature miRNAs could potentially serve as a novel diagnostic and prognostic predictor for cervical cancer.
Project description:Breast cancer is a heterogeneous disease and one of the most common cancers among women. Recently, microRNAs (miRNAs) have been used as biomarkers due to their effective role in cancer diagnosis. This study proposes a support vector machine (SVM)-based classifier SVM-BRC to categorize patients with breast cancer into early and advanced stages. SVM-BRC uses an optimal feature selection method, inheritable bi-objective combinatorial genetic algorithm, to identify a miRNA signature which is a small set of informative miRNAs while maximizing prediction accuracy. MiRNA expression profiles of a 386-patient cohort of breast cancer were retrieved from The Cancer Genome Atlas. SVM-BRC identified 34 of 503 miRNAs as a signature and achieved a 10-fold cross-validation mean accuracy, sensitivity, specificity, and Matthews correlation coefficient of 80.38%, 0.79, 0.81, and 0.60, respectively. Functional enrichment of the 10 highest ranked miRNAs was analysed in terms of Kyoto Encyclopedia of Genes and Genomes and Gene Ontology annotations. Kaplan-Meier survival analysis of the highest ranked miRNAs revealed that four miRNAs, hsa-miR-503, hsa-miR-1307, hsa-miR-212 and hsa-miR-592, were significantly associated with the prognosis of patients with breast cancer.
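The four performance measures reported for SVM-BRC above (accuracy, sensitivity, specificity and Matthews correlation coefficient) all derive from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity, MCC) from confusion-matrix counts.

    Assumes both classes are present, i.e. tp + fn > 0 and tn + fp > 0.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, sensitivity, specificity, mcc


# Hypothetical counts for illustration only (not data from the study).
acc, sens, spec, mcc = binary_metrics(tp=40, tn=45, fp=5, fn=10)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.2f}")
```

In a 10-fold cross-validation setting such as the one described, these values would typically be computed per fold and then averaged.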
Project description:Background: Osteosarcoma, which originates in the mesenchymal tissue, is the most prevalent primary solid malignancy of the bone. It is of great importance to explore the mechanisms of metastasis and recurrence, which are the two primary reasons for the high death rate in osteosarcoma. Data and methods: Three miRNA expression profiles related to osteosarcoma were downloaded from GEO DataSets. Differentially expressed miRNAs (DEmiRs) were screened using MetaDE.ES of the MetaDE package. A support vector machine (SVM) classifier was constructed using optimal miRNAs, and its prediction efficiency for recurrence was assessed in independent datasets. Finally, a co-expression network was constructed based on the DEmiRs and their target genes. Results: In total, 78 significant DEmiRs were screened. The SVM classifier constructed from 15 miRNAs accurately classified 58 of 65 samples (89.2%) in the GSE39040 dataset, and was validated in two further datasets, GSE39052 (84.62%, 22/26) and GSE79181 (91.3%, 21/23). Cox regression showed that four miRNAs, including hsa-miR-10b, hsa-miR-1227, hsa-miR-146b-3p, and hsa-miR-873, significantly correlated with tumor recurrence time. There were 137, 147, 145, and 77 target genes of the above four miRNAs, respectively, which were assigned to 17 gene ontology functionally annotated terms and 14 Kyoto Encyclopedia of Genes and Genomes pathways. Among them, the "Osteoclast differentiation" pathway contained a total of seven target genes and was analyzed further. Conclusion: The 15-miRNA-based SVM classifier provides a potentially useful tool to predict the recurrence of osteosarcoma. Our results suggest possible mechanisms of osteosarcoma metastasis and recurrence and provide fresh DEmiRs as potential biomarkers or therapeutic targets for osteosarcoma.
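The classification accuracies quoted above follow directly from the reported counts of correctly classified samples. As a small worked check, the sketch below recomputes them; the counts are copied from the abstract, while the helper name `accuracy_percent` is only an illustrative choice.

```python
# (correctly classified, total) per dataset, as reported in the abstract above.
reported = {
    "GSE39040": (58, 65),
    "GSE39052": (22, 26),
    "GSE79181": (21, 23),
}


def accuracy_percent(correct, total):
    """Accuracy as a percentage of correctly classified samples."""
    return 100.0 * correct / total


percentages = {name: accuracy_percent(c, t) for name, (c, t) in reported.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")
```

The recomputed values agree with the reported 89.2%, 84.62% and 91.3% after rounding.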
Project description:Lung cancer is a leading global cause of cancer-related death, and lung adenocarcinoma (LUAD) accounts for ~ 50% of lung cancer. Here, we screened for novel and specific biomarkers of LUAD by searching for differentially expressed mRNAs (DEmRNAs) and microRNAs (DEmiRNAs) in LUAD patient expression data within The Cancer Genome Atlas (TCGA). The identified optimal diagnostic miRNA biomarkers were used to establish classification models (including support vector machine, decision tree, and random forest) to distinguish between LUAD and adjacent tissues. We then predicted the targets of identified optimal diagnostic miRNA biomarkers, functionally annotated these target genes, and performed receiver operating characteristic curve analysis of the respective DEmiRNA biomarkers, their target DEmRNAs, and combinations of DEmiRNA biomarkers. We validated the expression of selected DEmiRNA biomarkers by quantitative real-time PCR (qRT-PCR). In all, we identified a total of 13 DEmiRNAs, 2301 DEmRNAs and 232 DEmiRNA-target DEmRNA pairs between LUAD and adjacent tissues and selected nine DEmiRNAs (hsa-mir-486-1, hsa-mir-486-2, hsa-mir-153, hsa-mir-210, hsa-mir-9-1, hsa-mir-9-2, hsa-mir-9-3, hsa-mir-577, and hsa-mir-4732) as optimal LUAD-specific biomarkers with great diagnostic value. The predicted targets of these nine DEmiRNAs were significantly enriched in transcriptional misregulation in cancer and central carbon metabolism. Our qRT-PCR results were generally consistent with our integrated analysis. In summary, our study identified nine DEmiRNAs that may serve as potential diagnostic biomarkers of LUAD. Functional annotation of their target DEmRNAs may provide information on their roles in LUAD.
Project description:Lung adenocarcinoma is a multifactorial disease. MicroRNA (miRNA) expression profiles are extensively used for discovering potential theranostic biomarkers of lung cancer. This work proposes an optimized support vector regression (SVR) method called SVR-LUAD to identify a set of miRNAs, referred to as the miRNA signature, for estimating the survival time of lung adenocarcinoma patients from their miRNA expression profiles. SVR-LUAD uses an inheritable bi-objective combinatorial genetic algorithm to identify a small set of informative miRNAs cooperating with SVR by maximizing estimation accuracy. SVR-LUAD identified 18 out of 332 miRNAs using 10-fold cross-validation and achieved a correlation coefficient of 0.88 ± 0.01 and a mean absolute error of 0.56 ± 0.03 year between real and estimated survival time. SVR-LUAD performs well compared to some well-recognized regression methods. The miRNA signature consists of 18 miRNAs that strongly correlate with lung adenocarcinoma: hsa-let-7f-1, hsa-miR-16-1, hsa-miR-152, hsa-miR-217, hsa-miR-18a, hsa-miR-193b, hsa-miR-3136, hsa-let-7g, hsa-miR-155, hsa-miR-3199-1, hsa-miR-219-2, hsa-miR-1254, hsa-miR-1291, hsa-miR-192, hsa-miR-3653, hsa-miR-3934, hsa-miR-342, and hsa-miR-141. Gene ontology annotation and pathway analysis of the miRNA signature revealed its biological significance in cancer and cellular pathways. This miRNA signature could aid in the development of novel therapeutic approaches to the treatment of lung adenocarcinoma.
Project description:Increasing evidence has shown competitive endogenous RNAs (ceRNAs) play key roles in numerous cancers. Nevertheless, the ceRNA network that can predict the prognosis of lung adenocarcinoma (LUAD) is still lacking. The aim of the present study was to identify the prognostic value of key ceRNAs in lung tumorigenesis. Differentially expressed (DE) RNAs were identified between LUAD and adjacent normal samples by limma package in R using The Cancer Genome Atlas database (TCGA). Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway function enrichment analysis was performed using the clusterProfiler package in R. Subsequently, the LUAD ceRNA network was established in three steps based on ceRNA hypothesis. Hub RNAs were identified using degree analysis methods based on Cytoscape plugin cytoHubba. Multivariate Cox regression analysis was implemented to calculate the risk score using the candidate ceRNAs and overall survival information. The survival differences between the high-risk and low-risk ceRNA groups were determined by the Kaplan-Meier and log-rank test using survival and survminer package in R. A total of 2,989 mRNAs, 185 lncRNAs, and 153 miRNAs were identified. GO and KEGG pathway function enrichment analysis showed that DE mRNAs were mainly associated with "sister chromatid segregation," "regulation of angiogenesis," "cell adhesion molecules (CAMs)," "cell cycle," and "ECM-receptor interaction." LUAD-related ceRNA network was constructed, which comprised of 54 nodes and 78 edges. Top ten hub RNAs (hsa-miR-374a-5p, hsa-miR-374b-5p, hsa-miR-340-5p, hsa-miR-377-3p, hsa-miR-21-5p, hsa-miR-326, SNHG1, RALGPS2, and PITX2) were identified according to their degree. Kaplan-Meier survival analyses demonstrated that hsa-miR-21-5p and RALGPS2 had a significant prognostic value. 
Finally, we found that a high risk of three novel ceRNA interactions (SNHG1-hsa-miR-21-5p-RALGPS2, SNHG1-hsa-miR-326-RALGPS2, and SNHG1-hsa-miR-377-3p-RALGPS2) was positively associated with worse prognosis. Three novel ceRNAs (SNHG1-hsa-miR-21-5p-RALGPS2, SNHG1-hsa-miR-326-RALGPS2, and SNHG1-hsa-miR-377-3p-RALGPS2) might be potential biomarkers for the prognosis and treatment of LUAD.
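The risk-score step described above (a multivariate Cox model followed by a high-risk versus low-risk comparison) is commonly implemented as a weighted sum of expression values with a median cutoff. The sketch below illustrates that general idea only. The coefficients and expression values are made up for illustration and are not fitted estimates from the study, and the median split is an assumed convention that the abstract does not specify.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of (Cox coefficient * expression) over the model's RNAs."""
    return sum(coefficients[rna] * expression[rna] for rna in coefficients)


def split_by_median(scores):
    """Label each sample 'high' if its score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {sample: ("high" if s > cutoff else "low") for sample, s in scores.items()}


# Hypothetical coefficients and expression values (illustration only).
coefficients = {"SNHG1": 0.30, "hsa-miR-21-5p": 0.20, "RALGPS2": -0.25}
patients = {
    "P1": {"SNHG1": 2.0, "hsa-miR-21-5p": 3.0, "RALGPS2": 1.0},
    "P2": {"SNHG1": 1.0, "hsa-miR-21-5p": 1.0, "RALGPS2": 2.0},
    "P3": {"SNHG1": 3.0, "hsa-miR-21-5p": 2.0, "RALGPS2": 0.0},
    "P4": {"SNHG1": 0.5, "hsa-miR-21-5p": 0.5, "RALGPS2": 3.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(groups)
```

The resulting groups would then be compared with a Kaplan-Meier estimate and a log-rank test, which in the study was done with the survival and survminer packages in R.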
Project description:Background: Competing endogenous RNAs (ceRNAs) are a newly identified type of regulatory RNA. Accumulating evidence suggests that ceRNAs play an important role in the pathogenesis of diseases such as cancer. Thus, ceRNA dysregulation may represent an important molecular mechanism underlying cancer progression and poor prognosis. In this study, we aimed to identify ceRNAs that may serve as potential biomarkers for early diagnosis of lung adenocarcinoma (LUAD). Methods: We performed differential gene expression analysis on TCGA-LUAD datasets to identify differentially expressed (DE) mRNAs, lncRNAs, and miRNAs at different tumor stages. Based on the ceRNA hypothesis and considering the synergistic or feedback regulation of ceRNAs, a lncRNA-miRNA-mRNA network was constructed. Functional analysis was performed using gene ontology term and KEGG pathway enrichment analysis and KOBAS 2.0 software. Transcription factor (TF) analysis was carried out to identify direct targets of the TFs associated with LUAD prognosis. Identified DE genes were validated using gene expression omnibus (GEO) datasets. Results: Based on analysis of TCGA-LUAD datasets, we obtained 2,610 DE mRNAs, 915 lncRNAs, and 125 miRNAs that were common to different tumor stages (|log2(Fold change)| ≥ 1, false discovery rate < 0.01), respectively. Functional analysis showed that the aberrantly expressed mRNAs were closely related to tumor development. Survival analyses of the constructed ceRNA network modules demonstrated that five of them exhibit prognostic significance. The five ceRNA interaction modules contained one lncRNA (FENDRR), three mRNAs (EPAS1, FOXF1, and EDNRB), and four miRNAs (hsa-miR-148a, hsa-miR-195, hsa-miR-196b, and hsa-miR-301b). The aberrant expression of one lncRNA and three mRNAs was verified in the LUAD GEO dataset. Transcription factor analysis demonstrated that EPAS1 directly targeted 13 DE mRNAs. 
Conclusion: Our observations indicate that lncRNA-related ceRNAs and TFs play an important role in LUAD. The present study provides novel insights into the molecular mechanisms underlying LUAD pathogenesis. Furthermore, our study facilitates the identification of potential biomarkers for the early diagnosis and prognosis of LUAD and therapeutic targets for its treatment.
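The differential-expression cutoffs stated in the Results of this abstract combine an absolute log2 fold change threshold with a false discovery rate threshold. A minimal sketch of such a filter follows, reading the fold-change criterion as "at least 1" (an assumption about the intended comparison operator). The gene names and values are hypothetical placeholders, not results from the study.

```python
def is_differentially_expressed(log2_fc, fdr, lfc_cutoff=1.0, fdr_cutoff=0.01):
    """True if |log2 fold change| >= lfc_cutoff and FDR < fdr_cutoff."""
    return abs(log2_fc) >= lfc_cutoff and fdr < fdr_cutoff


# Hypothetical (name, log2 fold change, FDR) records for illustration only.
records = [
    ("gene_A", 2.3, 0.001),    # passes both criteria
    ("gene_B", -1.0, 0.009),   # passes: absolute fold change sits exactly on the cutoff
    ("gene_C", 0.8, 0.0001),   # fails: fold change too small
    ("gene_D", -3.1, 0.05),    # fails: FDR too high
]

selected = [name for name, lfc, fdr in records if is_differentially_expressed(lfc, fdr)]
print(selected)
```

Taking the absolute value keeps both up-regulated and down-regulated RNAs, which is what a two-sided fold-change criterion implies.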
Project description:Background: Lung cancer is the most common cancer and the most common cause of cancer-related death worldwide. However, the molecular mechanism of its development is unclear. It is imperative to identify more novel biomarkers. Methods: Two datasets (GSE70880 and GSE113852) were downloaded from the Gene Expression Omnibus (GEO) database and used to identify the differentially expressed genes (DEGs) between lung cancer tissues and normal tissues. Then, we constructed a competing endogenous RNA (ceRNA) network and a protein-protein interaction (PPI) network and performed gene ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and survival analyses to identify potential biomarkers that are related to the diagnosis and prognosis of lung cancer. Results: A total of 41 lncRNAs and 805 mRNAs were differentially expressed in lung cancer. The ceRNA network contained four lncRNAs (CLDN10-AS1, SFTA1P, SRGAP3-AS2, and ADAMTS9-AS2), 21 miRNAs, and 48 mRNAs. Functional analyses revealed that the genes in the ceRNA network were mainly enriched in cell migration, transmembrane receptor, and protein kinase activity. mRNAs DLGAP5, E2F7, MCM7, RACGAP1, and RRM2 had the highest connectivity in the PPI network. Immunohistochemistry (IHC) demonstrated that mRNAs DLGAP5, MCM7, RACGAP1, and RRM2 were upregulated in lung adenocarcinoma (LUAD). Survival analyses showed that lncRNAs CLDN10-AS1, SFTA1P, and ADAMTS9-AS2 were associated with the prognosis of LUAD. Conclusion: lncRNAs CLDN10-AS1, SFTA1P, and ADAMTS9-AS2 might be the biomarkers of LUAD. For the first time, we confirmed the important role of lncRNA CLDN10-AS1 in LUAD.
Project description:Background: Hepatocellular carcinoma (HCC) is a common malignancy worldwide with a high mortality rate. lncRNA SFTA1P is highly expressed in HCC. We aimed to study the role of SFTA1P in HCC and its relationship with miR-4766-5p. Materials and Methods: The levels of SFTA1P in HCC tissues and cell lines were determined. The relationship between SFTA1P and clinical features and prognosis was studied. The influence of SFTA1P on HCC cell viability, migration, invasion and apoptosis was studied in vitro. Rescue experiments were conducted after the binding site between SFTA1P and miR-4766-5p was confirmed by dual-luciferase assay. The protein expression of AKT, p-AKT, mTOR and p-mTOR in HCC cells with knockdown of SFTA1P was determined by Western blotting. A tumor study in nude mice was conducted to assess the effects of SFTA1P on tumor growth characteristics. Results: SFTA1P was up-regulated in HCC tissues and cell lines. SFTA1P expression was closely related to tumor size, vascular invasion and TNM stage. Knockdown of SFTA1P inhibited HCC cell viability, migration and invasion and promoted cell apoptosis. MiR-4766-5p was a target of SFTA1P, and knockdown of SFTA1P decreased the protein expression of p-AKT and p-mTOR. Rescue experiments showed that miR-4766-5p mimics attenuated the promoting effect of SFTA1P on HCC cell viability, invasion and migration, and its inhibitory effect on cell apoptosis. Moreover, in nude mouse models, knockdown of SFTA1P reduced tumor volume and weight. Conclusion: lncRNA SFTA1P could promote tumor development in HCC by down-regulating miR-4766-5p expression via the PI3K/AKT/mTOR signaling pathway. It may be a potential therapeutic target for HCC.
Project description:Objective. Here, we aim to investigate the microRNA (miR) profiling in human gastric cancer (GC). Methods. Tumoral and matched peritumoral gastric specimens were collected from 12 GC patients who underwent routine surgery. A high-throughput miR sequencing method was applied to detect the aberrantly expressed miRs in a subset of 6 paired samples. The stem-loop quantitative real-time polymerase chain reaction (qRT-PCR) assay was subsequently performed to confirm the sequencing results in the remaining 6 paired samples. The profiling results were also validated in vitro in three human GC cell lines (BGC-823, MGC-803, and GTL-16) and a normal gastric epithelial cell line (GES-1). Results. The miR sequencing approach detected 5 differentially expressed miRs, hsa-miR-132-3p, hsa-miR-155-5p, hsa-miR-19b-3p, hsa-miR-204-5p, and hsa-miR-30a-3p, which were significantly downmodulated between the tumoral and peritumoral GC tissues. Most of the results were further confirmed by qRT-PCR, while no change was observed for hsa-miR-30a-3p. The in vitro finding also agreed with the results of both miR sequencing and qRT-PCR for hsa-miR-204-5p, hsa-miR-155-5p, and hsa-miR-132-3p. Conclusion. Together, our findings may serve to identify new molecular alterations as well as to enrich the miR profiling in human GC.