Dataset Information

Analysis of cancer gene expression data with an assisted robust marker identification approach.


ABSTRACT: Gene expression (GE) studies have played a critical role in cancer research. Despite tremendous effort, analysis results are still often unsatisfactory because of weak signals and high data dimensionality. Analysis is often further challenged by the long-tailed distributions of the outcome variables. In recent multidimensional studies, data have been collected on GEs as well as their regulators (e.g., copy number alterations (CNAs), methylation, and microRNAs), which can provide additional information on the associations between GEs and cancer outcomes. In this study, we develop an ARMI (assisted robust marker identification) approach for analyzing cancer studies with measurements on GEs as well as regulators. The proposed approach borrows information from regulators and can be more effective than analyzing GE data alone. A robust objective function is adopted to accommodate long-tailed distributions. Marker identification is realized using penalization. The proposed approach has an intuitive formulation and is computationally affordable. Simulation shows satisfactory performance under a variety of settings. TCGA (The Cancer Genome Atlas) data on melanoma and lung cancer are analyzed, leading to biologically plausible marker identification and superior prediction.
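
To make the two generic ingredients named in the abstract concrete (a robust objective for long-tailed outcomes, and penalization for marker selection), the following is a minimal illustrative sketch. It is not the ARMI method: it does not use regulator data, and the specific choices here (Huber loss, an L1 penalty, proximal gradient descent, the names `robust_lasso`, `lam`, `delta`, and the simulated data) are assumptions made for illustration only, not details taken from the paper.

```python
import numpy as np

def huber_grad(r, delta):
    # Derivative of the Huber loss w.r.t. the residual:
    # r inside [-delta, delta], clipped to +/- delta outside.
    return np.clip(r, -delta, delta)

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def robust_lasso(X, y, lam, delta=1.345, n_iter=500):
    # Hypothetical helper: Huber loss + L1 penalty, fitted by proximal gradient.
    n, p = X.shape
    # Step size 1/L, with L bounded by the largest eigenvalue of X^T X / n.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ beta
        grad = -X.T @ huber_grad(r, delta) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Simulated example (assumed data, not TCGA): 3 true markers, heavy-tailed noise.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:3] = [2.0, -1.5, 1.0]
y = X @ true_beta + rng.standard_t(df=2, size=n)

beta_hat = robust_lasso(X, y, lam=0.2)
selected = np.flatnonzero(beta_hat)  # indices of variables kept by the penalty
print(selected)
```

The clipped gradient limits the influence of extreme outcome values, while the soft-thresholding step sets small coefficients exactly to zero, which is what turns estimation into variable (marker) selection.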

SUBMITTER: Chai H 

PROVIDER: S-EPMC5912176 | biostudies-literature | 2017 Dec

REPOSITORIES: biostudies-literature

Publications

Analysis of cancer gene expression data with an assisted robust marker identification approach.

Chai Hao, Shi Xingjie, Zhang Qingzhao, Zhao Qing, Huang Yuan, Ma Shuangge

Genetic Epidemiology, 2017-09-14, issue 8

Similar Datasets

| S-EPMC3118357 | biostudies-literature
| S-EPMC4409222 | biostudies-literature
| S-EPMC6016759 | biostudies-literature
| S-EPMC6660050 | biostudies-literature
| S-EPMC5740711 | biostudies-literature
| S-EPMC1855059 | biostudies-literature
| S-EPMC7285394 | biostudies-literature
| S-EPMC8850484 | biostudies-literature
| S-EPMC10709285 | biostudies-literature
| S-EPMC6858347 | biostudies-literature