Structured sparsity regularized multiple kernel learning for Alzheimer's disease diagnosis.
ABSTRACT: Multimodal data fusion has shown great advantages in uncovering information that could be overlooked when using a single modality. In this paper, we consider the integration of high-dimensional multi-modality imaging and genetic data for Alzheimer's disease (AD) diagnosis. With a focus on exploiting both phenotype and genotype information, a novel structured sparsity, defined by the ℓ1,p-norm (p > 1), regularized multiple kernel learning method is designed. Specifically, to facilitate structured feature selection and fusion from heterogeneous modalities and to capture feature-wise importance, we represent each feature with a distinct kernel as a basis, and then group the kernels according to modalities. An optimally combined kernel representation of the multimodal features is then learned in a data-driven manner. In contrast to the Group Lasso (i.e., the ℓ2,1-norm penalty), which performs sparse group selection, the proposed regularizer enforced on the kernel weights sparsely selects a concise feature set within each homogeneous group and fuses the heterogeneous feature groups by taking advantage of dense norms. We evaluated our method using data of subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The effectiveness of the method is demonstrated by the clearly improved diagnostic prediction as well as the discovered brain regions and SNPs relevant to AD.
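As an illustrative sketch (not the authors' implementation), the contrast between the proposed ℓ1,p regularizer and the Group Lasso ℓ2,1 penalty can be shown on toy kernel weights; the weight vector, the two modality groups, and p = 2 below are all hypothetical:

```python
import numpy as np

def l1p_norm(beta, groups, p=2.0):
    """l1,p-norm: an lp-norm (p > 1, dense) across groups of the
    l1-norms (sparse) of the weights within each group."""
    group_l1 = np.array([np.abs(beta[g]).sum() for g in groups])
    return (group_l1 ** p).sum() ** (1.0 / p)

def l21_norm(beta, groups):
    """Group-Lasso l2,1-norm: sum across groups of within-group l2-norms,
    which tends to zero out entire groups (sparse group selection)."""
    return sum(np.linalg.norm(beta[g]) for g in groups)

# toy kernel weights for two modality groups (e.g., imaging vs. genetic kernels)
beta = np.array([0.6, 0.0, 0.4, 0.0, 0.0, 0.9])
groups = [np.arange(0, 3), np.arange(3, 6)]
print(l1p_norm(beta, groups))  # dense across groups, sparse within each group
print(l21_norm(beta, groups))
```

The key contrast is where sparsity acts: the inner ℓ1 norm prunes features inside each modality, while the dense outer ℓp norm keeps every modality contributing to the combined kernel.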
Project description:Multimodality-based methods have shown great advantages in the classification of Alzheimer's disease (AD) and its prodromal stage, that is, mild cognitive impairment (MCI). Recently, multitask feature selection methods have typically been used for joint selection of common features across multiple modalities. However, one disadvantage of existing multimodality-based methods is that they ignore the useful data distribution information in each modality, which is essential for subsequent classification. Accordingly, in this paper we propose a manifold regularized multitask feature learning method to preserve both the intrinsic relatedness among multiple modalities of data and the data distribution information in each modality. Specifically, we treat the feature learning on each modality as a single task, and use a group-sparsity regularizer to capture the intrinsic relatedness among multiple tasks (i.e., modalities) and jointly select the common features from multiple tasks. Furthermore, we introduce a new manifold-based Laplacian regularizer to preserve the data distribution information from each task. Finally, we use the multikernel support vector machine method to fuse multimodality data for eventual classification. We also extend our method to the semisupervised setting, where only part of the data is labeled. We evaluate our method using the baseline magnetic resonance imaging (MRI), fluorodeoxyglucose positron emission tomography (FDG-PET), and cerebrospinal fluid (CSF) data of subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The experimental results demonstrate that our proposed method can not only achieve improved classification performance, but also help to discover the disease-related brain regions useful for disease diagnosis.
Project description:Effective and accurate diagnosis of Alzheimer's disease (AD), as well as its prodromal stage (i.e., mild cognitive impairment (MCI)), has attracted more and more attention recently. So far, multiple biomarkers have been shown to be sensitive to the diagnosis of AD and MCI, e.g., structural MR imaging (MRI) for brain atrophy measurement, functional imaging (e.g., FDG-PET) for hypometabolism quantification, and cerebrospinal fluid (CSF) for quantification of specific proteins. However, most existing research focuses on only a single modality of biomarkers for diagnosis of AD and MCI, although recent studies have shown that different biomarkers may provide complementary information for the diagnosis of AD and MCI. In this paper, we propose to combine three modalities of biomarkers, i.e., MRI, FDG-PET, and CSF biomarkers, to discriminate between AD (or MCI) and healthy controls, using a kernel combination method. Specifically, ADNI baseline MRI, FDG-PET, and CSF data from 51 AD patients, 99 MCI patients (including 43 MCI converters who had converted to AD within 18 months and 56 MCI non-converters who had not converted to AD within 18 months), and 52 healthy controls are used for development and validation of our proposed multimodal classification method. In particular, for each MR or FDG-PET image, 93 volumetric features are extracted from the 93 regions of interest (ROIs), automatically labeled by an atlas warping algorithm. For CSF biomarkers, their original values are directly used as features. Then, a linear support vector machine (SVM) is adopted to evaluate the classification accuracy, using a 10-fold cross-validation. As a result, for classifying AD from healthy controls, we achieve a classification accuracy of 93.2% (with a sensitivity of 93% and a specificity of 93.3%) when combining all three modalities of biomarkers, and only 86.5% when using even the best individual modality of biomarkers.
Similarly, for classifying MCI from healthy controls, we achieve a classification accuracy of 76.4% (with a sensitivity of 81.8% and a specificity of 66%) for our combined method, and only 72% when using even the best individual modality of biomarkers. Further analysis of the MCI sensitivity of our combined method indicates that 91.5% of MCI converters and 73.4% of MCI non-converters are correctly classified. Moreover, we also evaluate the classification performance when employing a feature selection method to select the most discriminative MR and FDG-PET features. Again, our combined method shows considerably better performance, compared to the case of using an individual modality of biomarkers.
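A minimal sketch of this kind of kernel combination, using random placeholder data in place of the real ADNI features and hypothetical fixed kernel weights (in practice the weights would be tuned, e.g., by cross-validation):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 60
# random stand-ins for per-modality features: 93 ROI volumes (MRI),
# 93 ROI intensities (FDG-PET), and 3 CSF measures
X_mri = rng.normal(size=(n, 93))
X_pet = rng.normal(size=(n, 93))
X_csf = rng.normal(size=(n, 3))
y = rng.integers(0, 2, size=n)  # e.g., AD vs. healthy control

# one kernel per modality, combined as a fixed convex combination
kernels = [rbf_kernel(X_mri), rbf_kernel(X_pet), rbf_kernel(X_csf)]
weights = [0.4, 0.4, 0.2]  # hypothetical; nonnegative and summing to 1
K = sum(w * Km for w, Km in zip(weights, kernels))

# an SVM in the combined kernel space, via the precomputed-kernel interface
clf = SVC(kernel="precomputed").fit(K, y)
print(clf.score(K, y))  # training accuracy on the toy data
```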
Project description:This study investigates the prediction of mild cognitive impairment-to-Alzheimer's disease (MCI-to-AD) conversion based on extensive multimodal data with varying degrees of missing values. Based on Alzheimer's Disease Neuroimaging Initiative data from MCI patients including all available modalities, we predicted the conversion to AD within 3 years. Different ways of replacing missing data in combination with different classification algorithms were compared. The performance was evaluated on features prioritized by experts and on automatically selected features. The conversion to AD could be predicted with a maximal accuracy of 73% using support vector machines and features chosen by experts. Among the data modalities, neuropsychological, magnetic resonance imaging, and positron emission tomography data were the most informative. The best single feature was the Functional Activities Questionnaire. Extensive multimodal and incomplete data can be adequately handled by a combination of missing-data substitution, feature selection, and classification.
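The three-stage pipeline described above (missing-data substitution, feature selection, classification) can be sketched as follows; the synthetic data, mean imputation, univariate selection, and parameter choices are illustrative assumptions, not the study's exact setup:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=(n, 30))
X[rng.random(X.shape) < 0.2] = np.nan      # ~20% of values missing
y = rng.integers(0, 2, size=n)             # e.g., converter vs. non-converter

pipe = make_pipeline(
    SimpleImputer(strategy="mean"),        # missing-data substitution
    SelectKBest(f_classif, k=10),          # automatic feature selection
    SVC(kernel="linear"),                  # classification
)
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())
```

Wrapping all three stages in one pipeline ensures that imputation statistics and selected features are learned only on each training fold, avoiding leakage into the evaluation.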
Project description:Multiple Kernel Learning (MKL) generalizes SVMs to the setting where one simultaneously trains a linear classifier and chooses an optimal combination of given base kernels. Model complexity is typically controlled using various norm regularizations on the base kernel mixing coefficients. Existing methods neither regularize nor exploit potentially useful information pertaining to how kernels in the input set 'interact'; that is, higher-order kernel-pair relationships that can be easily obtained via unsupervised (similarity, geodesics), supervised (correlation in errors), or domain-knowledge-driven mechanisms (which features were used to construct the kernel?). We show that by substituting the norm penalty with an arbitrary quadratic function given by a positive semidefinite matrix Q ⪰ 0, one can impose a desired covariance structure on the mixing weights, and use this as an inductive bias when learning the concept. This formulation significantly generalizes the widely used 1- and 2-norm MKL objectives. We explore the model's utility via experiments on a challenging neuroimaging problem, where the goal is to predict a subject's conversion to Alzheimer's disease (AD) by exploiting aggregate information from many distinct imaging modalities. Here, our new model outperforms the state of the art (p-values ≤ 10^(-3)). We briefly discuss ramifications in terms of learning bounds (Rademacher complexity).
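To make the idea concrete, a toy sketch of such a quadratic penalty on MKL mixing weights (all numbers hypothetical): with Q = I it reduces to the squared 2-norm, while off-diagonal entries in Q make it more expensive to put weight on kernels known to be redundant:

```python
import numpy as np

def quadratic_penalty(beta, Q):
    """Generalized MKL regularizer beta^T Q beta on the kernel mixing weights."""
    return float(beta @ Q @ beta)

beta = np.array([0.5, 0.3, 0.2])

# Q = I recovers the plain squared l2-norm of the mixing weights
print(quadratic_penalty(beta, np.eye(3)))

# a hypothetical Q (still positive semidefinite) encoding that kernels 0 and 1
# are redundant: their positive covariance entry penalizes using both at once
Q = np.array([[1.0, 0.8, 0.0],
              [0.8, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
print(quadratic_penalty(beta, Q))
```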
Project description:Graphical, voxel-, and region-based analyses have become popular approaches to studying neurodegenerative disorders such as Alzheimer's disease (AD) and its prodromal stage [mild cognitive impairment (MCI)]. These methods have been used previously for classification or discrimination of AD in subjects in a prodromal stage, distinguishing stable MCI (MCIs), which does not convert to AD but remains stable over time, from converting MCI (MCIc), which does convert to AD; however, the results reported across similar studies are often inconsistent. Furthermore, the classification accuracy for MCIs vs. MCIc is limited. In this study, we propose combining different neuroimaging modalities (sMRI, FDG-PET, AV45-PET, DTI, and rs-fMRI) with the apolipoprotein-E genotype to form a multimodal system for the discrimination of AD, and to increase the classification accuracy. Initially, we used two well-known analyses to extract features from each neuroimage for the discrimination of AD: whole-brain parcellation analysis (or region-based analysis), and voxel-wise analysis (or voxel-based morphometry). We also investigated graphical analysis (nodal and group) for all six binary classification groups (AD vs. HC, MCIs vs. MCIc, AD vs. MCIc, AD vs. MCIs, HC vs. MCIc, and HC vs. MCIs). Data for a total of 129 subjects (33 AD, 30 MCIs, 31 MCIc, and 35 HCs) for each imaging modality were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) homepage. These data also include two APOE genotype data points for the subjects. Moreover, we used the 2-mm AICHA atlas with the NiftyReg registration toolbox to extract 384 brain regions from each PET (FDG and AV45) and sMRI image. For the rs-fMRI images, we used the DPARSF toolbox in MATLAB for the automatic extraction of data and the results for ReHo, ALFF, and fALFF. We also used the pyClusterROI script for the automatic parcellation of each rs-fMRI image into 200 brain regions.
For the DTI images, we used the FSL (Version 6.0) toolbox for the extraction of fractional anisotropy (FA) images to calculate a tract-based spatial statistic. Moreover, we used the PANDA toolbox to obtain 50 white-matter-region-parcellated FA images on the basis of the 2-mm JHU-ICBM-labeled template atlas. To integrate the different modalities and different complementary information into one form, and to optimize the classifier, we used the multiple kernel learning (MKL) framework. The obtained results indicated that our multimodal approach yields a significant improvement in accuracy over any single modality alone. The areas under the curve obtained by the proposed method were 97.78, 96.94, 95.56, 96.25, 96.67, and 96.59% for AD vs. HC, MCIs vs. MCIc, AD vs. MCIc, AD vs. MCIs, HC vs. MCIc, and HC vs. MCIs binary classification, respectively. Our proposed multimodal method improved the classification result for MCIs vs. MCIc groups compared with the unimodal classification results. Our study found that the (left/right) precentral region was present in all six binary classification groups (this region can be considered the most significant region). Furthermore, using nodal network topology, we found that FDG, AV45-PET, and rs-fMRI were the most important neuroimages, and showed many affected regions relative to other modalities. We also compared our results with recently published results.
Project description:Alzheimer's disease (AD), including its mild cognitive impairment (MCI) phase, which may or may not progress to AD, is the most common form of dementia. It is extremely important to correctly identify patients during the MCI stage, because this is the phase in which AD may or may not develop, and it is therefore crucial to predict outcomes during this phase. Thus far, many researchers have worked with only a single modality of biomarker for the diagnosis of AD or MCI, although recent studies show that combining different biomarkers not only provides complementary information for the diagnosis but also increases the classification accuracy in distinguishing between different groups. In this paper, we propose a novel machine learning-based framework to discriminate subjects with AD or MCI utilizing a combination of four different biomarkers: fluorodeoxyglucose positron emission tomography (FDG-PET), structural magnetic resonance imaging (sMRI), cerebrospinal fluid (CSF) protein levels, and apolipoprotein-E (APOE) genotype. The Alzheimer's Disease Neuroimaging Initiative (ADNI) baseline dataset was used in this study. In total, there were 158 subjects for whom all four modalities of biomarker were available. Of the 158 subjects, 38 were in the AD group, 82 were in the MCI groups (including 46 in MCIc [MCI converted; conversion to AD within a 24-month period] and 36 in MCIs [MCI stable; no conversion to AD within a 24-month period]), and the remaining 38 were in the healthy control (HC) group. For each image, we extracted 246 regions of interest (as features) using the Brainnetome template image and the NiftyReg toolbox, and then combined these features with three CSF and two APOE genotype features obtained from the ADNI website for each subject using an early fusion technique. A kernel-based multiclass support vector machine (SVM) classifier with a grid-search method was then applied.
Before passing the obtained features to the classifier, we used the truncated singular value decomposition (truncated SVD) dimensionality reduction technique to reduce the high-dimensional features to a lower-dimensional representation. As a result, our combined method achieved areas under the receiver operating characteristic (AU-ROC) curve of 98.33, 93.59, 96.83, 94.64, 96.43, and 95.24% for the AD vs. HC, MCIs vs. MCIc, AD vs. MCIs, AD vs. MCIc, HC vs. MCIc, and HC vs. MCIs classifications, respectively, which is high relative to single-modality results and other state-of-the-art approaches. Moreover, the combined multimodal method improved the classification performance over unimodal classification.
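A compact sketch of this stage (truncated SVD followed by a grid-searched SVM) with random placeholder data; the dimensions, grid values, and labels below are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 80
# stand-in for early fusion: 246 ROI features + 3 CSF + 2 APOE features
X = np.hstack([rng.normal(size=(n, 246)), rng.normal(size=(n, 5))])
y = rng.integers(0, 4, size=n)  # e.g., AD / MCIc / MCIs / HC

pipe = make_pipeline(
    TruncatedSVD(n_components=20, random_state=0),  # dimensionality reduction
    SVC(),                                          # multiclass SVM (one-vs-one)
)
grid = GridSearchCV(
    pipe,
    {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]},
    cv=3,
)
grid.fit(X, y)
print(grid.best_params_)
```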
Project description:Different modalities such as structural MRI, FDG-PET, and CSF have complementary information, which is likely to be very useful for the diagnosis of AD and MCI. Therefore, it is possible to develop a more effective and accurate AD/MCI automatic diagnosis method by integrating the complementary information of different modalities. In this paper, we propose the multi-modal sparse hierarchical extreme learning machine (MSH-ELM). We used the volume and mean intensity extracted from 93 regions of interest (ROIs) as features of MRI and FDG-PET, respectively, and used p-tau, t-tau, and Aβ42 as CSF features. In detail, a high-level representation was individually extracted from each of MRI, FDG-PET, and CSF using a stacked sparse extreme learning machine auto-encoder (sELM-AE). Then, another stacked sELM-AE was devised to acquire a joint hierarchical feature representation by fusing the high-level representations obtained from each modality. Finally, we classified the joint hierarchical feature representation using a kernel-based extreme learning machine (KELM). The results of MSH-ELM were compared with those of a conventional ELM, a single-kernel support vector machine (SK-SVM), a multiple-kernel support vector machine (MK-SVM), and a stacked auto-encoder (SAE). Performance was evaluated through 10-fold cross-validation. In the AD vs. HC and MCI vs. HC classification problems, the proposed MSH-ELM method showed mean balanced accuracies of 96.10% and 86.46%, respectively, which is much better than those of the competing methods. In summary, the proposed algorithm exhibits consistently better performance than SK-SVM, ELM, MK-SVM, and SAE in the two binary classification problems (AD vs. HC and MCI vs. HC).
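The extreme learning machine at the core of this family of models is simple to sketch: a random, fixed hidden layer followed by closed-form least-squares output weights. This toy regression example (assumed architecture and data, not the paper's stacked sELM-AE) shows the basic mechanics:

```python
import numpy as np

def elm_fit(X, y, n_hidden=60, seed=0):
    """Single-hidden-layer ELM: hidden weights are random and fixed;
    only the output weights are solved for, via the pseudo-inverse."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)          # random nonlinear feature map
    beta = np.linalg.pinv(H) @ y    # closed-form least-squares solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# toy check: the ELM fits a smooth 1-D function on its training points
X = np.linspace(-1, 1, 40).reshape(-1, 1)
y = np.sin(3 * X[:, 0])
W, b, beta = elm_fit(X, y)
err = np.max(np.abs(elm_predict(X, W, b, beta) - y))
print(err)
```

Because training reduces to a single pseudo-inverse, fitting is very fast; the stacked auto-encoder variants described above reuse this closed-form step layer by layer.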
Project description:Alzheimer's Disease (AD) and other neurodegenerative diseases affect over 20 million people worldwide, and this number is projected to increase significantly in the coming decades. Proposed imaging-based markers have shown steadily improving levels of sensitivity/specificity in classifying individual subjects as AD or normal. Several of these efforts have utilized statistical machine learning techniques, using brain images as input, as a means of deriving such AD-related markers. A common characteristic of this line of research is a focus on either (1) using a single imaging modality for classification, or (2) incorporating several modalities but reporting separate results for each. One strategy to improve on the success of these methods is to leverage all available imaging modalities together in a single automated learning framework. The rationale is that some subjects may show signs of pathology in one modality but not in another; by combining all available images, a clearer view of the progression of disease pathology will emerge. Our method is based on the Multi-Kernel Learning (MKL) framework, which allows the inclusion of an arbitrary number of views of the data in a maximum-margin kernel learning framework. The principal innovation behind MKL is that it learns an optimal combination of kernel (similarity) matrices while simultaneously training a classifier. In classification experiments, MKL outperformed an SVM trained on all available features by 3%-4%. We are especially interested in whether such markers are capable of identifying early signs of the disease. To address this question, we have examined whether our multi-modal disease marker (MMDM) can predict conversion from Mild Cognitive Impairment (MCI) to AD. Our experiments reveal that this measure shows significant group differences between MCI subjects who progressed to AD and those who remained stable for 3 years.
These differences were most significant in MMDMs based on imaging data. We also discuss the relationship between our MMDM and an individual's conversion from MCI to AD.
Project description:Identifying patients with mild cognitive impairment (MCI) who are at high risk of progressing to Alzheimer's disease (AD) is crucial for early treatment of AD. However, it is difficult to predict the cognitive states of patients. This study developed an extreme learning machine (ELM)-based grading method to efficiently fuse multimodal data and predict MCI-to-AD conversion. First, features were extracted from magnetic resonance (MR) images, and useful features were selected using a feature selection method. Second, multiple modalities of MCI subjects, including MRI, positron emission tomography, cerebrospinal fluid biomarkers, and gene data, were individually graded using the ELM method. Finally, these grading scores calculated from the different modalities were fed into a classifier to discriminate subjects with progressive MCI from those with stable MCI. The proposed approach has been validated on the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort, and an accuracy of 84.7% was achieved for AD prediction within 3 years. Experiments on predicting conversion from MCI to AD within different periods showed results similar to those of the 3-year prediction. The experimental results demonstrate that the proposed approach benefits from the efficient fusion of four modalities, resulting in an accurate prediction of MCI-to-AD conversion.
Project description:Neurodegenerative disorders, such as Alzheimer's disease, are associated with changes in multiple neuroimaging and biological measures. These may provide complementary information for diagnosis and prognosis. We present a multi-modality classification framework in which manifolds are constructed based on pairwise similarity measures derived from random forest classifiers. Similarities from multiple modalities are combined to generate an embedding that simultaneously encodes information about all the available features. Multi-modality classification is then performed using coordinates from this joint embedding. We evaluate the proposed framework by application to neuroimaging and biological data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Features include regional MRI volumes, voxel-based FDG-PET signal intensities, CSF biomarker measures, and categorical genetic information. Classification based on the joint embedding constructed using information from all four modalities outperforms the classification based on any individual modality for comparisons between Alzheimer's disease patients and healthy controls, as well as between mild cognitive impairment patients and healthy controls. Based on the joint embedding, we achieve classification accuracies of 89% between Alzheimer's disease patients and healthy controls, and 75% between mild cognitive impairment patients and healthy controls. These results are comparable with those reported in other recent studies using multi-kernel learning. Random forests provide consistent pairwise similarity measures for multiple modalities, thus facilitating the combination of different types of feature data. We demonstrate this by application to data in which the number of features differs by several orders of magnitude between modalities.
Random forest classifiers extend naturally to multi-class problems, and the framework described here could be applied to distinguish between multiple patient groups in the future.
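A rough sketch of the framework's core idea on synthetic data: per-modality random-forest proximities (the fraction of trees in which two subjects land in the same leaf) are averaged and then embedded jointly. The dataset sizes, the simple averaging, and the use of MDS for the embedding are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.manifold import MDS

def rf_proximity(X, y, seed=0):
    """Pairwise similarity: fraction of trees in which two samples share a leaf."""
    rf = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X, y)
    leaves = rf.apply(X)  # (n_samples, n_trees) leaf indices
    return (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

# two synthetic "modalities" whose feature counts differ by orders of magnitude,
# observed on the same subjects (hence the shared labels y)
X_img, y = make_classification(n_samples=50, n_features=200, random_state=0)
X_gen, _ = make_classification(n_samples=50, n_features=5, n_informative=5,
                               n_redundant=0, random_state=0)

# combine per-modality proximities, then build one joint embedding
P = 0.5 * (rf_proximity(X_img, y) + rf_proximity(X_gen, y))
D = 1.0 - P  # turn similarities into dissimilarities
emb = MDS(n_components=2, dissimilarity="precomputed",
          random_state=0).fit_transform(D)
print(emb.shape)
```

Because each modality contributes only an n-by-n proximity matrix, the embedding step never sees the raw features, which is what makes modalities with wildly different dimensionalities easy to combine.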