Information mapping with pattern classifiers: a comparative study.
ABSTRACT: Information mapping using pattern classifiers has become increasingly popular in recent years, although without a clear consensus on which classifier(s) ought to be used or how results should be tested. This paper addresses each of these questions, both analytically and through comparative analyses on five empirical datasets. We also describe how information maps in multiple class situations can provide information concerning the content of neural representations. Finally, we introduce a publicly available software toolbox designed specifically for information mapping.
Project description: The multi-voxel pattern analysis technique is applied to fMRI data for classification of high-level brain functions using pattern information distributed over multiple voxels. In this paper, we propose a classifier ensemble for multiclass classification in fMRI analysis, exploiting the fact that specific neighboring voxels can contain spatial pattern information. The proposed method converts the multiclass classification to a pairwise classifier ensemble, and each pairwise classifier consists of multiple sub-classifiers using an adaptive feature set for each class-pair. Simulated and real fMRI data were used to verify the proposed method. Intra- and inter-subject analyses were performed to compare the proposed method with several well-known classifiers, including single and ensemble classifiers. The comparison results showed that the proposed method can be generally applied to multiclass classification in both simulations and real fMRI analyses.
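As a rough illustration of how a multiclass decision can be assembled from pairwise classifiers, the following minimal sketch trains one nearest-centroid classifier per class pair and combines them by majority vote. The nearest-centroid rule, the per-pair feature indices (`pair_features`), and the toy data are assumptions made for illustration only. They stand in for the sub-classifiers and adaptive feature sets of the proposed method, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def centroid(rows, features):
    """Mean of the selected feature columns over a list of samples."""
    return [sum(r[f] for r in rows) / len(rows) for f in features]


def sq_dist(x, c, features):
    """Squared Euclidean distance restricted to the selected features."""
    return sum((x[f] - cj) ** 2 for f, cj in zip(features, c))


def train_pairwise(X, y, pair_features):
    """Fit one nearest-centroid classifier per class pair.

    pair_features maps (class_a, class_b) -> list of feature indices;
    this per-pair feature set is a hypothetical stand-in.
    """
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        feats = pair_features[(a, b)]
        rows_a = [x for x, label in zip(X, y) if label == a]
        rows_b = [x for x, label in zip(X, y) if label == b]
        models[(a, b)] = (feats, centroid(rows_a, feats), centroid(rows_b, feats))
    return models


def predict(models, x):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter()
    for (a, b), (feats, ca, cb) in models.items():
        winner = a if sq_dist(x, ca, feats) <= sq_dist(x, cb, feats) else b
        votes[winner] += 1
    return votes.most_common(1)[0][0]


# Toy data: three classes, three "voxels" per sample (made-up values).
X = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0],
     [0.0, 1.0, 0.0], [0.1, 0.9, 0.0],
     [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
y = ["A", "A", "B", "B", "C", "C"]
pair_features = {("A", "B"): [0, 1], ("A", "C"): [0, 2], ("B", "C"): [1, 2]}

models = train_pairwise(X, y, pair_features)
print(predict(models, [0.8, 0.2, 0.1]))
```

With three classes there are three pairwise classifiers, so a class that wins both of its pairings collects two of the three votes.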
Project description: Many bioinformatics applications construct classifiers that are validated in experiments that compare their results to known ground truth over a corpus. In this paper, we introduce an approach for exploring the results of such classifier validation experiments, focusing on classifiers for regions of molecular surfaces. We provide a tool that allows for examining classification performance patterns over a test corpus. The approach combines a summary view that provides information about an entire corpus of molecules with a detail view that visualizes classifier results directly on protein surfaces. Rather than displaying miniature 3D views of each molecule, the summary provides 2D glyphs of each protein surface arranged in a reorderable, small-multiples grid. Each summary is specifically designed to support visual aggregation, so that the viewer can get a sense of both the aggregate properties and the details that form them. The detail view provides a 3D visualization of each protein surface coupled with interaction techniques designed to support key tasks, including spatial aggregation and automated camera touring. A prototype implementation of our approach is demonstrated on protein surface classifier experiments.
Project description: Swimming is a form of physical activity and a life-saving skill. However, only a few studies have identified swimming ability classifiers in preschool children. This pilot cross-sectional study aimed to find anthropometric (AM) and motor competence (MC) predictors of swimming ability in preschool children, by building classifiers of swimming ability group (SAG) membership. We recruited 92 children (girls n = 45) aged 5-6 years and took the AM and MC measurements in accordance with the reference manual and using the KTK battery test (motor quotient, MQ), respectively. A linear discriminant analysis tested a classification model of preschoolers' swimming ability (SAG: POOR, GOOD, EXCELLENT) based on gender, age, AM, and MC variables and extracted one significant canonical discriminant function (model fit: 61.2%) that can differentiate (group centroids) POOR (-1.507), GOOD (0.032), and EXCELLENT (1.524). The MQ total was identified as a significant classifier, which absolutely contributed to the discriminant function that classifies children's swimming ability as POOR (standardized canonical coefficient: 1.186), GOOD (1.363), or EXCELLENT (1.535) with an accuracy of 64.1%. Children with higher MQ total ought to be classified into higher SAG; thus, the classification model of SAG based on the MQ total is presented.
Project description: BACKGROUND: Development and application of transcriptomics-based gene classifiers for ecotoxicological applications lag far behind those of biomedical sciences. Many such classifiers discovered thus far lack rigorous statistical and experimental validation. A combination of genetic algorithm/support vector machines and genetic algorithm/K nearest neighbors was used in this study to search for classifiers of endocrine-disrupting chemicals (EDCs) in zebrafish. Searches were conducted on both tissue-specific and tissue-combined datasets, either across the entire transcriptome or within individual transcription factor (TF) networks previously linked to EDC effects. Candidate classifiers were evaluated by gene set enrichment analysis (GSEA) on both the original training data and a dedicated validation dataset. RESULTS: The multi-tissue dataset yielded no classifiers. Among the 19 chemical-tissue conditions evaluated, the transcriptome-wide searches yielded classifiers for six of them, each having approximately 20 to 30 gene features unique to a condition. Searches within individual TF networks produced classifiers for 15 chemical-tissue conditions, each containing 100 or fewer top-ranked gene features pooled from those of multiple TF networks and also unique to each condition. For the training dataset, 10 out of 11 classifiers successfully identified the gene expression profiles (GEPs) of their targeted chemical-tissue conditions by GSEA. For the validation dataset, classifiers for prochloraz-ovary and flutamide-ovary also correctly identified the GEPs of corresponding conditions while no classifier could predict the GEP from prochloraz-brain. CONCLUSIONS: The discrepancies in the performance of these classifiers were attributed in part to varying data complexity among the conditions, as measured to some degree by Fisher's discriminant ratio statistic. This variation in data complexity could likely be compensated for by adjusting sample size for individual chemical-tissue conditions, thus suggesting a need for a preliminary survey of transcriptomic responses before launching a full-scale classifier discovery effort. Classifier discovery based on individual TF networks could yield more mechanistically oriented biomarkers. GSEA proved to be a flexible and effective tool for application of gene classifiers, but a similar and more refined algorithm, connectivity mapping, should also be explored. The distribution characteristics of classifiers across tissues, chemicals, and TF networks suggested a differential biological impact among the EDCs on the zebrafish transcriptome involving some basic cellular functions.
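For reference, Fisher's discriminant ratio for a single feature and two classes is commonly defined as (m1 - m2)^2 / (v1 + v2), where m and v are the class means and variances, and the maximum over features is often used as a dataset-level complexity measure. The sketch below computes that common form on made-up expression values. The function names and data are illustrative assumptions, and the abstract does not state which variant of the statistic the study used.

```python
from statistics import mean, pvariance


def fisher_ratio(values_a, values_b):
    """Fisher's discriminant ratio for one feature and two classes."""
    spread = pvariance(values_a) + pvariance(values_b)
    if spread == 0:
        # No within-class spread: perfectly separable or identical.
        return float("inf") if mean(values_a) != mean(values_b) else 0.0
    return (mean(values_a) - mean(values_b)) ** 2 / spread


def max_fisher_ratio(class_a, class_b):
    """Largest per-feature ratio; a higher value suggests an easier problem."""
    n_features = len(class_a[0])
    return max(
        fisher_ratio([row[j] for row in class_a], [row[j] for row in class_b])
        for j in range(n_features)
    )


# Hypothetical expression values: rows are samples, columns are genes.
treated = [[5.0, 1.0], [6.0, 2.0], [7.0, 3.0]]
control = [[1.0, 1.5], [2.0, 2.5], [3.0, 3.5]]

print(max_fisher_ratio(treated, control))
```

In this toy example the first gene separates the two groups far better than the second, so it determines the dataset-level value.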
Project description: For medical classification problems, it is often desirable to have a probability associated with each class. Probabilistic classifiers have received relatively little attention for small n large p classification problems despite their importance in medical decision making. In this paper, we introduce 2 criteria for assessment of probabilistic classifiers, well-calibratedness and refinement, and develop corresponding evaluation measures. We evaluated several published high-dimensional probabilistic classifiers and developed 2 extensions of the Bayesian compound covariate classifier. Based on simulation studies and analysis of gene expression microarray data, we found that proper probabilistic classification is more difficult than deterministic classification. It is important to ensure that a probabilistic classifier is well calibrated or at least not "anticonservative" using the methods developed here. We provide this evaluation for several probabilistic classifiers and also evaluate their refinement as a function of sample size under weak and strong signal conditions. We also present a cross-validation method for evaluating the calibration and refinement of any probabilistic classifier on any data set.
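To make the notion of calibration concrete, the sketch below bins predicted probabilities and compares the mean prediction in each bin with the observed event rate. The binning scheme, the function names, and the toy predictions are assumptions for illustration. They are not the evaluation measures developed in the paper.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """Group predictions into equal-width bins and compare the mean
    predicted probability with the observed event rate in each bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    table = []
    for members in bins:
        if members:  # skip empty bins
            mean_pred = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            table.append((len(members), mean_pred, observed))
    return table


def calibration_gap(probs, outcomes, n_bins=5):
    """Sample-weighted mean absolute gap between predicted and observed."""
    table = calibration_table(probs, outcomes, n_bins)
    total = sum(n for n, _, _ in table)
    return sum(n * abs(pred - obs) for n, pred, obs in table) / total


# Hypothetical predictions: the classifier says 0.9 but the event occurs
# 3 times in 5, and says 0.1 while the event occurs 2 times in 5.
probs = [0.9, 0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1]
outcomes = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

for n, pred, obs in calibration_table(probs, outcomes):
    print(f"n={n} mean predicted={pred:.2f} observed={obs:.2f}")
print(f"calibration gap: {calibration_gap(probs, outcomes):.2f}")
```

In this toy case the predictions are more extreme than the observed frequencies, which appears to be the kind of overconfidence the abstract calls "anticonservative".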
Project description: The number of mutated genes in cancer cells is far larger than the number of mutations that drive cancer. The difficulty this creates for identifying relevant alterations has stimulated the development of various computational approaches to distinguishing drivers from bystanders. We develop an ensemble classifier (EC) machine learning method, which integrates 10 classifiers that are publicly available, and apply it to breast and ovarian cancer. In particular we find the following: (1) Using both standard and non-standard metrics, EC almost always outperforms single method classifiers, often by wide margins. (2) Of the 50 highest ranked genes for breast (ovarian) cancer, 34 (30) are associated with other cancers in either the OMIM, CGC or NCG database (P < 10^-22). (3) Another 10, for both breast and ovarian cancer, have been identified by GWAS studies. (4) Several of the remaining genes--including a protein kinase that regulates the Fra-1 transcription factor which is overexpressed in ER negative breast cancer cells; and Fyn, which is overexpressed in pancreatic and prostate cancer, among others--are biologically plausible. Biological implications are briefly discussed. Source codes and detailed results are available at http://www.visantnet.org/misi/driver_integration.zip.
Project description: Proposed molecular classifiers may be overfit to idiosyncrasies of noisy genomic and proteomic data. Cross-validation methods are often used to obtain estimates of classification accuracy, but both simulations and case studies suggest that, when inappropriate methods are used, bias may ensue. Bias can be bypassed and generalizability can be tested by external (independent) validation. We evaluated 35 studies that have reported on external validation of a molecular classifier. We extracted information on study design and methodological features, and compared the performance of molecular classifiers in internal cross-validation versus external validation for 28 studies where both had been performed. We demonstrate that the majority of studies pursued cross-validation practices that are likely to overestimate classifier performance. Most studies were markedly underpowered to detect a 20% decrease in sensitivity or specificity between internal cross-validation and external validation [median power was 36% (IQR, 21-61%) and 29% (IQR, 15-65%), respectively]. The median reported classification performance for sensitivity and specificity was 94% and 98%, respectively, in cross-validation and 88% and 81% for independent validation. The relative diagnostic odds ratio was 3.26 (95% CI 2.04-5.21) for cross-validation versus independent validation. Finally, we reviewed all studies (n = 758) which cited those in our study sample, and identified only one instance of additional subsequent independent validation of these classifiers. In conclusion, these results document that many cross-validation practices employed in the literature are potentially biased and genuine progress in this field will require adoption of routine external validation of molecular classifiers, preferably in much larger studies than in current practice.
Project description: Intracortical brain-computer interface (BCI) decoders are typically retrained daily to maintain stable performance. Self-recalibrating decoders aim to remove the burden this may present in the clinic by training themselves autonomously during normal use but have only been developed for continuous control. Here we address the problem for discrete decoding (classifiers). We recorded threshold crossings from 96-electrode arrays implanted in the motor cortex of two rhesus macaques performing center-out reaches in 7 directions over 41 and 36 separate days spanning 48 and 58 days in total for offline analysis. We show that for the purposes of developing a self-recalibrating classifier, tuning parameters can be considered as fixed within days and that parameters on the same electrode move up and down together between days. Further, drift is constrained across time, which is reflected in the performance of a standard classifier, which does not progressively worsen if it is not retrained daily, though overall performance is reduced by more than 10% compared to a daily retrained classifier. Two novel self-recalibrating classifiers produce a ~15% increase in classification accuracy over that achieved by the non-retrained classifier to nearly recover the performance of the daily retrained classifier. We believe that the development of classifiers that require no daily retraining will accelerate the clinical translation of BCI systems. Future work should test these results in a closed-loop setting.
Project description: Selecting an appropriate classifier for a particular biological application poses a difficult problem for researchers and practitioners alike. In particular, choosing a classifier depends heavily on the features selected. For high-throughput biomedical datasets, feature selection is often a preprocessing step that gives an unfair advantage to the classifiers built with the same modeling assumptions. In this paper, we seek classifiers that are suitable to a particular problem independent of feature selection. We propose a novel measure, called "win percentage", for assessing the suitability of machine classifiers to a particular problem. We define win percentage as the probability a classifier will perform better than its peers on a finite random sample of feature sets, giving each classifier equal opportunity to find suitable features. First, we illustrate the difficulty in evaluating classifiers after feature selection. We show that several classifiers can each perform statistically significantly better than their peers given the right feature set among the top 0.001% of all feature sets. We illustrate the utility of win percentage using synthetic data, and evaluate six classifiers in analyzing eight microarray datasets representing three diseases: breast cancer, multiple myeloma, and neuroblastoma. After initially using all Gaussian gene-pairs, we show that precise estimates of win percentage (within 1%) can be achieved using a smaller random sample of all feature pairs. We show that for these data no single classifier can be considered the best without knowing the feature set. Instead, win percentage captures the non-zero probability that each classifier will outperform its peers based on an empirical estimate of performance. Fundamentally, we illustrate that the selection of the most suitable classifier (i.e., one that is more likely to perform better than its peers) not only depends on the dataset and application but also on the thoroughness of feature selection. In particular, win percentage provides a single measurement that could assist users in eliminating or selecting classifiers for their particular application.
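The definition of win percentage given above can be sketched as a simple counting procedure: for each sampled feature set, find the classifier that strictly outperforms all others, then report each classifier's share of wins. The accuracy values, the classifier names (`clf_a`, `clf_b`, `clf_c`), the gene labels, and the handling of ties are all hypothetical choices for this sketch. The paper's own estimation procedure may differ.

```python
import random
from collections import Counter


def win_percentage(scores, feature_sets):
    """Fraction of feature sets on which each classifier strictly beats
    all of its peers. Ties award no win (an assumption of this sketch)."""
    wins = Counter()
    for fs in feature_sets:
        ranked = sorted(scores[fs].items(), key=lambda kv: kv[1], reverse=True)
        if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
            wins[ranked[0][0]] += 1
    names = {name for per_set in scores.values() for name in per_set}
    return {name: wins[name] / len(feature_sets) for name in sorted(names)}


# Hypothetical accuracies of three classifiers on five gene pairs.
scores = {
    ("g1", "g2"): {"clf_a": 0.80, "clf_b": 0.75, "clf_c": 0.78},
    ("g1", "g3"): {"clf_a": 0.70, "clf_b": 0.82, "clf_c": 0.79},
    ("g2", "g3"): {"clf_a": 0.66, "clf_b": 0.66, "clf_c": 0.60},
    ("g2", "g4"): {"clf_a": 0.71, "clf_b": 0.69, "clf_c": 0.90},
    ("g3", "g4"): {"clf_a": 0.72, "clf_b": 0.74, "clf_c": 0.85},
}

# Exhaustive estimate over every feature pair.
print(win_percentage(scores, list(scores)))

# Estimate from a random sample of feature pairs (seeded for repeatability).
sample = random.Random(0).sample(list(scores), k=3)
print(win_percentage(scores, sample))
```

Every classifier in the toy table wins on at least one feature pair, which mirrors the abstract's point that no single classifier can be called best without knowing the feature set.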
Project description: Experimental examinations of biofluids to measure concentrations of proteins or their fragments or metabolites are being explored as a means of early disease detection, distinguishing diseases with similar symptoms, and assessing drug treatment efficacy. Many studies have produced classifiers with a high sensitivity and specificity, and it has been argued that accurate results necessarily imply some underlying biology-based features in the classifier. The simplest test of this conjecture is to examine datasets designed to contain no information with classifiers used in many published studies. The classification accuracy of two fingerprint-based classifiers, a decision tree (DT) algorithm and a medoid classification algorithm (MCA), is examined. These methods are used to examine 30 artificial datasets that contain random concentration levels for 300 biomolecules. Each dataset contains between 30 and 300 Cases and Controls, and since the 300 observed concentrations are randomly generated, these datasets are constructed to contain no biological information. A modest search of decision trees containing at most seven decision nodes finds a large number of unique decision trees with an average sensitivity and specificity above 85% for datasets containing 60 Cases and 60 Controls or less, and for datasets with 90 Cases and 90 Controls many DTs have an average sensitivity and specificity above 80%. For even the largest dataset (300 Cases and 300 Controls) the MCA procedure finds several unique classifiers that have an average sensitivity and specificity above 88% using only six or seven features. While it has been argued that accurate classification results must imply some biological basis for the separation of Cases from Controls, our results show that this is not necessarily true. The DT and MCA classifiers are sufficiently flexible and can produce good results from datasets that are specifically constructed to contain no information. This means that a chance fitting to the data is possible. All datasets used in this investigation are available on the web.
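The chance-fitting effect described above can be reproduced in miniature. The sketch below draws Cases and Controls from the same random distribution, searches exhaustively over single-feature threshold rules, and then applies the best rule to fresh random data. The threshold rule is a deliberately simpler stand-in for the DT and MCA classifiers of the study, and the sample sizes, seed, and function names are assumptions for illustration.

```python
import random


def make_random_dataset(rng, n_per_class, n_features):
    """Cases and controls drawn from the same distribution: no real signal."""
    X = [[rng.random() for _ in range(n_features)] for _ in range(2 * n_per_class)]
    y = [1] * n_per_class + [0] * n_per_class
    return X, y


def stump_accuracy(X, y, feature, threshold, sign):
    """Accuracy of the rule 'predict case if sign * (x - threshold) > 0'."""
    hits = sum(int((sign * (x[feature] - threshold) > 0) == bool(label))
               for x, label in zip(X, y))
    return hits / len(y)


def best_stump(X, y):
    """Exhaustively search every feature, observed threshold, and direction."""
    best = (0.0, None)
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for sign in (1, -1):
                acc = stump_accuracy(X, y, feature, threshold, sign)
                if acc > best[0]:
                    best = (acc, (feature, threshold, sign))
    return best


rng = random.Random(0)
X_train, y_train = make_random_dataset(rng, n_per_class=30, n_features=300)
train_acc, rule = best_stump(X_train, y_train)

# The same rule applied to fresh random data should hover near chance.
X_new, y_new = make_random_dataset(rng, n_per_class=30, n_features=300)
new_acc = stump_accuracy(X_new, y_new, *rule)

print(f"best apparent accuracy on random data: {train_acc:.2f}")
print(f"accuracy of the same rule on new data: {new_acc:.2f}")
```

Because the search tries many rules against few samples, the selected rule scores above 50% on the data used to find it even though the labels carry no information. Evaluating it on independent data is what exposes the fit as chance.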