Small sample sizes reduce the replicability of task-based fMRI studies.
ABSTRACT: Despite a growing body of research suggesting that task-based functional magnetic resonance imaging (fMRI) studies often suffer from a lack of statistical power due to too-small samples, the proliferation of such underpowered studies continues unabated. Using large independent samples across eleven tasks, we demonstrate the impact of sample size on replicability, assessed at different levels of analysis relevant to fMRI researchers. We find that the degree of replicability for typical sample sizes is modest and that sample sizes much larger than typical (e.g., N = 100) produce results that fall well short of perfectly replicable. Thus, our results join the existing line of work advocating for larger sample sizes. Moreover, because we test sample sizes over a fairly large range and use intuitive metrics of replicability, our hope is that our results are more understandable and convincing to researchers who may have found previous results advocating for larger samples inaccessible.
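To illustrate why replicability rises with sample size, here is a minimal simulation sketch. It is not the authors' pipeline: the choice of metric (Pearson correlation between two group-level maps), the number of "voxels", the spread of true effects, and all function names are assumptions made for the example.

```python
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def group_map(true_effects, n_subjects, rng):
    """Average each voxel over n_subjects simulated subjects (unit-variance noise)."""
    return [
        sum(rng.gauss(mu, 1.0) for _ in range(n_subjects)) / n_subjects
        for mu in true_effects
    ]


def map_replicability(n_subjects, n_voxels=200, effect_sd=0.3, seed=0):
    """Correlate group maps from two independent samples of the same size."""
    rng = random.Random(seed)
    # Hypothetical true effect per voxel; effect_sd is an assumed value.
    true_effects = [rng.gauss(0.0, effect_sd) for _ in range(n_voxels)]
    first = group_map(true_effects, n_subjects, rng)
    second = group_map(true_effects, n_subjects, rng)
    return pearson(first, second)


if __name__ == "__main__":
    for n in (16, 25, 50, 100):
        print(n, round(map_replicability(n), 2))
```

Under these assumptions the map correlation grows with N but stays below 1 even at N = 100, which mirrors the qualitative pattern the abstract describes.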
Project description:Replicability is an important feature of scientific research, but aspects of contemporary research culture, such as an emphasis on novelty, can make replicability seem less important than it should be. The Reproducibility Project: Cancer Biology was set up to provide evidence about the replicability of preclinical research in cancer biology by repeating selected experiments from high-impact papers. A total of 50 experiments from 23 papers were repeated, generating data about the replicability of a total of 158 effects. Most of the original effects were positive effects (136), with the rest being null effects (22). A majority of the original effect sizes were reported as numerical values (117), with the rest being reported as representative images (41). We employed seven methods to assess replicability, and some of these methods were not suitable for all the effects in our sample. One method compared effect sizes: for positive effects, the median effect size in the replications was 85% smaller than the median effect size in the original experiments, and 92% of replication effect sizes were smaller than the original. The other methods were binary - the replication was either a success or a failure - and five of these methods could be used to assess both positive and null effects when effect sizes were reported as numerical values. For positive effects, 40% of replications (39/97) succeeded according to three or more of these five methods, and for null effects 80% of replications (12/15) were successful on this basis; combining positive and null effects, the success rate was 46% (51/112). A successful replication does not definitively confirm an original finding or its theoretical interpretation. Equally, a failure to replicate does not disconfirm a finding, but it does suggest that additional investigation is needed to establish its reliability.
Project description:Introduction: Reproducibility and replicability of results are rarely achieved for digital biomarker analyses. We reproduced and replicated previously reported sample size estimates based on digital biomarker and neuropsychological test outcomes in a hypothetical 4-year early-phase Alzheimer's disease trial. Methods: Original data and newly collected data (using a different motion sensor) came from the Oregon Center for Aging & Technology (ORCATECH). Assuming that the trajectories of those with incident mild cognitive impairment and those with normal cognition represent the trajectories of the control and experimental groups in a hypothetical trial, we calculated the sample sizes needed to provide 80% power to detect effect sizes ranging from 20% to 50%. Results: For reproducibility, identical P-values and slope estimates were found for both digital biomarkers and neuropsychological test measures between the previous and current studies. For replicability, the correlation between original and replicated sample size estimates was greater for digital biomarkers (r = 0.87, P < .001) than for neuropsychological test outcomes (r = 0.75, P < .001). Discussion: Reproducibility and replicability of digital biomarker analyses are feasible and encouraged to establish the reliability of findings.
Project description:Because understanding neural vulnerability factors that predict future weight gain may guide the design of more effective obesity prevention programs and treatments, we tested whether neural response to palatable food tastes and images predicted future weight gain. We recruited 135 initially healthy weight adolescents, to reduce the possibility that a history of overeating affected neural responsivity, had them complete fMRI paradigms examining neural response to tastes of milkshakes that varied in fat and sugar content and images of palatable foods, and assessed BMI annually over a 3-year follow-up. We used a novel bootstrapping analytic approach to investigate the replicability of the fMRI findings. Whole-brain analyses indicated that lower response in the pre-supplemental motor area to high-fat/low-sugar milkshake taste predicted future BMI gain in the full sample and in 5 out of the 10 bootstrap samples. Elevated response in the precentral gyrus/Rolandic operculum to images of appetizing foods predicted future BMI gain in the full sample and in 4 out of the 10 bootstrap samples. Other peaks that emerged in the full sample did not replicate in most of the bootstrap samples, suggesting they were not reliable. Region of interest analyses did not replicate the predictive effects of peaks reported in past papers that used similar paradigms, including the evidence that TaqIA polymorphism moderated the relation of striatal response to palatable food tastes to future weight gain. Results suggest that lower responsivity of a region implicated in motor processing in response to palatable taste was associated with greater BMI gain over time, and further that bootstrap sampling may be useful for estimating the replicability of findings that emerge from whole brain analyses or regions of interest analyses with the full sample.
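The bootstrap idea described above can be sketched as follows. This is an illustration under stated assumptions, not the study's whole-brain procedure: it tests a single brain-behavior correlation, uses a Fisher z approximation for significance, and the function names and thresholds are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def is_significant(x, y, z_crit=1.96):
    """Two-sided test of a correlation via the Fisher z approximation."""
    z = math.atanh(pearson(x, y)) * math.sqrt(len(x) - 3)
    return abs(z) > z_crit


def bootstrap_replication_count(x, y, n_boot=10, seed=0):
    """Count the bootstrap samples in which the effect is still significant."""
    rng = random.Random(seed)
    n = len(x)
    count = 0
    for _ in range(n_boot):
        # Resample participants with replacement, keeping x and y paired.
        idx = [rng.randrange(n) for _ in range(n)]
        if is_significant([x[i] for i in idx], [y[i] for i in idx]):
            count += 1
    return count


if __name__ == "__main__":
    data_rng = random.Random(1)
    response = [data_rng.gauss(0, 1) for _ in range(135)]  # simulated neural response
    bmi_gain = [0.3 * r + data_rng.gauss(0, 1) for r in response]  # simulated outcome
    print(bootstrap_replication_count(response, bmi_gain), "of 10 bootstrap samples")
```

An effect that reaches significance in only about half of the resamples, as reported for the two peaks above, would be read as only moderately stable.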
Project description:Linking interindividual differences in psychological phenotype to variations in brain structure is an old dream for psychology and a crucial question for cognitive neurosciences. Yet the replicability of previously reported 'structural brain behavior' (SBB) associations has recently been questioned. Here, we conducted an empirical investigation assessing the replicability of SBB associations among healthy adults. For a wide range of psychological measures, the replicability of associations with gray matter volume was assessed. Our results revealed that among healthy individuals 1) finding an association between performance on standard psychological tests and brain morphology is relatively unlikely, 2) significant associations found using an exploratory approach have overestimated effect sizes, and 3) these associations can hardly be replicated in an independent sample. After considering factors such as sample size and comparing our findings with more replicable SBB associations in a clinical cohort and replicable associations between brain structure and non-psychological phenotype, we discuss the potential causes and consequences of these findings.
Project description:Previous work suggests that key factors for replicability, a necessary feature for theory building, include statistical power and appropriate research planning. These factors are examined by analyzing a collection of 12 standardized meta-analyses on language development between birth and 5 years. With a median effect size of Cohen's d = .45 and typical sample size of 18 participants, most research is underpowered (range = 6%-99%; median = 44%); and calculating power based on seminal publications is not a suitable strategy. Method choice can be improved, as shown in analyses on exclusion rates and effect size as a function of method. The article ends with a discussion on how to increase replicability in both language acquisition studies specifically and developmental research more generally.
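As a rough worked example of the power figures above, the sketch below computes approximate power for the median effect size and typical sample size. It assumes a one-sample (or paired) design and a normal approximation to the t-test, neither of which is stated in the abstract, so the result is indicative only and need not match the reported median of 44%.

```python
import math
from statistics import NormalDist


def approx_power(d, n, alpha=0.05):
    """Approximate two-sided power for a one-sample design with n participants.

    Uses the normal approximation: the test statistic is treated as
    Normal(d * sqrt(n), 1) and compared against the critical z value.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = d * math.sqrt(n)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Median effect size and typical sample size quoted in the abstract.
    print(round(approx_power(0.45, 18), 2))
    # How power changes if the same effect is studied with more participants.
    for n in (18, 36, 72):
        print(n, round(approx_power(0.45, n), 2))
```

With d = .45 and 18 participants, this approximation gives power close to one half, in line with the conclusion that most studies in the collection are underpowered.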
Project description:Genome-wide association meta-analysis (GWAMA) is an effective approach to enlarge sample sizes and empower the discovery of novel associations between genotype and phenotype. Independent replication has been used as a gold-standard for validating genetic associations. However, as current GWAMA often seeks to aggregate all available datasets, it becomes impossible to find a large enough independent dataset to replicate new discoveries. Here we introduce a method, MAMBA (Meta-Analysis Model-based Assessment of replicability), for assessing the "posterior-probability-of-replicability" for identified associations by leveraging the strength and consistency of association signals between contributing studies. We demonstrate using simulations that MAMBA is more powerful and robust than existing methods, and produces more accurate genetic effects estimates. We apply MAMBA to a large-scale meta-analysis of addiction phenotypes with 1.2 million individuals. In addition to accurately identifying replicable common variant associations, MAMBA also pinpoints novel replicable rare variant associations from imputation-based GWAMA and hence greatly expands the set of analyzable variants.
Project description:We measure how accurately replication of experimental results can be predicted by black-box statistical models. With data from four large-scale replication projects in experimental psychology and economics, and techniques from machine learning, we train predictive models and study which variables drive predictable replication. The models predict binary replication with a cross-validated accuracy rate of 70% (AUC of 0.77) and estimates of relative effect sizes with a Spearman ρ of 0.38. The accuracy level is similar to market-aggregated beliefs of peer scientists [1, 2]. The predictive power is validated in a pre-registered out-of-sample test of the outcome of , where 71% (AUC of 0.73) of replications are predicted correctly and effect size correlations amount to ρ = 0.25. Basic features such as the sample and effect sizes in original papers, and whether reported effects are single-variable main effects or two-variable interactions, are predictive of successful replication. The models presented in this paper are simple tools to produce cheap, prognostic replicability metrics. These models could be useful in institutionalizing the process of evaluation of new findings and guiding resources to those direct replications that are likely to be most informative.
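For readers unfamiliar with the AUC values quoted above, the following sketch shows one standard way to compute it: as the fraction of (replicated, non-replicated) pairs that a model ranks in the right order. The function name and the toy scores are hypothetical and are not taken from the paper.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probability that a study replicates.
    labels: 1 if the study replicated, 0 otherwise.
    Ties between a positive and a negative score count as half correct.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    correct = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy predictions for four studies; the first two replicated.
    print(auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0]))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so the reported values of 0.77 and 0.73 indicate clearly better than chance but far from perfect prediction.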
Project description:Genome-wide association studies (GWAS) have detected many disease associations. However, the reported variants tend to explain small fractions of risk, and there are doubts about issues such as the portability of findings over different ethnic groups or the relative roles of rare versus common variants in the genetic architecture of complex disease. Studying the degree of sharing of disease-associated variants across populations can help in solving these issues. We present a comprehensive survey of GWAS replicability across 28 diseases. Most loci and SNPs discovered in Europeans for these conditions have been extensively replicated using peoples of European and East Asian ancestry, while the replication with individuals of African ancestry is much less common. We found a strong and significant correlation of Odds Ratios across Europeans and East Asians, indicating that underlying causal variants are common and shared between the two ancestries. Moreover, SNPs that failed to replicate in East Asians map into genomic regions where Linkage Disequilibrium patterns differ significantly between populations. Finally, we observed that GWAS with larger sample sizes have detected variants with weaker effects rather than with lower frequencies. Our results indicate that most GWAS results are due to common variants. In addition, the sharing of disease alleles and the high correlation in their effect sizes suggest that most of the underlying causal variants are shared between Europeans and East Asians and that they tend to map close to the associated marker SNPs.
Project description:Background: Marked sex differences in autism prevalence accentuate the need to understand the role of biological sex-related factors in autism. Efforts to unravel sex differences in the brain organization of autism have, however, been challenged by the limited availability of female data. Methods: We addressed this gap by using a large sample of males and females with autism and neurotypical (NT) control individuals (ABIDE; Autism: 362 males, 82 females; NT: 409 males, 166 females; 7-18 years). Discovery analyses examined main effects of diagnosis, sex and their interaction across five resting-state fMRI (R-fMRI) metrics (voxel-level Z > 3.1, cluster-level P < 0.01, Gaussian random field corrected). Secondary analyses assessed the robustness of the results to different pre-processing approaches and their replicability in two independent samples: the EU-AIMS Longitudinal European Autism Project (LEAP) and the Gender Explorations of Neurogenetics and Development to Advance Autism Research. Results: Discovery analyses in ABIDE revealed significant main effects of diagnosis and sex across the intrinsic functional connectivity of the posterior cingulate cortex, regional homogeneity and voxel-mirrored homotopic connectivity (VMHC) in several cortical regions, largely converging in the default network midline. Sex-by-diagnosis interactions were confined to the dorsolateral occipital cortex, with reduced VMHC in females with autism. All findings were robust to different pre-processing steps. Replicability in independent samples varied by R-fMRI measure and effect; the targeted sex-by-diagnosis interaction was replicated in the larger of the two replication samples, EU-AIMS LEAP. Limitations: Given the lack of a priori harmonization among the discovery and replication datasets available to date, sample-related variation remained and may have affected replicability. Conclusions: Atypical cross-hemispheric interactions are neurobiologically relevant to autism. They likely result from the combination of sex-dependent and sex-independent factors with a differential effect across functional cortical networks. Systematic assessments of the factors contributing to replicability are needed and necessitate coordinated large-scale data collection across studies.