Project description: Gene expression profiles of clinical cohorts can be used to identify genes that are correlated with a clinical variable of interest such as patient outcome or response to a particular drug. However, expression measurements are susceptible to technical bias caused by variation in extraneous factors such as RNA quality and array hybridization conditions. If such technical bias is correlated with the clinical variable of interest, the likelihood of identifying false positive genes is increased. Here we describe a method to visualize an expression matrix as a projection of all genes onto a plane defined by a clinical variable and a technical nuisance variable. The resulting plot indicates the extent to which each gene is correlated with the clinical variable or the technical variable. We demonstrate this method by applying it to three clinical trial microarray data sets, one of which identified genes that may have been driven by a confounding technical variable. This approach can be used as a quality control step to identify data sets that are likely to yield false positive results.
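The plane-projection idea can be sketched in a few lines: each gene's coordinates on the plane are its correlations with the clinical variable and with the technical nuisance variable. The following numpy sketch uses toy data and hypothetical names; it illustrates the idea, not the authors' implementation.

```python
import numpy as np

def gene_plane_coordinates(expr, clinical, technical):
    """Place each gene (row of `expr`) on a plane whose axes are its
    Pearson correlation with a clinical variable and with a technical
    nuisance variable.

    expr      : (genes x samples) expression matrix
    clinical  : (samples,) clinical variable of interest
    technical : (samples,) technical nuisance variable
    Returns a (genes x 2) array of (r_clinical, r_technical)."""
    def corr_with(v):
        vz = (v - v.mean()) / v.std()
        ez = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)
        return ez @ vz / len(v)   # per-gene Pearson correlation
    return np.column_stack([corr_with(np.asarray(clinical, float)),
                            corr_with(np.asarray(technical, float))])

# Toy example: 3 genes, 50 samples.
rng = np.random.default_rng(0)
clinical = rng.normal(size=50)
technical = rng.normal(size=50)
expr = np.vstack([
    2 * clinical + rng.normal(size=50),    # gene driven by the clinical variable
    2 * technical + rng.normal(size=50),   # gene driven by the technical variable
    rng.normal(size=50),                   # noise gene
])
coords = gene_plane_coordinates(expr, clinical, technical)
print(np.round(coords, 2))
```

Plotting `coords` as a scatter (clinical correlation on one axis, technical correlation on the other) gives the diagnostic view described above: genes tracking the technical variable fall along the nuisance axis.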
Project description: Background: Adjusting for multiple biases usually involves adjusting for one bias at a time, with careful attention to the order in which these biases are adjusted. A novel, alternative approach to multiple-bias adjustment involves the simultaneous adjustment of all biases via imputation and/or regression weighting. The imputed value or weight corresponds to the probability of the missing data and serves to 'reconstruct' the unbiased data that would be observed under the assumed degree of bias. Methods: We motivate and describe the steps necessary to implement this method. We also demonstrate its validity through a simulation study with an exposure-outcome relationship that is biased by uncontrolled confounding, exposure misclassification, and selection bias. Results: The study revealed that an unbiased effect estimate can be obtained when the correct bias parameters are applied. It also found that misspecifying every bias parameter by ±25% still produced an effect estimate with less bias than the observed, biased effect. Conclusions: Simultaneous multi-bias analysis is a useful way of investigating and understanding how multiple sources of bias may affect naive effect estimates. This new method can be used to enhance the validity and transparency of real-world evidence obtained from observational, longitudinal studies.
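To illustrate the weighting component of such an approach, here is a minimal Python sketch of the selection-bias piece alone, assuming the selection probabilities (the bias parameters) are known. The scenario and numbers are invented for illustration and do not reproduce the full simultaneous multi-bias method.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.binomial(1, 0.5, n)              # exposure (randomized, so no confounding)
y = rng.binomial(1, 0.2 + 0.2 * x)       # outcome; true risk ratio = 0.4 / 0.2 = 2
p_sel = 0.2 + 0.6 * x * y                # exposed cases are oversampled into the study
s = rng.binomial(1, p_sel).astype(bool)  # selection indicator (only s=True is observed)

def risk_ratio(x, y, w):
    """Weighted risk ratio from individual records."""
    r1 = np.sum(w * (x == 1) * y) / np.sum(w * (x == 1))
    r0 = np.sum(w * (x == 0) * y) / np.sum(w * (x == 0))
    return r1 / r0

naive = risk_ratio(x[s], y[s], np.ones(s.sum()))
# Weighting each selected record by 1 / P(selected) 'reconstructs' the full data.
adjusted = risk_ratio(x[s], y[s], 1.0 / p_sel[s])
print(f"naive RR = {naive:.2f}, weighted RR = {adjusted:.2f}")
```

The naive estimate is badly inflated by the outcome- and exposure-dependent selection, while the inverse-probability-weighted estimate sits near the true risk ratio of 2 when the bias parameters are correctly specified.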
Project description: Purpose of review: Violence prevention research has enhanced our understanding of individual and community risk and protective factors for aggression and violence. However, our knowledge of these risk and protective factors depends heavily on observational studies, since there are few randomized trials in this area. Observational studies are susceptible to systematic errors, specifically confounding, and may lack internal validity. Recent findings: Many violence prevention studies utilize methods that do not correctly identify the set of covariates needed for statistical adjustment. This results in unwarranted matching and restriction, leading to further confounding or selection bias. Covariate adjustment based on purely statistical criteria generates inconsistent results and uncertain conclusions. Summary: Conventional methods used to identify confounding in violence prevention research are often inadequate. Causal diagrams can improve the understanding and identification of potential confounding biases in observational violence prevention studies, and methods such as sensitivity analysis using quantitative bias analysis can help address unmeasured confounding. Violence research studies should make more use of these methods.
Project description: Background: The lack of nonparametric statistical tests for confounding bias significantly hampers the development of robust, valid, and generalizable predictive models in many fields of research. Here I propose the partial confounder test, which, for a given confounder variable, probes the null hypothesis that the model is unconfounded. Results: The test provides strict control of type I errors and high statistical power, even for nonnormally and nonlinearly dependent predictions, which are common in machine learning. Applying the proposed test to models trained on large-scale functional brain connectivity data (N = 1,865) (i) reveals previously unreported confounders and (ii) shows that state-of-the-art confound mitigation approaches may fail to prevent confounder bias in several cases. Conclusions: The proposed test (implemented in the package mlconfound; https://mlconfound.readthedocs.io) can aid the assessment and improvement of the generalizability and validity of predictive models and thereby fosters the development of clinically useful machine learning biomarkers.
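As a rough illustration of the permutation logic behind such a test, the following sketch checks whether model predictions are associated with a confounder by comparing the observed correlation to a permutation null. Note that this simplified version does not condition on the true target, unlike the partial confounder test itself, and it is not the mlconfound implementation; names and data are hypothetical.

```python
import numpy as np

def permutation_confounder_test(yhat, c, n_perm=2000, seed=0):
    """Permutation test of association between model predictions `yhat`
    and a confounder `c`. Returns a permutation p-value for the null
    hypothesis that the predictions are unrelated to the confounder."""
    rng = np.random.default_rng(seed)
    obs = abs(np.corrcoef(yhat, c)[0, 1])
    null = np.array([abs(np.corrcoef(yhat, rng.permutation(c))[0, 1])
                     for _ in range(n_perm)])
    return (1 + np.sum(null >= obs)) / (1 + n_perm)

rng = np.random.default_rng(42)
c = rng.normal(size=300)               # confounder
yhat_conf = c + rng.normal(size=300)   # predictions partly driven by the confounder
yhat_clean = rng.normal(size=300)      # predictions unrelated to the confounder
p_conf = permutation_confounder_test(yhat_conf, c)
p_clean = permutation_confounder_test(yhat_clean, c)
print(f"confounded model p = {p_conf:.4f}, clean model p = {p_clean:.4f}")
```

A small p-value for the confounded model flags that its predictions track the confounder, which is the kind of signal the partial confounder test formalizes (with the additional conditioning needed for strict type I error control).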
Project description: Drawing causal inferences from observational studies is a central pillar of many disciplines. One sufficient condition for identifying the causal effect is that the treatment-outcome relationship is unconfounded conditional on the observed covariates. It is often believed that the more covariates we condition on, the more plausible this unconfoundedness assumption becomes. This belief has had a huge impact on practical causal inference, suggesting that we should adjust for all pretreatment covariates. However, when there is unmeasured confounding between the treatment and outcome, an estimator that adjusts for some pretreatment covariate may have greater bias than one that does not. Such a covariate is called a bias amplifier; examples include instrumental variables, which are independent of the confounder and affect the outcome only through the treatment. Previously, theoretical results for this phenomenon had been established only for linear models. We fill this gap in the literature by providing a general theory, showing that the phenomenon occurs under a wide class of models satisfying certain monotonicity assumptions. We further show that when the treatment follows an additive or multiplicative model conditional on the instrumental variable and the confounder, these monotonicity assumptions can be interpreted as signs on the arrows of the causal diagram.
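The bias-amplification phenomenon is easy to reproduce by simulation in the linear case, where the earlier theoretical results apply. In this hypothetical sketch, z is an instrument and u an unmeasured confounder; adjusting for z shrinks the variance of the treatment without removing any confounding, so the estimate moves further from the truth.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500_000
z = rng.normal(size=n)                  # instrument: affects y only through x
u = rng.normal(size=n)                  # unmeasured confounder
x = 2 * z + u + rng.normal(size=n)      # treatment
y = 1.0 * x + u + rng.normal(size=n)    # true causal effect of x on y is 1.0

def ols(y, cols):
    """OLS slope estimates (with intercept) via least squares."""
    X = np.column_stack([np.ones(len(y))] + list(cols))
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_unadj = ols(y, [x])[1]        # no adjustment: bias = cov(u,x)/var(x) = 1/6
b_adj = ols(y, [x, z])[1]       # adjusting for z: bias grows to 1/2
print(f"unadjusted: {b_unadj:.3f}, adjusted for z: {b_adj:.3f} (truth = 1.0)")
```

The unadjusted estimate lands near 1.17, while the z-adjusted estimate lands near 1.50: conditioning on the instrument amplifies the confounding bias, exactly the behavior the general theory extends beyond linear models.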
Project description: Background: Confounding bias is a common concern in epidemiological research. Its presence is often determined by comparing exposure effects between univariable and multivariable regression models, using an arbitrary threshold of a 10% difference to indicate confounding bias. However, many clinical researchers are not aware that this change-in-estimate criterion may lead to wrong conclusions when applied to logistic regression coefficients. This is due to a statistical phenomenon called noncollapsibility, which manifests itself in logistic regression models. This paper aims to clarify the role of noncollapsibility in logistic regression and to provide guidance in determining the presence of confounding bias. Methods: A Monte Carlo simulation study was designed to uncover patterns of confounding bias and noncollapsibility effects in logistic regression. An empirical data example was used to illustrate the inability of the change-in-estimate criterion to distinguish confounding bias from noncollapsibility effects. Results: The simulation study showed that, depending on the sign and magnitude of the confounding bias and the noncollapsibility effect, the difference between the effect estimates from univariable and multivariable regression models may underestimate or overestimate the magnitude of the confounding bias. Because of the noncollapsibility effect, multivariable regression analysis and inverse probability weighting provided different but valid estimates of the confounder-adjusted exposure effect. In our data example, confounding bias was underestimated by the change in estimate due to the presence of a noncollapsibility effect. Conclusion: In logistic regression, the difference between the univariable and multivariable effect estimates may reflect not only confounding bias but also a noncollapsibility effect. Ideally, the set of confounders is determined at the study design phase and based on subject matter knowledge. To quantify confounding bias, one can compare the unadjusted exposure effect estimate with the estimate from an inverse probability weighted model.
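Noncollapsibility can be demonstrated with a small simulation (illustrative parameters): a risk factor that is independent of the exposure, and therefore not a confounder, still shifts the odds ratio when it is adjusted for by stratification. A change-in-estimate rule would wrongly read this shift as confounding.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
x = rng.binomial(1, 0.5, n)   # exposure, randomized
z = rng.binomial(1, 0.5, n)   # strong risk factor, independent of x: NOT a confounder
p = 1 / (1 + np.exp(-(-1 + 1.5 * x + 1.5 * z)))   # logistic outcome model
y = rng.binomial(1, p)

def odds_ratio(x, y):
    """Odds ratio from the 2x2 table of exposure and outcome."""
    a = np.sum((x == 1) & (y == 1)); b = np.sum((x == 1) & (y == 0))
    c = np.sum((x == 0) & (y == 1)); d = np.sum((x == 0) & (y == 0))
    return (a * d) / (b * c)

or_marginal = odds_ratio(x, y)                    # unadjusted (collapsed over z)
or_z0 = odds_ratio(x[z == 0], y[z == 0])          # stratum z = 0
or_z1 = odds_ratio(x[z == 1], y[z == 1])          # stratum z = 1
print(f"marginal OR = {or_marginal:.2f}; stratum ORs = {or_z0:.2f}, {or_z1:.2f}")
```

Both stratum-specific odds ratios sit near exp(1.5) ≈ 4.48, while the marginal odds ratio is only about 3.8, a difference of roughly 19% despite zero confounding. This is the pure noncollapsibility effect that the change-in-estimate criterion cannot separate from confounding bias.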
Project description: Recommendations for reporting instrumental variable analyses often include presenting the balance of covariates across levels of the proposed instrument and levels of the treatment. However, such presentation can be misleading, as relatively small imbalances among covariates across levels of the instrument can result in greater bias because of bias amplification. We introduce bias plots and bias component plots as alternative tools for understanding biases in instrumental variable analyses. Using previously published data on proposed preference-based, geography-based, and distance-based instruments, we demonstrate why presenting covariate balance alone can be problematic, and how bias component plots can provide more accurate context for the bias from omitting a covariate in an instrumental variable versus a non-instrumental variable analysis. These plots can also provide relevant comparisons of different proposed instruments considered in the same data. Adaptable code is provided for creating the plots.
Project description: We present results that allow the researcher, in certain cases, to determine the direction of the bias that arises when control for confounding is inadequate. The results are given within the directed acyclic graph causal framework and are stated in terms of signed edges, for which rigorous definitions are provided. We describe cases in which intuition concerning signed edges fails, and we characterize the directed acyclic graphs that researchers can use to draw conclusions about the sign of the bias from unmeasured confounding. If there is only one unmeasured confounding variable on the graph, then nonincreasing or nondecreasing average causal effects suffice to draw conclusions about the direction of the bias. When there is more than one unmeasured confounding variable, nonincreasing and nondecreasing average causal effects can be used to draw conclusions only if the various unmeasured confounding variables are independent of one another conditional on the measured covariates. When this conditional independence property does not hold, stronger notions of monotonicity are needed to draw conclusions about the direction of the bias.
Project description: Background: Associations between attention-deficit/hyperactivity disorder (ADHD) and brain morphology have been reported, although with several inconsistencies. These may partly stem from confounding bias, which can distort associations and limit generalizability. We examined how associations between brain morphology and ADHD symptoms change with adjustment for potential confounders typically overlooked in the literature (aim 1), and for intelligence quotient (IQ) and head motion, which are generally corrected for but play ambiguous roles (aim 2). Methods: Participants were 10-year-old children from the Adolescent Brain Cognitive Development (N = 7,722) and Generation R (N = 2,531) Studies. Cortical area, volume, and thickness were measured with MRI, and ADHD symptoms with the Child Behavior Checklist. Surface-based cross-sectional analyses were run. Results: ADHD symptoms related to widespread cortical regions when adjusting solely for demographic factors. Additional adjustment for socioeconomic and maternal behavioral confounders (aim 1) generally attenuated associations: cluster sizes halved and effect sizes were substantially reduced. Cluster sizes changed further when IQ and head motion were included (aim 2); however, we argue that these adjustments may have introduced bias. Conclusions: Careful confounder selection and control can help identify more robust and specific regions of association for ADHD symptoms across two cohorts. We provide guidance on minimizing confounding bias in psychiatric neuroimaging. Funding: Authors are supported by an NWO-VICI grant (NWO-ZonMW: 016.VICI.170.200 to HT) for HT, LDA, and SL, the Sophia Foundation (S18-20), and an Erasmus University and Erasmus MC Fellowship for RLM.