Project description:Clinical time-to-event studies depend on large sample sizes, which are often not available at a single institution. In the medical field, however, individual institutions are often legally unable to share their data, as medical data is subject to strong privacy protection owing to its particular sensitivity; its collection, and especially its aggregation into centralized datasets, is fraught with substantial legal risks and often outright unlawful. Existing solutions based on federated learning have already demonstrated considerable potential as an alternative to centralized data collection. Unfortunately, current approaches are incomplete or not easily applicable in clinical studies owing to the complexity of federated infrastructures. This work presents privacy-aware and federated implementations of the time-to-event algorithms most commonly used in clinical trials (survival curve, cumulative hazard function, log-rank test, and Cox proportional hazards model), based on a hybrid approach combining federated learning, additive secret sharing, and differential privacy. On several benchmark datasets, we show that all algorithms produce highly similar, and in some cases identical, results compared with traditional centralized time-to-event algorithms. Furthermore, we were able to reproduce the results of a previous clinical time-to-event study in various federated scenarios. All algorithms are accessible through the intuitive web app Partea (https://partea.zbh.uni-hamburg.de), which offers a graphical user interface for clinicians and non-computational researchers without programming experience. Partea removes the high infrastructural hurdles posed by existing federated learning approaches and hides the complexity of execution. It is therefore an easy-to-use alternative to centralized data collection, reducing both bureaucratic effort and the legal risks associated with the processing of personal data to a minimum.
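To make the hybrid approach concrete, below is a minimal sketch of how per-site event counts could be combined with additive secret sharing so that only global totals are ever revealed, from which a Kaplan-Meier factor is computed. The single-time-point setup, site names, and counts are hypothetical illustrations, not Partea's actual implementation.

```python
import random

PRIME = 2**61 - 1  # modulus for additive secret sharing

def share(value, n_parties):
    """Split an integer into n additive shares modulo PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

# Hypothetical per-site counts at one event time t:
# d = deaths at t, n = patients at risk at t.
site_counts = {
    "site_a": {"d": 2, "n": 40},
    "site_b": {"d": 1, "n": 25},
    "site_c": {"d": 0, "n": 18},
}

n_sites = len(site_counts)
totals = {}
for key in ("d", "n"):
    # Each site splits its count into one share per party; party j sums
    # the j-th shares it receives, and only these partial sums are
    # revealed, reconstructing the global total without exposing any
    # single site's count.
    all_shares = [share(s[key], n_sites) for s in site_counts.values()]
    partial_sums = [sum(col) % PRIME for col in zip(*all_shares)]
    totals[key] = reconstruct(partial_sums)

# Kaplan-Meier survival factor contributed by this event time: 1 - d/n.
print(totals, 1 - totals["d"] / totals["n"])
```

In a real deployment the same aggregation would run at every distinct event time, and differential-privacy noise could be added to the reconstructed totals before release.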
Project description:In cancer studies, the natural history of the disease is often observed only at a single random time of diagnosis (a survival time), leading to a current status observation (Sun (2006). The statistical analysis of interval-censored failure time data. Berlin: Springer.) that carries a surrogate (a mark) (Jacobsen (2006). Point process theory and applications: marked point and piecewise deterministic processes. Basel: Birkhäuser.) attached to the observed survival time. Examples include time to recurrence and stage (local vs. metastatic). We study a simple model that provides insight into the relationship between the observed marked endpoint and the latent disease natural history leading to it. A semiparametric regression model is developed to assess covariate effects on the observed marked endpoint as explained by a latent disease process. The proposed model can be represented as a transformation model in terms of mark-specific hazards, induced by a process-based mixed effect. Large-sample properties of the proposed estimators are established. The methodology is illustrated through Monte Carlo simulation studies and an application to a randomized clinical trial of adjuvant therapy for breast cancer.
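To fix ideas, a generic frailty-induced transformation model of this flavor can be sketched as follows; the parameterization below is an illustrative assumption, not the authors' exact specification.

```latex
% Illustrative sketch: mark-specific proportional hazards with a latent
% process-induced random effect b (frailty), for marks k = 1, ..., K.
\[
  \lambda_k(t \mid x, b) \;=\; \lambda_{0k}(t)\,\exp\{\beta_k^{\top}x + b\}.
\]
% Marginalizing over b, with Laplace transform \phi, yields
\[
  S_k(t \mid x) \;=\; \phi\bigl\{\Lambda_{0k}(t)\,e^{\beta_k^{\top}x}\bigr\},
\]
% which is equivalent to a linear transformation model for the mark-k event
% time, H_k(T) = -\beta_k^{\top}x + \varepsilon, with the distribution of
% the error \varepsilon determined by \phi.
```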
Project description:The incidence of adverse outcome events rises as patients with advanced illness approach the end of life. Exposures that tend to occur near the end of life, for example use of a wheelchair, oxygen therapy, and palliative care, may therefore be found to be associated with the incidence of these adverse outcomes. We propose the concept of reverse time-to-death (rTTD) and its use as the time-scale in partial-likelihood-based time-to-event analysis to mitigate this time-varying confounding. We illustrate with data on community-based palliative care uptake (exposure) and emergency department visits (outcome) among patients with advanced cancer in Singapore, and compare the results against those of the common practice of using time-on-study (TOS) as the time-scale. Graphical analysis demonstrated that cancer patients receiving palliative care had a higher rate of emergency department visits than non-recipients mainly because they were closer to the end of life, and that the rTTD analysis compared patients at the same time-to-death. In an analysis of a decedent cohort, emergency department visits in relation to palliative care on the TOS time-scale showed a significant increase in the hazard ratio estimate when observed time-varying covariates were omitted from statistical adjustment (% change-in-estimate = 16.2%; 95% CI 6.4% to 25.6%). There was no such change in an otherwise identical analysis using rTTD (% change-in-estimate = 3.1%; 95% CI -1.0% to 8.5%), demonstrating the ability of the rTTD time-scale to mitigate confounding that intensifies as death approaches. A similar pattern was found in the full cohort. Simulations demonstrated that the proposed method had smaller relative bias and root mean square error than the TOS-based analysis. In conclusion, the use of rTTD as the time-scale in time-to-event analysis provides a simple and robust approach to controlling time-varying confounding in studies of advanced illness, even when the confounders are unmeasured.
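A minimal sketch of how an rTTD-scale Cox analysis might be set up in practice, assuming a decedent cohort with known death times: mapping the reversed clock via u = K - (t_death - t) makes analysis time increasing, so standard software with delayed entry applies. The toy data, variable names, and the choice of the lifelines package are assumptions for illustration, not the authors' code.

```python
import pandas as pd
from lifelines import CoxPHFitter

# Toy decedent cohort (all numbers invented): enrollment, first ED visit
# or censoring, death time, and palliative-care exposure, in days.
df = pd.DataFrame({
    "t_enroll":   [0, 0, 0, 0, 0, 0],
    "t_event":    [30, 80, 45, 120, 60, 10],   # ED visit or censoring
    "event":      [1, 0, 1, 1, 0, 1],          # 1 = ED visit observed
    "t_death":    [200, 150, 90, 300, 210, 75],
    "palliative": [1, 0, 1, 0, 1, 1],
})

# rTTD = t_death - t runs *down* toward death; map it onto an increasing
# clock u = K - (t_death - t), with K large enough to keep times positive,
# so risk sets are formed among patients at the same time-to-death.
K = (df["t_death"] - df["t_enroll"]).max() + 1.0
df["u_entry"] = K - (df["t_death"] - df["t_enroll"])
df["u_exit"] = K - (df["t_death"] - df["t_event"])

cph = CoxPHFitter()
cph.fit(df[["u_entry", "u_exit", "event", "palliative"]],
        duration_col="u_exit", event_col="event", entry_col="u_entry")
cph.print_summary()
```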
Project description:Small ribozymes such as Oryza sativa twister spontaneously cleave their own RNA when the ribozyme folds into its active conformation. The coupling between twister folding and self-cleavage has been difficult to study, however, because the active ribozyme rapidly converts to product. Here, we describe the synthesis of a photocaged nucleotide that releases guanosine within microseconds upon photosolvolysis with blue light. Applying this tool to O. sativa twister achieved the spatial (75 µm) and temporal (≤30 ms) control required to resolve folding and self-cleavage events when combined with single-molecule fluorescence detection of the ribozyme folding pathway. Real-time observation of single ribozymes after photo-deprotection showed that the precleaved folded state is unstable and quickly unfolds if the RNA does not react. Kinetic analysis showed that Mg2+ and Mn2+ ions increase ribozyme efficiency by making transitions to the high-energy active conformation more probable, rather than by stabilizing the folded ground state or the cleaved product. This tool for light-controlled single-RNA folding should offer precise and rapid control of other nucleic acid systems.
Project description:Estimating time-dependent rates of speciation and extinction from dated phylogenetic trees of extant species (timetrees), and determining how and why they vary, is key to understanding how ecological and evolutionary processes shape biodiversity. Owing to the increasing availability of phylogenetic trees, a growing number of process-based methods relying on the birth-death model have been developed over the last decade to address a variety of questions in macroevolution. This methodological progress, however, has drawn recurring criticism, raising the question of how reliable estimates of speciation and extinction rates really are. In particular, using lineages-through-time (LTT) plots, a recent study showed that infinitely many equally likely diversification scenarios can generate any given timetree, which has led to questioning whether diversification rates should be estimated at all. Here, we summarize, clarify, and highlight technical considerations regarding these recent findings on the capacity of models to disentangle diversification histories. Using simulations, we illustrate the characteristics of the newly proposed "pulled rates" and their utility. We recognize that the recent findings are a step forward in understanding the behavior of macroevolutionary models, but they in no way suggest that we should abandon diversification modeling altogether. On the contrary, the study of macroevolution using phylogenetic trees has never been more exciting and promising than it is today. We still face important limitations with regard to data availability and methods, but by acknowledging them we can better target our joint efforts as a scientific community. [Birth-death models; extinction; phylogenetics; speciation.]
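For orientation, the "pulled rates" at the center of this debate are commonly written as follows; this is a hedged restatement of the congruence literature (sign conventions for time vary), not this paper's own derivation.

```latex
% Pulled diversification and pulled speciation rates, with \tau measured as
% time before present, \lambda and \mu the speciation and extinction rates,
% and E(\tau) the probability that a lineage alive at \tau leaves no
% sampled extant descendants (hedged restatement; conventions vary):
\[
  r_p(\tau) \;=\; \lambda(\tau) - \mu(\tau)
    + \frac{1}{\lambda(\tau)}\frac{\mathrm{d}\lambda}{\mathrm{d}\tau},
  \qquad
  \lambda_p(\tau) \;=\; \lambda(\tau)\,\bigl[1 - E(\tau)\bigr].
\]
% Diversification scenarios sharing the same pulled rates yield the same
% likelihood for a given timetree, which is the source of the
% identifiability problem discussed above.
```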
Project description:For case-control studies that rely on expensive assays of biomarkers, specimen pooling offers a cost-effective and efficient way to estimate individual-level odds ratios. Pooling helps to conserve irreplaceable biospecimens for the future, mitigates limit-of-detection problems, and enables the inclusion of individuals with limited available volumes of biospecimen. Pooling can also allow a panel of biomarkers to be studied under a fixed assay budget. Here, we extend this method to discrete-time survival studies. Assuming a proportional odds logistic model for the risk of a common outcome, we propose a design strategy that forms pooling sets within those experiencing the outcome at the same event time. We show that the proposed design enables a cost-effective analysis of the association of a biomarker with the outcome. Because the standard likelihood is slightly misspecified for the proposed pooling strategy under a non-null biomarker effect, the approach produces slightly biased estimates of exposure odds ratios. We explore the extent of this bias via simulations, and illustrate the method by revisiting a data set relating polychlorinated biphenyls and 1,1-dichloro-2,2-bis(p-chlorophenyl)ethylene to time to pregnancy.
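The individual-level model underlying such a design is the standard discrete-time (proportional odds) hazard model, sketched below; the exact parameterization used in the paper may differ.

```latex
% Discrete-time hazard model with event times t = 1, 2, ...; a generic
% sketch, not necessarily the authors' exact parameterization:
\[
  \operatorname{logit} \Pr(T = t \mid T \ge t,\; x)
  \;=\; \alpha_t + \beta^{\top} x,
\]
% so \exp(\beta_j) is the odds ratio for biomarker x_j among subjects still
% at risk at time t. Forming pooling sets within a common event time keeps
% the nuisance intercept \alpha_t shared by all members of a pooling set.
```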
Project description:Background: Integrating functional annotations into SNP-set association studies has proven to be a powerful analysis strategy. Statistical methods for such integration have been developed for continuous and binary phenotypes; however, SNP-set integrative approaches for time-to-event (survival) outcomes are lacking. Methods: We propose IEHC, an integrative eQTL (expression quantitative trait loci) hierarchical Cox regression, for SNP-set-based survival association analysis; it models the effect sizes of genetic variants as a function of eQTL effects in a hierarchical manner. Three p-value combination tests are developed to examine the joint effects of eQTL and genetic variants after a novel decorrelating modification of the statistics for the two components. An omnibus test (IEHC-ACAT) is further adapted to aggregate the strengths of all available tests. Results: Simulations demonstrated that the IEHC joint tests were more powerful when both eQTL and genetic variants contributed to the association signal, whereas IEHC-ACAT was robust and often outperformed the other approaches across the various simulation scenarios. Applying IEHC to ten TCGA cancers while incorporating eQTL from the relevant GTEx tissues, we found substantial correlations between the two types of effect sizes of genetic variants from TCGA and GTEx, and identified 21 (9 unique) cancer-associated genes that would otherwise have been missed by approaches not incorporating eQTL. Conclusion: IEHC represents a flexible, robust, and powerful approach to integrating functional omics information to enhance the identification of association signals for the survival risk of complex human cancers.
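Since IEHC-ACAT aggregates component tests via the aggregated Cauchy association test, here is a minimal sketch of the standard ACAT combination rule; this generic form is the published Cauchy combination test, not a copy of the IEHC implementation, and the example p-values are invented.

```python
import numpy as np

def acat(pvalues, weights=None):
    """Cauchy combination (ACAT) of p-values:
    T = sum_i w_i * tan((0.5 - p_i) * pi); the combined p-value is the
    standard Cauchy tail probability of T."""
    p = np.asarray(pvalues, dtype=float)
    w = np.full(p.shape, 1.0 / p.size) if weights is None else np.asarray(weights, float)
    w = w / w.sum()
    t = np.sum(w * np.tan((0.5 - p) * np.pi))
    # Tail of the standard Cauchy distribution; for extremely small p_i,
    # tan() overflows and the approximation w_i / (p_i * pi) is used instead
    # in careful implementations.
    return 0.5 - np.arctan(t) / np.pi

# Example: combine three component-test p-values (invented numbers).
print(acat([0.01, 0.20, 0.75]))
```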
Project description:Acute graft-versus-host disease (GVHD) is a frequent complication following hematopoietic cell transplantation (HCT). Research on risk factors for acute GVHD has tended to ignore two important clinical issues. First, post-transplant mortality is high; in our motivating data, 100-day post-HCT mortality was 15.4%. Second, acute GVHD in its classic form is only diagnosed within 100 days of the transplant; beyond 100 days, a patient may instead be diagnosed with late-onset acute or chronic GVHD. Standard time-to-event models, however, generally conceive of patients as being able to experience the event at any point on the time scale. In this paper, we propose a novel multi-state model that simultaneously (i) accounts for mortality through joint modeling of acute GVHD and death, and (ii) explicitly acknowledges the finite time interval during which the event of interest can take place. The observed-data likelihood is derived, with estimation and inference via maximum likelihood. Additionally, we provide methods for simultaneously estimating the absolute risks of acute GVHD and death. The proposed framework is compared via comprehensive simulations with a number of alternative approaches that each acknowledge some but not all aspects of acute GVHD, and is illustrated with an analysis of the HCT data that motivated this work.
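A hedged sketch of the structural idea: two cause-specific transitions out of the event-free post-transplant state, with the acute-GVHD hazard supported only on the first 100 days. The notation is illustrative; the paper's exact transition structure and likelihood may differ.

```latex
% Cause-specific hazards out of the initial (event-free) post-HCT state;
% illustrative notation only:
\[
  \lambda_G(t) \;=\; \lim_{\Delta \downarrow 0}
    \Pr\bigl(\text{acute GVHD in } [t, t+\Delta) \,\big|\,
    \text{event-free at } t\bigr)/\Delta,
  \qquad t \in [0, 100],
\]
\[
  \lambda_D(t) \;=\; \lim_{\Delta \downarrow 0}
    \Pr\bigl(\text{death in } [t, t+\Delta) \,\big|\,
    \text{event-free at } t\bigr)/\Delta,
  \qquad t \ge 0,
\]
% with \lambda_G(t) \equiv 0 for t > 100, so that beyond day 100 the model
% allows death (or censoring) but not classic acute GVHD.
```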
Project description:It is widely recommended that any developed diagnostic or prognostic prediction model be externally validated in terms of its predictive performance, as measured by calibration and discrimination. When multiple validations have been performed, a systematic review followed by a formal meta-analysis helps to summarize overall performance across multiple settings and reveals under which circumstances the model performs suboptimally and may need adjustment. We discuss how to undertake a meta-analysis of the performance of prediction models with either a binary or a time-to-event outcome. We address how to deal with incomplete availability of study-specific results (performance estimates and their precision), and how to produce summary estimates of the c-statistic, the observed:expected ratio, and the calibration slope. Furthermore, we discuss the implementation of frequentist and Bayesian meta-analysis methods, and propose novel empirically based prior distributions to improve the estimation of between-study heterogeneity in small samples. Finally, we illustrate all methods using two examples: meta-analysis of the predictive performance of EuroSCORE II and of the Framingham Risk Score. All examples and meta-analysis models have been implemented in our newly developed R package "metamisc".
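As a concrete illustration of one of these summaries, here is a hedged sketch of a DerSimonian-Laird random-effects meta-analysis of the c-statistic, pooled on the logit scale (a common transformation choice). This generic recipe is not the metamisc implementation, and the validation results below are invented.

```python
import numpy as np

def meta_c_statistic(c, se):
    """DerSimonian-Laird random-effects summary of c-statistics,
    pooled on the logit scale and back-transformed."""
    c, se = np.asarray(c, float), np.asarray(se, float)
    y = np.log(c / (1 - c))            # logit-transformed c-statistics
    v = (se / (c * (1 - c))) ** 2      # delta-method variances of logit(c)
    w = 1 / v
    y_fe = (w * y).sum() / w.sum()     # fixed-effect (inverse-variance) mean
    q = (w * (y - y_fe) ** 2).sum()    # Cochran's Q statistic
    tau2 = max(0.0, (q - (len(y) - 1)) / (w.sum() - (w**2).sum() / w.sum()))
    w_re = 1 / (v + tau2)              # random-effects weights
    mu = (w_re * y).sum() / w_re.sum()
    se_mu = np.sqrt(1 / w_re.sum())
    expit = lambda z: 1 / (1 + np.exp(-z))
    # Back-transform the summary and its 95% CI to the c-statistic scale.
    return expit(mu), (expit(mu - 1.96 * se_mu), expit(mu + 1.96 * se_mu)), tau2

# Invented validation results: c-statistics and their standard errors.
summary, ci, tau2 = meta_c_statistic(c=[0.72, 0.68, 0.75, 0.70],
                                     se=[0.02, 0.03, 0.025, 0.015])
print(summary, ci, tau2)
```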