Dataset Information

A Data-Driven Pipeline to Discover Treatment Variations and the Associated Contributing Factors Balanced with Optimal Granularity.


ABSTRACT: Evidence-based medicine uses research evidence from clinical trials to support treatment decisions. To take advantage of electronic health records and big data analysis methods, we developed a data-driven analytic pipeline that uses 1) agglomerative hierarchical clustering to define treatment variation at different levels of granularity, 2) feature selection and multinomial multivariate logistic regression analysis to identify variables (factors) associated with treatment variation, and 3) prognosis analysis to compare patient outcomes across the top treatment groups. We tested our approach on the diffuse large B-cell lymphoma patient population from the MIMIC-IV dataset and found that it helps determine the optimal granularity of treatment variation and identify factors that are associated with treatment variation but not recognized in randomized controlled trials because of unbalanced patient cohorts. We also found patient cohort characteristics that could inspire hypothesis generation, such as the influence of ethnicity on treatment plans and subsequent prognoses.
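The first step of the pipeline, agglomerative hierarchical clustering at different granularities, can be illustrated with a minimal sketch. The abstract does not state the distance metric, linkage criterion, or cut-off values, so the Jaccard distance, average linkage, thresholds, and the toy drug regimens below are all assumptions chosen for illustration and are not taken from the study.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of drug names."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(c1, c2, items):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(jaccard_distance(items[i], items[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(items, threshold):
    """Merge the closest pair of clusters until no pair is within threshold."""
    clusters = [[i] for i in range(len(items))]
    while len(clusters) > 1:
        (a, b), dist = min(
            (((x, y), average_linkage(clusters[x], clusters[y], items))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical per-patient regimens (illustrative only, not MIMIC-IV data).
regimens = [
    {"rituximab", "cyclophosphamide", "doxorubicin", "vincristine", "prednisone"},
    {"rituximab", "cyclophosphamide", "doxorubicin", "vincristine"},
    {"cyclophosphamide", "doxorubicin", "vincristine", "prednisone"},
    {"rituximab", "bendamustine"},
    {"bendamustine"},
]

# A low threshold keeps fine-grained groups; a high one yields coarse groups.
fine = agglomerate(regimens, threshold=0.25)
coarse = agglomerate(regimens, threshold=0.7)
print("fine:", fine)
print("coarse:", coarse)
```

Raising the threshold merges near-identical regimens into broader treatment groups, which is one simple way to think about trading granularity against group size before fitting a regression on the resulting group labels.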

SUBMITTER: Fan H 

PROVIDER: S-EPMC10148325 | biostudies-literature | 2022

REPOSITORIES: biostudies-literature

Publications

A Data-Driven Pipeline to Discover Treatment Variations and the Associated Contributing Factors Balanced with Optimal Granularity.

Fan Hao H, Lim Kian-Huat KH, Yen Po-Yin PY

AMIA ... Annual Symposium proceedings. AMIA Symposium, 2022-01-01


Similar Datasets

| S-EPMC7525495 | biostudies-literature
| S-EPMC7189366 | biostudies-literature
| S-EPMC7813224 | biostudies-literature
| S-EPMC6855305 | biostudies-literature
| S-EPMC9557800 | biostudies-literature
| S-EPMC5447234 | biostudies-literature
| S-EPMC7727385 | biostudies-literature
| S-EPMC11849655 | biostudies-literature
| S-EPMC9310080 | biostudies-literature
| S-EPMC8405649 | biostudies-literature