Project description: Quantifying sequence-specific protein-ligand interactions is critical for understanding and exploiting numerous cellular processes, including gene expression regulation and signal transduction. Given their importance, next-generation sequencing (NGS) based assays that characterize such recognition with high throughput are increasingly being used to profile a range of protein classes and interactions. However, these methods do not measure the biophysical parameters that have long been used to uncover the quantitative rules underlying sequence recognition. We developed a highly flexible machine learning framework, called ProBound, to quantify sequence recognition in terms of biophysical parameters based on NGS data. ProBound quantifies transcription factor (TF) behavior with models that accurately predict binding affinity over a range exceeding that of previous resources, captures the impact of DNA modifications and conformational flexibility of multi-TF complexes, and infers specificity directly from in vivo data such as ChIP-seq without peak calling. When coupled with a new assay called Kd-seq, it quantifies the absolute affinity of protein-ligand interactions. Its applicability extends beyond thermodynamic equilibrium binding to the kinetics of kinase-substrate interactions. Altogether, ProBound provides a versatile algorithmic framework for understanding sequence recognition in a wide variety of biological contexts.
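As a rough illustration of the biophysical framing above, the sketch below fits a position-specific binding free-energy matrix to NGS probe counts by maximum likelihood. It is a hypothetical toy example, not ProBound's actual model or code; the Poisson count model and the sequence data are assumptions for illustration only.

```python
# Hypothetical sketch: fitting a position-specific binding free-energy matrix
# to NGS probe counts, in the spirit of (but not identical to) ProBound.
import numpy as np
from scipy.optimize import minimize

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA sequence as a (length x 4) one-hot matrix."""
    x = np.zeros((len(seq), 4))
    for i, b in enumerate(seq):
        x[i, BASES.index(b)] = 1.0
    return x

def neg_log_likelihood(theta, X, counts):
    """Poisson NLL with expected count proportional to exp(-binding energy)."""
    L = X[0].shape[0]
    energy_matrix = theta.reshape(L, 4)
    # ddG of each probe is the sum of per-position energy contributions
    ddg = np.array([np.sum(x * energy_matrix) for x in X])
    mu = np.exp(-ddg)  # Boltzmann-weighted expected count
    return np.sum(mu - counts * np.log(mu + 1e-12))

# toy data: probe sequences and their observed post-selection counts
seqs = ["ACGT", "AAGT", "TCGT", "ACGA"]
counts = np.array([120.0, 40.0, 15.0, 90.0])
X = [one_hot(s) for s in seqs]

res = minimize(neg_log_likelihood, np.zeros(4 * 4), args=(X, counts))
print(res.x.reshape(4, 4))  # fitted energy matrix (arbitrary units)
```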
Project description: High-Reynolds-number homogeneous isotropic turbulence (HIT) is fully described by the Navier-Stokes (NS) equations, which are notoriously difficult to solve numerically. Engineers, interested primarily in describing turbulence at a reduced range of resolved scales, have designed heuristics known as large eddy simulation (LES). LES is described in terms of a temporally evolving Eulerian velocity field defined over a spatial grid whose mean spacing corresponds to the resolved scale. This classic Eulerian LES depends on assumptions about the effects of subgrid scales on the resolved scales. Here, we take an alternative approach and design LES heuristics stated in terms of Lagrangian particles moving with the flow. Our Lagrangian LES, thus L-LES, is described by equations generalizing the weakly compressible smoothed particle hydrodynamics formulation with extended parametric and functional freedom, which is then calibrated via machine learning trained on Lagrangian data from direct numerical simulations of the NS equations. The L-LES model includes physics-informed parameterization and functional form, combining physics-based parameters and physics-inspired neural networks to describe the evolution of turbulence within the resolved range of scales. The subgrid-scale contributions are modeled separately with physical constraints to account for the effects of unresolved scales. We build the resulting model under the differentiable programming framework to facilitate efficient training. We experiment with loss functions of different types, including physics-informed ones accounting for statistics of Lagrangian particles. We show that our L-LES model is capable of reproducing Eulerian and unique Lagrangian turbulence structures and statistics over a range of turbulent Mach numbers.
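To make the weakly compressible SPH backbone concrete, here is a minimal Python sketch of one particle update with a placeholder hook for a learned correction. The Gaussian kernel, the kernel-summation density, and the `closure` hook are assumptions for illustration; this is not the authors' L-LES implementation, which is built in a differentiable-programming framework.

```python
# Minimal weakly compressible SPH step (2D, Gaussian kernel) with a slot for
# a learned subgrid correction, loosely echoing the L-LES formulation above.
import numpy as np

def kernel_grad(r, h):
    """Gradient of a Gaussian smoothing kernel W(r, h) in 2D."""
    q2 = np.dot(r, r) / h**2
    w = np.exp(-q2) / (np.pi * h**2)
    return (-2.0 / h**2) * w * r

def density(pos, mass, h):
    """Kernel-summation density estimate for each particle."""
    n = len(pos)
    rho = np.zeros(n)
    for i in range(n):
        for j in range(n):
            q2 = np.dot(pos[i] - pos[j], pos[i] - pos[j]) / h**2
            rho[i] += mass[j] * np.exp(-q2) / (np.pi * h**2)
    return rho

def sph_step(pos, vel, mass, h, dt, c0=10.0, rho0=1.0, closure=None):
    """One step of weakly compressible SPH; `closure` stands in for the
    trained neural-network term acting on the resolved scales."""
    rho = density(pos, mass, h)
    p = c0**2 * (rho - rho0)  # weakly compressible equation of state
    acc = np.zeros_like(pos)
    for i in range(len(pos)):
        for j in range(len(pos)):
            if i != j:
                gw = kernel_grad(pos[i] - pos[j], h)
                # symmetric pressure-gradient term
                acc[i] -= mass[j] * (p[i] / rho[i]**2 + p[j] / rho[j]**2) * gw
        if closure is not None:
            acc[i] += closure(pos[i], vel[i])  # learned subgrid contribution
    return pos + dt * vel, vel + dt * acc

# toy usage: 16 particles on a grid, no learned closure
pts = np.stack(np.meshgrid(np.arange(4.0), np.arange(4.0)), -1).reshape(-1, 2)
pos, vel = sph_step(pts, np.zeros_like(pts), np.ones(16), h=1.5, dt=0.01)
```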
Project description: The rapid advances in science and technology in the field of artificial neural networks have led to noticeable interest in applying this technology in medicine. Given the need for medical sensors that monitor vital signs both in everyday life and in clinical research, computer-based techniques merit consideration. This paper describes the latest progress in heart rate sensors empowered by machine learning methods. The paper is based on a review of the literature and patents from recent years, and is reported according to the PRISMA 2020 statement. The most important challenges and prospects in this field are presented. Key applications of machine learning in medical sensors used for medical diagnostics are discussed, covering data collection, processing, and interpretation of results. Although current solutions are not yet able to operate independently, especially in the diagnostic context, it is likely that medical sensors will be further developed using advanced artificial intelligence methods.
Project description: Techniques of data mining and machine learning were applied to a large database of medical and facility claims from commercially insured patients to determine the prevalence, gender demographics, and costs for individuals with provider-assigned diagnosis codes for myalgic encephalomyelitis (ME) or chronic fatigue syndrome (CFS). The frequency of diagnosis was 519-1,038/100,000, with the relative risk of females being diagnosed with ME or CFS, compared to males, being 1.238 and 1.178, respectively. While the percentage of women diagnosed with ME/CFS is higher than the percentage of men, ME/CFS is not a "women's disease": thirty-five to forty percent of diagnosed patients are men. Extrapolating from this frequency of diagnosis and the estimated 2017 population of the United States, a rough estimate for the number of patients who may be diagnosed with ME or CFS in the U.S. is 1.7 million to 3.38 million. Patients diagnosed with CFS appear to represent a more heterogeneous group than those diagnosed with ME. A machine learning model based on characteristics of individuals diagnosed with ME was developed and applied, resulting in a predicted prevalence of 857/100,000 (p > 0.01), or roughly 2.8 million in the U.S. Average annual costs for individuals with a diagnosis of ME or CFS were compared with those for lupus (all categories) and multiple sclerosis (MS), and found to be 50% higher for ME and CFS than for lupus or MS, and three to four times higher than for the general insured population. A separate aspect of the study attempted to determine whether a diagnosis of ME or CFS could be predicted from symptom codes in the insurance claims records. Due to the absence of specific codes for some core symptoms, we were unable to validate that the information in insurance claims records is sufficient to identify diagnosed patients, or to suggest that a diagnosis of ME or CFS should be considered based solely on the presence of those symptoms. These results show that a prevalence rate of 857/100,000 for ME/CFS is not unreasonable; it is therefore not a rare disease, but in fact a relatively common one.
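The population extrapolation quoted above is simple arithmetic; the snippet below reproduces it, assuming the U.S. Census Bureau's 2017 population estimate of roughly 325.7 million (the abstract does not state the exact figure used).

```python
# Sketch of the prevalence extrapolation described above. The 2017 U.S.
# population figure is an assumption; ~325.7 million is the Census Bureau's
# 2017 estimate.
US_POP_2017 = 325.7e6

for rate_per_100k in (519, 1038, 857):
    n = rate_per_100k / 100_000 * US_POP_2017
    print(f"{rate_per_100k}/100,000 -> ~{n / 1e6:.2f} million diagnosed")
# 519/100,000  -> ~1.69 million
# 1038/100,000 -> ~3.38 million
# 857/100,000  -> ~2.79 million
```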
Project description: Aims: Models predicting mortality in heart failure (HF) patients are often limited with regard to performance and applicability. The aim of this study was to develop a reliable algorithm to compute expected in-hospital mortality rates in HF cohorts at the population level from administrative data, comparing regression analysis with different machine learning (ML) models. Methods and results: Inpatient cases with a primary International Statistical Classification of Diseases and Related Health Problems (ICD-10) encoded discharge diagnosis of HF, non-electively admitted to 86 German Helios hospitals between 1 January 2016 and 31 December 2018, were identified. The dataset was randomly split 75%/25% for model development and testing. Highly unbalanced variables were removed. Four ML algorithms were applied, and all algorithms were tuned using a grid search with multiple repetitions. Model performance was evaluated by computing receiver operating characteristic areas under the curve (AUCs). In total, 59,125 cases (69.8% aged 75 years or older, 51.9% female) were investigated, and in-hospital mortality was 6.20%. In the testing dataset, all ML algorithms outperformed logistic regression, with AUCs of 0.829 [95% confidence interval (CI) 0.814-0.843] for logistic regression, 0.875 (95% CI 0.863-0.886) for random forest, 0.882 (95% CI 0.871-0.893) for gradient boosting machine, 0.866 (95% CI 0.854-0.878) for single-layer neural networks, and 0.882 (95% CI 0.872-0.893) for extreme gradient boosting. Brier scores demonstrated good calibration, especially of the latter three models. Conclusions: We introduced reliable models to calculate expected in-hospital mortality based only on administrative routine data using ML algorithms. Broad application could supplement quality measurement programs and thereby improve future HF patient care.
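As an illustration of the tuning-and-evaluation workflow described above (repeated grid search, 75%/25% split, AUC and Brier score), here is a minimal scikit-learn sketch on synthetic data; the models, parameter grid, and dataset are stand-ins, not the study's actual pipeline.

```python
# Illustrative sketch: tuning a gradient boosting classifier with a repeated
# grid search and evaluating ROC-AUC and calibration on a held-out 25% split.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import (GridSearchCV, RepeatedStratifiedKFold,
                                     train_test_split)

# synthetic stand-in for the administrative HF dataset (~6% mortality)
X, y = make_classification(n_samples=5000, n_features=30, weights=[0.94],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y,
                                          random_state=0)

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [2, 3]},
    scoring="roc_auc", cv=cv,
)
grid.fit(X_tr, y_tr)

for name, model in [("logistic regression",
                     LogisticRegression(max_iter=1000).fit(X_tr, y_tr)),
                    ("gradient boosting", grid.best_estimator_)]:
    p = model.predict_proba(X_te)[:, 1]
    print(f"{name}: AUC={roc_auc_score(y_te, p):.3f}, "
          f"Brier={brier_score_loss(y_te, p):.4f}")
```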
Project description: Accurate identification of patient populations is an essential component of clinical research, especially for medical conditions such as chronic cough that are inconsistently defined and diagnosed. We aimed to develop and compare machine learning models to identify chronic cough from medical and pharmacy claims data. In this retrospective observational study, we compared three machine learning algorithms based on XGBoost, logistic regression, and neural network approaches using a large claims and electronic health record database. Of the 327,423 patients who met the study criteria, 4,818 had chronic cough based on linked claims-electronic health record data. The XGBoost model showed the best performance, achieving a receiver operating characteristic area under the curve (ROC-AUC) of 0.916. We selected a cutoff that favors a high positive predictive value (PPV) to minimize false positives, resulting in a sensitivity, specificity, PPV, and negative predictive value of 18.0%, 99.6%, 38.7%, and 98.8%, respectively, on the held-out testing set (n = 82,262). The logistic regression and neural network models achieved slightly lower ROC-AUCs of 0.907 and 0.838, respectively. The XGBoost and logistic regression models maintained their robust performance in subgroups of individuals with higher rates of chronic cough. Machine learning algorithms are one way of identifying conditions that are not explicitly coded in medical records, and can help identify individuals with chronic cough from claims data with useful classification performance.
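The cutoff-selection step above can be sketched as follows: train an XGBoost classifier, scan probability thresholds for the lowest one whose PPV clears a target, then report sensitivity, specificity, PPV, and NPV on held-out data. Everything here (data, threshold grid, target PPV) is a hypothetical stand-in for the study's procedure.

```python
# Hedged sketch on synthetic data: choosing a probability cutoff that favors
# a high positive predictive value, then reporting operating characteristics.
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# synthetic stand-in for claims features with ~1.5% chronic cough prevalence
X, y = make_classification(n_samples=20000, n_features=40, weights=[0.985],
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y,
                                          random_state=1)

model = xgb.XGBClassifier(n_estimators=200, eval_metric="logloss")
model.fit(X_tr, y_tr)
prob = model.predict_proba(X_te)[:, 1]

# scan thresholds; keep the lowest one whose PPV clears the target
target_ppv, cutoff = 0.40, 0.5
for t in np.linspace(0.05, 0.95, 181):
    pred = prob >= t
    if pred.sum() > 0 and y_te[pred].mean() >= target_ppv:
        cutoff = t
        break

pred = prob >= cutoff
tp = int(np.sum(pred & (y_te == 1)));  fp = int(np.sum(pred & (y_te == 0)))
fn = int(np.sum(~pred & (y_te == 1))); tn = int(np.sum(~pred & (y_te == 0)))
print(f"cutoff={cutoff:.2f} sensitivity={tp/(tp+fn):.3f} "
      f"specificity={tn/(tn+fp):.3f} PPV={tp/(tp+fp):.3f} NPV={tn/(tn+fn):.3f}")
```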
Project description: Having accurate maps depicting the locations of residential buildings across a region benefits a range of sectors. This is particularly true for public health programs focused on delivering services at the household level, such as indoor residual spraying with insecticide to help prevent malaria. While open-source data from OpenStreetMap (OSM) depicting the locations and shapes of buildings is rapidly improving in quality and completeness globally, even in settings where all buildings have been mapped, information on whether these buildings are residential, commercial, or another type is often only available for a small subset. Using OSM building data from Botswana and Swaziland, we identified buildings for which 'type' was indicated, generated via on-the-ground observations, and classified these into two classes, "sprayable" and "not-sprayable". Ensemble machine learning, using building characteristics such as size, shape, and proximity to neighbouring features, was then used to build a model predicting which of these two classes every building in these two countries falls into. Results show that an ensemble machine learning approach performed marginally, but statistically significantly, better than the best individual model, and that using this ensemble model we were able to correctly classify >86% of structures (using independent test data) as sprayable or not-sprayable across both countries.
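A minimal sketch of the ensemble classification idea, assuming hypothetical geometric features and a soft-voting combination of standard scikit-learn models (the paper's actual feature set and ensembling scheme may differ):

```python
# Illustrative sketch: a soft-voting ensemble over building characteristics
# such as size, shape, and distance to the nearest neighbouring feature.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# stand-in for OSM-derived features: area, perimeter/area ratio,
# neighbour distance, and so on
X, y = make_classification(n_samples=5000, n_features=8, n_informative=5,
                           random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=2)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=2)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
    ],
    voting="soft",  # average predicted probabilities across models
)
ensemble.fit(X_tr, y_tr)
print(f"sprayable/not-sprayable accuracy: {ensemble.score(X_te, y_te):.3f}")
```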
Project description: To phenotype mechanistic differences between heart failure with reduced (HFrEF) and preserved (HFpEF) ejection fraction, a closed-loop model of the cardiovascular system coupled with patient-specific transthoracic echocardiography (TTE) and right heart catheterization (RHC) data was used to identify key parameters representing haemodynamics. Thirty-one patient records (10 HFrEF, 21 HFpEF) were obtained from the Cardiovascular Health Improvement Project database at the University of Michigan. Model simulations were tuned to match RHC and TTE pressure, volume, and cardiac output measurements in each patient. The underlying physiological model parameters were plotted against model-based norms and compared between HFrEF and HFpEF. Our results confirm that the main mechanistic parameter driving HFrEF is reduced left ventricular (LV) contractility, whereas HFpEF exhibits a heterogeneous phenotype. Conducting principal component analysis, k-means clustering, and hierarchical clustering on the optimized parameters reveals (i) a group of HFrEF-like HFpEF patients (HFpEF1), (ii) a classic HFpEF group (HFpEF2), and (iii) a group of HFpEF patients that do not consistently cluster (NCC). These subgroups cannot be distinguished from the clinical data alone. Increased LV active contractility (p < 0.001) and LV passive stiffness (p < 0.001) at rest are observed when comparing HFpEF2 to HFpEF1. Analysing the clinical data of each subgroup reveals that the elevated systolic and diastolic LV volumes seen in both HFrEF and HFpEF1 may be used as a biomarker to identify HFrEF-like HFpEF patients. These results suggest that modelling of the cardiovascular system and optimizing to standard clinical data can designate subgroups of HFpEF as separate phenotypes, possibly elucidating patient-specific treatment strategies. KEY POINTS: Analysis of data from right heart catheterization (RHC) and transthoracic echocardiography (TTE) of heart failure (HF) patients using a closed-loop model of the cardiovascular system identifies key parameters representing haemodynamic cardiovascular function in patients with heart failure with reduced and preserved ejection fraction (HFrEF and HFpEF). Analysing optimized parameters representing cardiovascular function using machine learning shows mechanistic differences between HFpEF groups that are not seen when analysing clinical data alone. The HFpEF patients presented here can be subdivided into three subgroups: HFpEF1, described as 'HFrEF-like HFpEF'; HFpEF2, as 'classic HFpEF'; and a third group of HFpEF patients that do not consistently cluster. Focusing purely on cardiac function consistently captures the underlying dysfunction in HFrEF, whereas HFpEF is better characterized by dysfunction of the entire cardiovascular system. Our methodology reveals that elevated left ventricular systolic and diastolic volumes are potential biomarkers for identifying HFrEF-like HFpEF patients.
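The clustering analysis described above (PCA followed by k-means and hierarchical clustering on the optimized parameters) might look like the following sketch; the synthetic parameter matrix is a stand-in for the 31 patients' optimized model parameters.

```python
# Minimal sketch: PCA followed by k-means and hierarchical clustering on a
# synthetic stand-in for the optimized cardiovascular model parameters.
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
params = rng.normal(size=(31, 12))  # 31 patients x 12 optimized parameters

Z = StandardScaler().fit_transform(params)
scores = PCA(n_components=2).fit_transform(Z)

km_labels = KMeans(n_clusters=3, n_init=10, random_state=3).fit_predict(scores)
hc_labels = AgglomerativeClustering(n_clusters=3).fit_predict(scores)

# conceptually, patients whose assignments differ between methods (after
# matching cluster labels) are candidates for the "not consistently
# clustering" (NCC) group described above
print("k-means:     ", km_labels)
print("hierarchical:", hc_labels)
```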
Project description: The quality of treatment and prognosis after pediatric congenital heart surgery remains unsatisfactory. A reliable prediction model for postoperative complications in congenital heart surgery patients is essential to enable prompt initiation of therapy and improve the quality of prognosis. Here, we develop an interpretable machine-learning-based model that integrates patient demographics, surgery-specific features, and intraoperative blood pressure data to accurately predict complications after pediatric congenital heart surgery. We used blood pressure variability and the k-means algorithm combined with a smoothed formulation of dynamic time warping to extract features from time-series data. In addition, the SHAP framework was used to provide explanations of the predictions. Our model achieved the best performance in both binary and multi-label classification compared with other consensus-based risk models. In addition, this explainable model explains why a prediction was made, helping to improve the clinical understanding of complication risk and generate actionable knowledge in practice. The combination of model performance and interpretability is easy for clinicians to trust and provides insight into how they should respond before the condition worsens after pediatric congenital heart surgery.
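A hedged sketch of the time-series feature extraction step, using the tslearn library's k-means with a soft-DTW metric as a stand-in for the authors' smoothed dynamic time warping formulation, on simulated blood pressure traces:

```python
# Hedged sketch (synthetic blood-pressure traces, not the authors' code):
# clustering intraoperative time series with k-means under a smoothed (soft)
# dynamic time warping metric, via tslearn.
import numpy as np
from tslearn.clustering import TimeSeriesKMeans
from tslearn.preprocessing import TimeSeriesScalerMeanVariance

rng = np.random.default_rng(4)
# 60 patients x 200 time points of simulated arterial pressure
bp = 80 + 10 * np.sin(np.linspace(0, 6, 200)) + rng.normal(0, 3, size=(60, 200))

X = TimeSeriesScalerMeanVariance().fit_transform(bp)
km = TimeSeriesKMeans(n_clusters=3, metric="softdtw",
                      metric_params={"gamma": 0.5}, random_state=4)
labels = km.fit_predict(X)
# cluster membership becomes a categorical feature for the downstream
# complication classifier; variability statistics can be added alongside
print(np.bincount(labels))
```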
Project description: Objective: Anaphylaxis is a severe life-threatening allergic reaction, and its accurate identification in healthcare databases can harness the potential of "Big Data" for healthcare or public health purposes. Methods: This study used claims data obtained between October 1, 2015 and February 28, 2019 from the CMS database to examine the utility of machine learning in identifying incident anaphylaxis cases. We created a feature selection pipeline to identify critical features between different datasets. A variety of unsupervised and supervised methods were then used (e.g., Sammon mapping and eXtreme Gradient Boosting) to train models on datasets of differing data quality, reflecting the varying availability and potential rarity of ground truth data in medical databases. Results: Resulting machine learning model accuracies ranged between 47.7% and 94.4% when tested on ground truth data. Finally, we found new features to help experts enhance existing case-finding algorithms. Discussion: Developing precise algorithms to detect medical outcomes in claims can be a laborious and expensive process, particularly for conditions presented and coded diversely. We found it beneficial to filter out highly potent codes used for data curation in order to identify underlying patterns and features. To improve rule-based algorithms where necessary, researchers could use model explainers to determine noteworthy features, which could then be shared with experts and included in the algorithm. Conclusion: Our work suggests machine learning models can perform at levels similar to a previously published expert case-finding algorithm, while also having the potential to improve performance or streamline algorithm construction by identifying new relevant features.
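As a loose illustration of the workflow above (a feature-selection pipeline feeding a gradient-boosted case-finding model, with influential features surfaced for expert review), here is a self-contained sketch on synthetic claims-like data; all names and parameters are hypothetical.

```python
# Illustrative sketch: feature selection feeding an XGBoost case-finding
# model, with the model's most influential features printed for review.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
import xgboost as xgb

# synthetic stand-in for sparse claims features with rare positive cases
X, y = make_classification(n_samples=10000, n_features=60, n_informative=10,
                           weights=[0.97], random_state=5)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y,
                                          random_state=5)

pipe = Pipeline([
    ("select", SelectKBest(mutual_info_classif, k=20)),
    ("clf", xgb.XGBClassifier(n_estimators=300, eval_metric="logloss")),
])
pipe.fit(X_tr, y_tr)
print(f"held-out accuracy: {pipe.score(X_te, y_te):.3f}")

# features the model leans on most: candidates to share with clinical experts
kept = pipe.named_steps["select"].get_support(indices=True)
gains = pipe.named_steps["clf"].feature_importances_
top = sorted(zip(kept, gains), key=lambda t: -t[1])[:5]
print("top features (index, importance):", top)
```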