Biomarker Trajectory Prediction and Causal Analysis of the Impact of the Covid-19 Pandemic on CVD Patients using Machine Learning
dc.contributor.author | Inekwe, Trusting | |
dc.contributor.author | Mkandawire, Winnie | |
dc.contributor.author | Wee, Brian | |
dc.contributor.author | Agu, Emmanuel | |
dc.contributor.author | Colubri, Andres | |
dc.date.accessioned | 2024-10-11T18:45:56Z | |
dc.date.available | 2024-10-11T18:45:56Z | |
dc.date.issued | 2024-08-05 | |
dc.identifier.citation | T. Inekwe, W. Mkandawire, B. Wee, E. Agu and A. Colubri, "Biomarker Trajectory Prediction and Causal Analysis of the Impact of the Covid-19 Pandemic on CVD Patients using Machine Learning," 2024 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), Wilmington, DE, USA, 2024, pp. 1-12, doi: 10.1109/CHASE60773.2024.00011. | en_US |
dc.identifier.doi | 10.1109/CHASE60773.2024.00011 | en_US |
dc.identifier.eid | 2-s2.0-85201196524 | |
dc.identifier.scopusid | SCOPUS_ID:85201196524 | |
dc.identifier.uri | http://hdl.handle.net/20.500.14038/53857 | |
dc.description | Emmanuel Agu of Worcester Polytechnic Institute is an affiliated faculty member of the Center for Accelerating Practices to End Suicide (CAPES) headquartered at UMass Chan Medical School. | en_US |
dc.description.abstract | Background: The COVID-19 pandemic disrupted healthcare services, increasing the susceptibility of high-risk patients, including those with cardiovascular diseases (CVDs), to adverse outcomes. Biomarkers provide insights into patients' underlying health status. However, few studies have investigated the effects of the COVID-19 pandemic on CVD biomarker trajectories using predictive modeling and causal analysis frameworks. Prior research explored the impacts of the COVID-19 pandemic on CVD severity and prognosis but did not investigate biomarker trajectories using Machine Learning (ML), which can discover complex multivariate relationships in multi-modal data. Objective: This study aimed to compare six ML regression models to select the best-performing models for predicting biomarker trajectories in CVD patients using retrospective data. Subsequently, these models were used to assess the COVID-19 pandemic's impact on CVD patients and for causal analyses. Approach: Using ML regression and causal inference, this study investigated the pandemic's impact on biomarker values of 80,917 CVD patients and 77,332 non-CVD controls treated at two hospitals in Central Massachusetts between May 2018 and December 2021. ML regression algorithms, including Neural Networks (NN), Decision Trees (DT), Random Forests (RF), XGBoost, CATBoost and ADABoost, were trained and compared. Important CVD biomarkers (HbA1c, LDL cholesterol, BMI, and BP) were predicted as outcome variables, with patients' risk factors (age, race, gender, socioeconomic status) as input variables. Shapley feature importance analyses identified the most predictive features, which were then used in the causal analysis. A Difference-in-Differences (DID) approach within a Double/Debiased Machine Learning (DML) method isolated the pandemic's impact on biomarkers while minimizing the effects of confounding factors.
Results: CATBoost and XGBoost were the most predictive ML models for LDL cholesterol and HbA1c, yielding R² values of 0.13 and 0.10, respectively. RF outperformed other models for BMI and BP, achieving R² values of 0.192 and 0.071, respectively. The small R² values were due to the prevalence of categorical features in the data with substantial variation in biomarker values. Feature importance analysis determined age, socioeconomic status, and race/ethnicity to be important drivers of biomarker changes, highlighting the role of social determinants of health. DML with DID analysis revealed a statistically significant increase (p-value < 0.05) in BMI and systolic BP values for CVD patients during the COVID-19 pandemic compared to the control group. In contrast, their HbA1c and LDL cholesterol values improved during the pandemic, suggesting differential effects of the pandemic on key CVD biomarkers. Conclusion: Our proposed ML biomarker prediction models can facilitate personalized interventions and advance risk assessment for CVD patients. The predictive importance of factors such as age, socioeconomic status, and race highlights the need to address health disparities. | en_US |
dc.relation.ispartof | 2024 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE) | en_US |
dc.relation.url | https://doi.org/10.1109/CHASE60773.2024.00011 | en_US |
dc.subject | biomarkers | en_US |
dc.subject | Cardiovascular Diseases | en_US |
dc.subject | causal inference | en_US |
dc.subject | COVID-19 | en_US |
dc.subject | Difference-in-Differences | en_US |
dc.subject | Double/Debiased machine learning | en_US |
dc.subject | machine learning | en_US |
dc.subject | predictive modeling | en_US |
dc.subject | regression | en_US |
dc.subject | SHAP value | en_US |
dc.title | Biomarker Trajectory Prediction and Causal Analysis of the Impact of the Covid-19 Pandemic on CVD Patients using Machine Learning | en_US |
dc.type | Conference Paper | en_US |
dc.source.journaltitle | Proceedings - 2024 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies, CHASE 2024 | |
dc.source.beginpage | 1 | |
dc.source.endpage | 12 | |
dc.identifier.journal | Proceedings - 2024 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies, CHASE 2024 | |
dc.contributor.department | Center for Accelerating Practices to End Suicide (CAPES) | en_US |
dc.contributor.department | Genomics and Computational Biology | en_US |
dc.contributor.department | Morningside Graduate School of Biomedical Sciences | en_US |
dc.contributor.student | Winnie Mkandawire |
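
Illustrative note on the Difference-in-Differences idea named in the abstract: the sketch below computes a plain two-group, two-period DID from group means. It is not the authors' DML-based estimator and does not adjust for confounders. The function name `did_estimate`, the row layout, and all numbers are hypothetical and synthetic; none of them come from the study.

```python
from statistics import mean


def did_estimate(rows):
    """Plain 2x2 difference-in-differences from group means.

    rows: iterable of (treated, post, value) tuples, where `treated`
    marks the exposed group (for example CVD patients) and `post`
    marks observations taken after the event of interest.
    """
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, value in rows:
        cells[(treated, post)].append(value)
    m = {key: mean(values) for key, values in cells.items()}
    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    # The control group's change stands in for the trend the treated
    # group would have followed without the event.
    return treated_change - control_change


# Synthetic values for illustration only; not data from the study.
toy_rows = [
    (True, False, 30.0), (True, False, 32.0),    # treated, pre-period
    (True, True, 33.0), (True, True, 35.0),      # treated, post-period
    (False, False, 26.0), (False, False, 28.0),  # control, pre-period
    (False, True, 27.0), (False, True, 29.0),    # control, post-period
]
effect = did_estimate(toy_rows)
print(effect)
```

In this toy data the treated group rises by 3.0 and the control group by 1.0, so the estimate is 2.0. A DML-style analysis would additionally use ML models to partial out covariates before estimating the effect, which this sketch omits.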