Currently submitted to: JMIR AI

Date Submitted: Mar 16, 2024
Open Peer Review Period: Apr 18, 2024 - Jun 13, 2024
(currently open for review)

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Machine-learning based prediction for high health care utilizers using a multi-institution diabetes registry: model training and evaluation.

  • Joshua Kuan Tan; 
  • Le Quan; 
  • Nur Nasyitah Mohamed Salim; 
  • Jen Hong Tan; 
  • Su-Yen Goh; 
  • Julian Thumboo; 
  • Yong Mong Bee

ABSTRACT

Background:

The cost of healthcare in many countries is increasing rapidly. There is a growing interest in using machine learning to predict high healthcare utilizers for population health initiatives. Previous studies have focused on individuals who contribute to the highest financial burden. However, this group is small and represents a limited opportunity for long-term cost reduction.

Objective:

We developed an ensemble of models that predict future healthcare utilization at various thresholds.

Methods:

We used 2019 data from a multi-institutional diabetes database to develop binary classification models. These models predict healthcare utilization in the subsequent year across six outcomes: a length of stay of ≥7, ≥14, or ≥30 days, and emergency department (ED) attendance of ≥3, ≥5, or ≥10 visits. To address class imbalance, random oversampling and the synthetic minority oversampling technique (SMOTE) were employed. The models were then applied to unseen data from 2020 and 2021 to predict healthcare utilization in the following year. Models were compared on a portfolio of performance metrics, prioritizing area under the receiver operating characteristic curve (AUC), sensitivity, and positive predictive value.
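
The outcome definitions and the random oversampling step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record fields `los_days` and `ed_visits`, the label names, and both function names are hypothetical, and only the six thresholds come from the abstract. SMOTE is not shown.

```python
import random

# Thresholds taken from the abstract; all field and label names are hypothetical.
LOS_THRESHOLDS = (7, 14, 30)   # length of stay, in days
ED_THRESHOLDS = (3, 5, 10)     # emergency department visits

def label_outcomes(next_year):
    """Turn next-year utilization counts into six binary outcome labels."""
    labels = {}
    for t in LOS_THRESHOLDS:
        labels[f"los_ge_{t}"] = int(next_year["los_days"] >= t)
    for t in ED_THRESHOLDS:
        labels[f"ed_ge_{t}"] = int(next_year["ed_visits"] >= t)
    return labels

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        raise ValueError("cannot oversample: one class is empty")
    # Sample minority indices with replacement to close the gap.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(labels))) + extra
    rng.shuffle(idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]
```

In a setup like the one described, resampling would conventionally be applied to the training data only, so that the 2020 and 2021 test sets keep their natural class balance.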

Results:

When trained with random oversampling, four models – logistic regression, multivariate adaptive regression splines, boosted trees, and multilayer perceptron – consistently achieved high AUC (>0.80) and sensitivity (>0.60) across training-validation and test datasets. Correcting for class imbalance proved critical for model performance. Key predictors for all outcomes included age, number of ED visits in the present year, chronic kidney disease stage, inpatient bed days in the present year, and mean HbA1c levels.
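
For readers less familiar with the three prioritized metrics, the following sketch shows how AUC, sensitivity, and positive predictive value can be computed from true labels and predicted scores. The function names and the 0.5 decision threshold are assumptions made for illustration; the abstract does not state which threshold or software the authors used.

```python
def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_ppv(y_true, scores, threshold=0.5):
    """Sensitivity = TP / (TP + FN); positive predictive value = TP / (TP + FP)."""
    pred = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return sens, ppv
```

AUC is independent of the decision threshold, whereas sensitivity and positive predictive value both change with it, which is one reason to report them together.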

Conclusions:

We successfully developed machine learning models capable of predicting high service level utilization with robust performance. These models can be integrated into wider diabetes-related population health initiatives.

Clinical Trial:

Not Applicable


 Citation

Please cite as:

Tan JK, Quan L, Salim NNM, Tan JH, Goh SY, Thumboo J, Bee YM

Machine-learning based prediction for high health care utilizers using a multi-institution diabetes registry: model training and evaluation.

JMIR Preprints. 16/03/2024:58463

DOI: 10.2196/preprints.58463

URL: https://preprints.jmir.org/preprint/58463


© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.
