Two-Step Predictive Model for Missed Appointments at Outpatient Primary Care Settings Serving Rural Areas
General Material Designation
[Thesis]
First Statement of Responsibility
Abu Lekham, Laith
Subsequent Statement of Responsibility
Khasawneh, Mohammad T.
PUBLICATION, DISTRIBUTION, ETC.
Name of Publisher, Distributor, etc.
State University of New York at Binghamton
Date of Publication, Distribution, etc.
2020
GENERAL NOTES
Text of Note
145 p.
DISSERTATION (THESIS) NOTE
Dissertation or thesis details and type of degree
M.S.
Body granting the degree
State University of New York at Binghamton
Text preceding or following the note
2020
SUMMARY OR ABSTRACT
Text of Note
Missed appointments are a significant cause of inefficiency in the healthcare industry. Many researchers have studied this problem in various healthcare settings. However, limited research has addressed the prediction of missed appointments at outpatient primary care settings serving rural areas. This study holistically investigates the factors behind two types of missed appointments, no-shows and cancellations, at an outpatient primary care medical center serving rural areas and develops a predictive model to reduce their incidence. The study was carried out in three main phases. First, exploratory data analysis was conducted to discover patterns related to the missed appointments, and a text mining framework was developed to conduct a root cause analysis (RCA) and Pareto analysis. Second, the association between selected attributes and appointment status was analyzed using the Chi-square test and Welch's t-test. Third, a two-step predictive model for missed appointments was built using machine learning classifiers. The first step of the model predicts missed appointments with more than one week's notice, without using weather forecasts. The second step predicts missed appointments with less than one week's notice by incorporating weather forecasts into the model. It was found that appointment lead time is a key driver of missed appointments: the longer the lead time, the more likely a patient is to miss an appointment. Also, the missed appointment rate decreases significantly as the day progresses. Based on the text mining framework for RCA and Pareto analysis, it was found that most of the missed appointments are related either to the patient or to processes. After hyperparameter tuning, the ensemble classifiers performed best among all the classifiers, with an average accuracy of more than 93%. Also, most of the classifiers showed moderate variability in their performance. Incorporating the weather as an external variable improved the performance of the model significantly, with an average best accuracy of 99%. Based on this analysis, some interventions were proposed to reduce the missed appointment rate, such as reducing appointment lead time and using the predictive model when scheduling patients.
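The two association tests named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation, which this record does not describe; the function names `welch_t` and `chi_square` are hypothetical, and all numbers are synthetic values chosen only for illustration. The sketch computes the test statistics and degrees of freedom using the Python standard library; obtaining p-values would require a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    # Squared standard error of each sample mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def chi_square(table):
    """Return (statistic, df) for Pearson's chi-square test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Synthetic lead times in days (assumed values, not data from the thesis).
lead_time_missed = [21, 30, 14, 28, 35]
lead_time_kept = [3, 7, 10, 5, 2, 9]
print("Welch t, df:", welch_t(lead_time_missed, lead_time_kept))

# Synthetic 2x2 table: rows = morning/afternoon, columns = missed/kept.
print("Chi-square, df:", chi_square([[40, 60], [25, 75]]))
```

A continuous attribute such as lead time suits Welch's t-test because it does not assume equal variances between the missed and kept groups, while a categorical attribute such as time of day suits the Chi-square test on a contingency table.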