Health care utilization routinely generates vast amounts of data from sources such as electronic medical records, insurance claims, vital signs, and patient-reported outcomes. Predicting health outcomes using data modeling approaches is an emerging field that can reveal important insights into disproportionate spending patterns. This book presents data-driven methods, especially machine learning, for understanding and approaching the high utilizer problem, using the example of a large public insurance program. It describes important goals for data-driven approaches to different aspects of the high utilizer problem and identifies the challenges this problem uniquely poses.
Key Features:
Introduces basic elements of health care data, especially administrative claims data, including disease codes, procedure codes, and drug codes
Provides tailored supervised and unsupervised machine learning approaches for understanding and predicting high utilizers
Presents descriptive data-driven methods for the high utilizer population
Identifies best-fitting linear and tree-based regression models to account for patients’ acute and chronic condition loads and demographic characteristics