Big Data Analytics to Build Risk Assessment Models of Postoperative Complications for Patients with Chronic Pulmonary Disease

Project: A - Government Institution › b - Ministry of Science and Technology

Description

Chronic obstructive pulmonary disease (COPD) is a major respiratory disease with high morbidity and mortality, and it represents an increasing burden worldwide. It was the fourth leading cause of death worldwide in 2008 and was expected to become the third by 2020. In the United States, the medical costs attributable to COPD and its sequelae were estimated at $32.1 billion in 2010 and projected to reach $49.0 billion in 2020. Many patients develop complications after surgery. Postoperative complications cause death and suffering, prolong hospital stays, and increase medical costs. However, large-scale, systematic research assessing the risk of postoperative complications has been lacking. In the United States, the National Surgical Quality Improvement Program (NSQIP) was the first large-scale study of surgical care quality. However, the NSQIP data were collected from regional Veterans Affairs Medical Centers, so the patients and the insurance payment models are too specific to be representative of the general population. In Taiwan, the coverage rate of National Health Insurance exceeds 99%, so research based on the National Health Insurance Research Database (NHIRD) would be highly representative. Previous NHIRD studies were mostly analyzed with statistical methods. However, newer data mining techniques such as artificial neural networks (ANNs), genetic algorithms, support vector machines (SVM), and sensitivity analysis have been developed for the analysis of big data. ANNs may be superior to conventional statistical analysis in several respects, including the ability to build non-linear models, tolerance of missing data, and high adaptability. Risk assessment models based on ANNs and data mining techniques may therefore have better predictive performance than those based on logistic regression analysis.
This study uses reimbursement claims data from the NHIRD, which include patient demographics, primary and secondary diagnoses, procedures, medical prescriptions, and expenditures, as well as all inpatient and outpatient parameters of medical services under the program. We use ANNs, SVM, and multivariate logistic regression to develop independent risk assessment models for each of the following postoperative complications: pneumonia, pulmonary embolism, respiratory failure, wound infection, sepsis, acute myocardial infarction, acute renal failure, stroke, and death. Sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve are used to measure the discrimination of the models. Calibration of the models is assessed using the Hosmer-Lemeshow goodness-of-fit (H) statistic. The risk assessment models can help surgeons evaluate the risk of postoperative complications for each patient with chronic pulmonary disease. With the help of these models, medical personnel can be more vigilant and adopt strategies to prevent postoperative complications or reduce the harm they cause.
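The discrimination and calibration measures named above can be illustrated with a minimal sketch in plain Python. It is not the project's actual code: the function names, the 0.5 probability cutoff, the number of risk groups, and the toy labels and probabilities are all assumptions made for illustration. The calibration function groups patients by fixed-width intervals of predicted risk, which is how the H variant of the Hosmer-Lemeshow statistic is usually described; grouping by deciles of risk instead would give the C variant.

```python
def confusion_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity, and accuracy at a probability cutoff.

    Assumes both classes (0 and 1) occur in y_true.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 0 and predicted == 1:
            fp += 1
        elif y == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def roc_auc(y_true, y_prob):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case has the
    higher predicted risk; ties count as half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow_h(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow chi-square statistic over fixed-width risk
    intervals: sum of (observed - expected)^2 / (n * pbar * (1 - pbar))."""
    bins = [[0, 0.0, 0] for _ in range(groups)]  # [n, expected, observed]
    for y, p in zip(y_true, y_prob):
        g = min(int(p * groups), groups - 1)  # p == 1.0 goes in the last bin
        bins[g][0] += 1
        bins[g][1] += p
        bins[g][2] += y
    stat = 0.0
    for n, expected, observed in bins:
        if n == 0:
            continue  # empty risk interval contributes nothing
        variance = expected * (1.0 - expected / n)
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
    return stat


# Toy data (hypothetical): 1 = complication occurred, 0 = no complication.
labels = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
print(confusion_metrics(labels, risks))
print(roc_auc(labels, risks))
print(hosmer_lemeshow_h(labels, risks, groups=2))
```

A smaller Hosmer-Lemeshow value indicates closer agreement between predicted and observed event counts. In practice the statistic is compared against a chi-square distribution, and the pairwise AUC loop would be replaced by a rank-based computation for data on the scale of the NHIRD.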
Status: Finished
Effective start/end date: 8/1/16 → 10/31/17