Background and objective: Fatty liver disease (FLD) is a common clinical complication; it is associated with high morbidity and mortality. However, an early prediction of FLD patients provides an opportunity to make an appropriate strategy for prevention, early diagnosis and treatment. We aimed to develop a machine learning model to predict FLD that could assist physicians in classifying high-risk patients and make a novel diagnosis, prevent and manage FLD. Methods: We included all patients who had an initial fatty liver screening at the New Taipei City Hospital between 1st and 31st December 2009. Classification models such as random forest (RF), Naïve Bayes (NB), artificial neural networks (ANN), and logistic regression (LR) were developed to predict FLD. The area under the receiver operating characteristic curve (ROC) was used to evaluate performances among the four models. Results: A total of 577 patients were included in this study; of those 377 patients had fatty liver. The area under the receiver operating characteristic (AUROC) of RF, NB, ANN, and LR with 10 fold-cross validation was 0.925, 0.888, 0.895, and 0.854 respectively. Additionally, The accuracy of RF, NB, ANN, and LR 87.48, 82.65, 81.85, and 76.96%. Conclusion: In this study, we developed and compared the four classification models to predict fatty liver disease accurately. However, the random forest model showed higher performance than other classification models. Implementation of a random forest model in the clinical setting could help physicians to stratify fatty liver patients for primary prevention, surveillance, early treatment, and management.
ASJC Scopus subject areas
- Computer Science Applications
- Health Informatics