Prediction of fatty liver disease using machine learning algorithms

Chieh Chen Wu, Wen Chun Yeh, Wen Ding Hsu, Md Mohaimenul Islam, Phung Anh (Alex) Nguyen, Tahmina Nasrin Poly, Yao Chin Wang, Hsuan Chia Yang, Yu Chuan (Jack) Li

Research output: Contribution to journal, Article

Abstract

Background and objective: Fatty liver disease (FLD) is a common clinical complication associated with high morbidity and mortality. Early prediction of FLD provides an opportunity to plan appropriate strategies for prevention, early diagnosis, and treatment. We aimed to develop a machine learning model to predict FLD that could assist physicians in classifying high-risk patients and in diagnosing, preventing, and managing FLD. Methods: We included all patients who had an initial fatty liver screening at the New Taipei City Hospital between 1 and 31 December 2009. Four classification models, random forest (RF), Naïve Bayes (NB), artificial neural network (ANN), and logistic regression (LR), were developed to predict FLD. The area under the receiver operating characteristic curve (AUROC) was used to compare the performance of the four models. Results: A total of 577 patients were included in this study, of whom 377 had fatty liver. With 10-fold cross-validation, the AUROC of RF, NB, ANN, and LR was 0.925, 0.888, 0.895, and 0.854, respectively. The accuracy of RF, NB, ANN, and LR was 87.48%, 82.65%, 81.85%, and 76.96%, respectively. Conclusion: We developed and compared four classification models to predict fatty liver disease. The random forest model performed better than the other three models. Implementing a random forest model in the clinical setting could help physicians stratify fatty liver patients for primary prevention, surveillance, early treatment, and management.
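The evaluation described in the abstract (AUROC and accuracy under 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' code or data. The function names `auroc`, `accuracy`, and `stratified_folds` are hypothetical, the scores are synthetic stand-ins for model outputs, and stratifying the folds by class is an assumption (the abstract only says 10-fold cross-validation). Only the cohort sizes (377 positive, 200 negative) come from the abstract.

```python
import random
from statistics import mean
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of cases where (score >= threshold) matches the true label."""
    correct = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return correct / len(labels)


def stratified_folds(labels: Sequence[int], k: int = 10,
                     seed: int = 0) -> List[List[int]]:
    """Split indices into k folds, dealing each class out round-robin so
    every fold keeps roughly the overall class ratio."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


if __name__ == "__main__":
    # Cohort sizes from the abstract: 377 with fatty liver, 200 without.
    labels = [1] * 377 + [0] * 200
    # Synthetic scores (an assumption). In a real study, each fold's scores
    # would come from a model trained on the other nine folds.
    rng = random.Random(42)
    scores = [min(1.0, max(0.0, rng.gauss(0.65 if y == 1 else 0.40, 0.15)))
              for y in labels]
    fold_aurocs = []
    for fold in stratified_folds(labels, k=10):
        fold_aurocs.append(auroc([labels[i] for i in fold],
                                 [scores[i] for i in fold]))
    print(f"mean AUROC over 10 folds: {mean(fold_aurocs):.3f}")
    print(f"accuracy at threshold 0.5: {accuracy(labels, scores):.3f}")
```

For context when reading the reported accuracies: with 377 of 577 patients positive, a classifier that always predicts fatty liver would reach about 65.3% accuracy, so AUROC, which does not depend on a single threshold, is a useful companion metric.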

Original language: English
Pages (from-to): 23-29
Number of pages: 7
Journal: Computer Methods and Programs in Biomedicine
Volume: 170
DOI: 10.1016/j.cmpb.2018.12.032
Publication status: Published - Mar 1 2019

Fingerprint

Fatty Liver
Liver
Learning algorithms
Learning systems
Liver Diseases
Logistics
Logistic Models
Neural networks
ROC Curve
Physicians
Machine Learning
Urban Hospitals
Primary Prevention
Early Diagnosis
Screening
Morbidity
Mortality
Forests
Therapeutics

Keywords

  • Classification model
  • Fatty liver disease
  • Machine learning
  • Random forest

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Health Informatics

Cite this

Prediction of fatty liver disease using machine learning algorithms. / Wu, Chieh Chen; Yeh, Wen Chun; Hsu, Wen Ding; Islam, Md Mohaimenul; Nguyen, Phung Anh (Alex); Poly, Tahmina Nasrin; Wang, Yao Chin; Yang, Hsuan Chia; (Jack) Li, Yu Chuan.

In: Computer Methods and Programs in Biomedicine, Vol. 170, 01.03.2019, p. 23-29.

@article{7c2414a744d04c09b0f87fa7415070aa,
title = "Prediction of fatty liver disease using machine learning algorithms",
keywords = "Classification model, Fatty liver disease, Machine learning, Random forest",
author = "Wu, {Chieh Chen} and Yeh, {Wen Chun} and Hsu, {Wen Ding} and Islam, {Md Mohaimenul} and Nguyen, {Phung Anh (Alex)} and Poly, {Tahmina Nasrin} and Wang, {Yao Chin} and Yang, {Hsuan Chia} and {(Jack) Li}, {Yu Chuan}",
year = "2019",
month = "3",
day = "1",
doi = "10.1016/j.cmpb.2018.12.032",
language = "English",
volume = "170",
pages = "23--29",
journal = "Computer Methods and Programs in Biomedicine",
issn = "0169-2607",
publisher = "Elsevier Ireland Ltd",

}

TY - JOUR

T1 - Prediction of fatty liver disease using machine learning algorithms

AU - Wu, Chieh Chen

AU - Yeh, Wen Chun

AU - Hsu, Wen Ding

AU - Islam, Md Mohaimenul

AU - Nguyen, Phung Anh (Alex)

AU - Poly, Tahmina Nasrin

AU - Wang, Yao Chin

AU - Yang, Hsuan Chia

AU - (Jack) Li, Yu Chuan

PY - 2019/3/1

Y1 - 2019/3/1

KW - Classification model

KW - Fatty liver disease

KW - Machine learning

KW - Random forest

UR - http://www.scopus.com/inward/record.url?scp=85059666370&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059666370&partnerID=8YFLogxK

U2 - 10.1016/j.cmpb.2018.12.032

DO - 10.1016/j.cmpb.2018.12.032

M3 - Article

AN - SCOPUS:85059666370

VL - 170

SP - 23

EP - 29

JO - Computer Methods and Programs in Biomedicine

JF - Computer Methods and Programs in Biomedicine

SN - 0169-2607

ER -