Prediction of fatty liver disease using machine learning algorithms

Chieh Chen Wu, Wen Chun Yeh, Wen Ding Hsu, Md Mohaimenul Islam, Phung Anh (Alex) Nguyen, Tahmina Nasrin Poly, Yao Chin Wang, Hsuan Chia Yang, Yu Chuan (Jack) Li

Research output: Contribution to journal › Article

Abstract

Background and objective: Fatty liver disease (FLD) is a common clinical complication associated with high morbidity and mortality. Early prediction of FLD provides an opportunity to develop an appropriate strategy for prevention, early diagnosis, and treatment. We aimed to develop a machine learning model to predict FLD that could assist physicians in classifying high-risk patients and in diagnosing, preventing, and managing FLD. Methods: We included all patients who had an initial fatty liver screening at the New Taipei City Hospital between 1st and 31st December 2009. Four classification models, random forest (RF), Naïve Bayes (NB), artificial neural networks (ANN), and logistic regression (LR), were developed to predict FLD. The area under the receiver operating characteristic curve (AUROC) was used to compare the performance of the four models. Results: A total of 577 patients were included in this study, of whom 377 had fatty liver. With 10-fold cross-validation, the AUROC of RF, NB, ANN, and LR was 0.925, 0.888, 0.895, and 0.854, respectively. The accuracy of RF, NB, ANN, and LR was 87.48%, 82.65%, 81.85%, and 76.96%, respectively. Conclusion: In this study, we developed and compared four classification models to predict fatty liver disease. The random forest model showed higher performance than the other classification models. Implementation of a random forest model in the clinical setting could help physicians to stratify fatty liver patients for primary prevention, surveillance, early treatment, and management.
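The two metrics reported in the abstract, AUROC and accuracy, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy labels and scores are hypothetical and chosen only for illustration. The only study figures used are the patient counts (377 of 577 with fatty liver), which give the accuracy of a trivial "always predict FLD" baseline for context.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases where the thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Counts reported in the abstract: 377 of 577 patients had fatty liver.
N_TOTAL, N_FLD = 577, 377
# Accuracy of a classifier that always predicts FLD (majority class).
majority_baseline = N_FLD / N_TOTAL

# Hypothetical toy data, for illustration only (not study data).
toy_labels = [1, 1, 1, 0, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]

if __name__ == "__main__":
    print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
    print(f"toy AUROC: {auroc(toy_labels, toy_scores):.3f}")
    print(f"toy accuracy: {accuracy(toy_labels, toy_scores):.3f}")
```

Because about 65% of the cohort had fatty liver, the reported accuracies (76.96% to 87.48%) are best read against that majority-class baseline rather than against 50%. AUROC does not depend on a decision threshold, which is one reason it is commonly used alongside accuracy when classes are imbalanced.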
Original language: English
Pages (from-to): 23-29
Number of pages: 7
Journal: Computer Methods and Programs in Biomedicine
Volume: 170
DOIs: 10.1016/j.cmpb.2018.12.032
Publication status: Published - 1 March 2019

Fingerprint

Fatty Liver
Liver
Learning algorithms
Learning systems
Liver Diseases
Logistics
Logistic Models
Neural networks
ROC Curve
Physicians
Machine Learning
Urban Hospitals
Primary Prevention
Early Diagnosis
Screening
Morbidity
Mortality
Forests
Therapeutics

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Health Informatics

Cite this

Prediction of fatty liver disease using machine learning algorithms. / Wu, Chieh Chen; Yeh, Wen Chun; Hsu, Wen Ding; Islam, Md Mohaimenul; Nguyen, Phung Anh (Alex); Poly, Tahmina Nasrin; Wang, Yao Chin; Yang, Hsuan Chia; (Jack) Li, Yu Chuan.

In: Computer Methods and Programs in Biomedicine, Vol. 170, 01.03.2019, p. 23-29.

Wu, Chieh Chen ; Yeh, Wen Chun ; Hsu, Wen Ding ; Islam, Md Mohaimenul ; Nguyen, Phung Anh (Alex) ; Poly, Tahmina Nasrin ; Wang, Yao Chin ; Yang, Hsuan Chia ; (Jack) Li, Yu Chuan. / Prediction of fatty liver disease using machine learning algorithms. In: Computer Methods and Programs in Biomedicine. 2019 ; Vol. 170. pp. 23-29.
@article{7c2414a744d04c09b0f87fa7415070aa,
title = "Prediction of fatty liver disease using machine learning algorithms",
abstract = "Background and objective: Fatty liver disease (FLD) is a common clinical complication associated with high morbidity and mortality. Early prediction of FLD provides an opportunity to develop an appropriate strategy for prevention, early diagnosis, and treatment. We aimed to develop a machine learning model to predict FLD that could assist physicians in classifying high-risk patients and in diagnosing, preventing, and managing FLD. Methods: We included all patients who had an initial fatty liver screening at the New Taipei City Hospital between 1st and 31st December 2009. Four classification models, random forest (RF), Na{\"i}ve Bayes (NB), artificial neural networks (ANN), and logistic regression (LR), were developed to predict FLD. The area under the receiver operating characteristic curve (AUROC) was used to compare the performance of the four models. Results: A total of 577 patients were included in this study, of whom 377 had fatty liver. With 10-fold cross-validation, the AUROC of RF, NB, ANN, and LR was 0.925, 0.888, 0.895, and 0.854, respectively. The accuracy of RF, NB, ANN, and LR was 87.48{\%}, 82.65{\%}, 81.85{\%}, and 76.96{\%}, respectively. Conclusion: In this study, we developed and compared four classification models to predict fatty liver disease. The random forest model showed higher performance than the other classification models. Implementation of a random forest model in the clinical setting could help physicians to stratify fatty liver patients for primary prevention, surveillance, early treatment, and management.",
keywords = "Classification model, Fatty liver disease, Machine learning, Random forest",
author = "Wu, {Chieh Chen} and Yeh, {Wen Chun} and Hsu, {Wen Ding} and Islam, {Md Mohaimenul} and Nguyen, {Phung Anh (Alex)} and Poly, {Tahmina Nasrin} and Wang, {Yao Chin} and Yang, {Hsuan Chia} and {(Jack) Li}, {Yu Chuan}",
year = "2019",
month = "3",
day = "1",
doi = "10.1016/j.cmpb.2018.12.032",
language = "English",
volume = "170",
pages = "23--29",
journal = "Computer Methods and Programs in Biomedicine",
issn = "0169-2607",
publisher = "Elsevier Ireland Ltd",

}

TY - JOUR

T1 - Prediction of fatty liver disease using machine learning algorithms

AU - Wu, Chieh Chen

AU - Yeh, Wen Chun

AU - Hsu, Wen Ding

AU - Islam, Md Mohaimenul

AU - Nguyen, Phung Anh (Alex)

AU - Poly, Tahmina Nasrin

AU - Wang, Yao Chin

AU - Yang, Hsuan Chia

AU - (Jack) Li, Yu Chuan

PY - 2019/3/1

Y1 - 2019/3/1

N2 - Background and objective: Fatty liver disease (FLD) is a common clinical complication associated with high morbidity and mortality. Early prediction of FLD provides an opportunity to develop an appropriate strategy for prevention, early diagnosis, and treatment. We aimed to develop a machine learning model to predict FLD that could assist physicians in classifying high-risk patients and in diagnosing, preventing, and managing FLD. Methods: We included all patients who had an initial fatty liver screening at the New Taipei City Hospital between 1st and 31st December 2009. Four classification models, random forest (RF), Naïve Bayes (NB), artificial neural networks (ANN), and logistic regression (LR), were developed to predict FLD. The area under the receiver operating characteristic curve (AUROC) was used to compare the performance of the four models. Results: A total of 577 patients were included in this study, of whom 377 had fatty liver. With 10-fold cross-validation, the AUROC of RF, NB, ANN, and LR was 0.925, 0.888, 0.895, and 0.854, respectively. The accuracy of RF, NB, ANN, and LR was 87.48%, 82.65%, 81.85%, and 76.96%, respectively. Conclusion: In this study, we developed and compared four classification models to predict fatty liver disease. The random forest model showed higher performance than the other classification models. Implementation of a random forest model in the clinical setting could help physicians to stratify fatty liver patients for primary prevention, surveillance, early treatment, and management.

KW - Classification model

KW - Fatty liver disease

KW - Machine learning

KW - Random forest

UR - http://www.scopus.com/inward/record.url?scp=85059666370&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059666370&partnerID=8YFLogxK

U2 - 10.1016/j.cmpb.2018.12.032

DO - 10.1016/j.cmpb.2018.12.032

M3 - Article

VL - 170

SP - 23

EP - 29

JO - Computer Methods and Programs in Biomedicine

JF - Computer Methods and Programs in Biomedicine

SN - 0169-2607

ER -