A pan-genome-based machine learning approach for predicting antimicrobial resistance activities of the Escherichia coli strains

Hsuan Lin Her, Yu Wei Wu

Research output: Contribution to journal (article)

6 Citations (Scopus)

Abstract

Motivation: Antimicrobial resistance (AMR) is becoming a huge problem in both developed and developing countries, and identifying strains resistant or susceptible to certain antibiotics is essential in fighting against antibiotic-resistant pathogens. Whole-genome sequences have been collected for different microbial strains in order to identify crucial characteristics that allow certain strains to become resistant to antibiotics; however, a global inspection of the gene content responsible for AMR activities remains to be done. Results: We propose a pan-genome-based approach to characterize antibiotic-resistant microbial strains and test this approach on the bacterial model organism Escherichia coli. By identifying core and accessory gene clusters and predicting AMR genes for the E. coli pan-genome, we not only showed that certain classes of genes are unevenly distributed between the core and accessory parts of the pan-genome but also demonstrated that only a portion of the identified AMR genes belong to the accessory genome. Application of machine learning algorithms to predict whether specific strains were resistant to antibiotic drugs yielded the best prediction accuracy for the set of AMR genes within the accessory part of the pan-genome, suggesting that these gene clusters were most crucial to AMR activities in E. coli. Selecting subsets of AMR genes for different antibiotic drugs based on a genetic algorithm (GA) achieved better prediction performances than the gene sets established in the literature, hinting that the gene sets selected by the GA may warrant further analysis in investigating more details about how E. coli fight against antibiotics.
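The core/accessory split described in the abstract can be illustrated with a minimal sketch. It assumes each strain is represented as a set of gene-cluster identifiers and that a cluster counts as "core" when it occurs in at least a chosen fraction of strains. The function names, the `core_fraction` threshold, and the toy strain and cluster names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def split_core_accessory(
    presence: Dict[str, Set[str]],
    core_fraction: float = 1.0,
) -> Tuple[List[str], List[str]]:
    """Split gene clusters into core and accessory lists.

    `presence` maps a strain name to the set of gene clusters found in it.
    A cluster is treated as core when it occurs in at least `core_fraction`
    of the strains (an assumed threshold, not the paper's definition).
    """
    n_strains = len(presence)
    if n_strains == 0:
        return [], []
    # Count in how many strains each cluster appears.
    counts: Dict[str, int] = {}
    for clusters in presence.values():
        for cluster in clusters:
            counts[cluster] = counts.get(cluster, 0) + 1
    core: List[str] = []
    accessory: List[str] = []
    for cluster in sorted(counts):
        if counts[cluster] / n_strains >= core_fraction:
            core.append(cluster)
        else:
            accessory.append(cluster)
    return core, accessory


def presence_vector(strain_clusters: Set[str], features: List[str]) -> List[int]:
    """Encode one strain as a 0/1 presence vector over the chosen clusters."""
    return [1 if f in strain_clusters else 0 for f in features]


# Toy data: hypothetical strains and cluster names.
toy = {
    "strain_A": {"c1", "c2", "c3"},
    "strain_B": {"c1", "c3"},
    "strain_C": {"c1", "c4"},
}
core, accessory = split_core_accessory(toy)
```

Vectors such as `presence_vector(toy["strain_B"], accessory)` could then serve as inputs to a resistant/susceptible classifier. The choice of classifier and the genetic-algorithm gene selection mentioned in the abstract are outside this sketch.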
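Original language: English
Pages (from-to): i89-i95
Journal: Bioinformatics
Volume: 34
Issue number: 13
DOI: 10.1093/bioinformatics/bty276
Publication status: Published - July 1, 2018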

Fingerprint

Escherichia coli
Escherichia Coli
Learning systems
Machine Learning
Genome
Genes
Antibiotics
Gene
Anti-Bacterial Agents
Accessories
Multigene Family
Drugs
Resistance
Genetic Algorithm
Developed Countries
Pharmaceutical Preparations
Developing Countries
Performance Prediction
Genetic algorithms
Inspection

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

A pan-genome-based machine learning approach for predicting antimicrobial resistance activities of the Escherichia coli strains. / Her, Hsuan Lin; Wu, Yu Wei.

In: Bioinformatics, Vol. 34, No. 13, 01.07.2018, p. i89-i95.

@article{72258b447e404181a7e6e062c2db0bef,
title = "A pan-genome-based machine learning approach for predicting antimicrobial resistance activities of the Escherichia coli strains",
abstract = "Motivation: Antimicrobial resistance (AMR) is becoming a huge problem in both developed and developing countries, and identifying strains resistant or susceptible to certain antibiotics is essential in fighting against antibiotic-resistant pathogens. Whole-genome sequences have been collected for different microbial strains in order to identify crucial characteristics that allow certain strains to become resistant to antibiotics; however, a global inspection of the gene content responsible for AMR activities remains to be done. Results: We propose a pan-genome-based approach to characterize antibiotic-resistant microbial strains and test this approach on the bacterial model organism Escherichia coli. By identifying core and accessory gene clusters and predicting AMR genes for the E. coli pan-genome, we not only showed that certain classes of genes are unevenly distributed between the core and accessory parts of the pan-genome but also demonstrated that only a portion of the identified AMR genes belong to the accessory genome. Application of machine learning algorithms to predict whether specific strains were resistant to antibiotic drugs yielded the best prediction accuracy for the set of AMR genes within the accessory part of the pan-genome, suggesting that these gene clusters were most crucial to AMR activities in E. coli. Selecting subsets of AMR genes for different antibiotic drugs based on a genetic algorithm (GA) achieved better prediction performances than the gene sets established in the literature, hinting that the gene sets selected by the GA may warrant further analysis in investigating more details about how E. coli fight against antibiotics.",
author = "Her, {Hsuan Lin} and Wu, {Yu Wei}",
year = "2018",
month = "7",
day = "1",
doi = "10.1093/bioinformatics/bty276",
language = "English",
volume = "34",
pages = "i89--i95",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "13",
}

TY - JOUR
T1 - A pan-genome-based machine learning approach for predicting antimicrobial resistance activities of the Escherichia coli strains
AU - Her, Hsuan Lin
AU - Wu, Yu Wei
PY - 2018/7/1
Y1 - 2018/7/1

N2 - Motivation: Antimicrobial resistance (AMR) is becoming a huge problem in both developed and developing countries, and identifying strains resistant or susceptible to certain antibiotics is essential in fighting against antibiotic-resistant pathogens. Whole-genome sequences have been collected for different microbial strains in order to identify crucial characteristics that allow certain strains to become resistant to antibiotics; however, a global inspection of the gene content responsible for AMR activities remains to be done. Results: We propose a pan-genome-based approach to characterize antibiotic-resistant microbial strains and test this approach on the bacterial model organism Escherichia coli. By identifying core and accessory gene clusters and predicting AMR genes for the E. coli pan-genome, we not only showed that certain classes of genes are unevenly distributed between the core and accessory parts of the pan-genome but also demonstrated that only a portion of the identified AMR genes belong to the accessory genome. Application of machine learning algorithms to predict whether specific strains were resistant to antibiotic drugs yielded the best prediction accuracy for the set of AMR genes within the accessory part of the pan-genome, suggesting that these gene clusters were most crucial to AMR activities in E. coli. Selecting subsets of AMR genes for different antibiotic drugs based on a genetic algorithm (GA) achieved better prediction performances than the gene sets established in the literature, hinting that the gene sets selected by the GA may warrant further analysis in investigating more details about how E. coli fight against antibiotics.

AB - Motivation: Antimicrobial resistance (AMR) is becoming a huge problem in both developed and developing countries, and identifying strains resistant or susceptible to certain antibiotics is essential in fighting against antibiotic-resistant pathogens. Whole-genome sequences have been collected for different microbial strains in order to identify crucial characteristics that allow certain strains to become resistant to antibiotics; however, a global inspection of the gene content responsible for AMR activities remains to be done. Results: We propose a pan-genome-based approach to characterize antibiotic-resistant microbial strains and test this approach on the bacterial model organism Escherichia coli. By identifying core and accessory gene clusters and predicting AMR genes for the E. coli pan-genome, we not only showed that certain classes of genes are unevenly distributed between the core and accessory parts of the pan-genome but also demonstrated that only a portion of the identified AMR genes belong to the accessory genome. Application of machine learning algorithms to predict whether specific strains were resistant to antibiotic drugs yielded the best prediction accuracy for the set of AMR genes within the accessory part of the pan-genome, suggesting that these gene clusters were most crucial to AMR activities in E. coli. Selecting subsets of AMR genes for different antibiotic drugs based on a genetic algorithm (GA) achieved better prediction performances than the gene sets established in the literature, hinting that the gene sets selected by the GA may warrant further analysis in investigating more details about how E. coli fight against antibiotics.

UR - http://www.scopus.com/inward/record.url?scp=85050805712&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050805712&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bty276
DO - 10.1093/bioinformatics/bty276
M3 - Article
AN - SCOPUS:85050805712
VL - 34
SP - i89-i95
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 13
ER -