Gene selection for cancer identification: A decision tree model empowered by particle swarm optimization algorithm

Kun Huang Chen, Kung Jeng Wang, Min Lung Tsai, Kung Min Wang, Angelia M. Adrian, Wei Chung Cheng, Tzu Sen Yang, Nai Chia Teng, Kuo Pin Tan, Ku Shang Chang

Research output: Contribution to journal (Article)

66 Citations (Scopus)

Abstract

Background: In microarray data analysis, selecting a small number of informative genes from the thousands that may contribute to the occurrence of cancer is an important problem. Many researchers have applied computational intelligence methods to analyze gene expression data.

Results: To select genes efficiently from thousands of candidates that can contribute to identifying cancers, this study develops a novel method that combines particle swarm optimization with a decision tree as the classifier. The study also compares the proposed method with other well-known benchmark classification methods (support vector machine, self-organizing map, back-propagation neural network, C4.5 decision tree, Naive Bayes, CART decision tree, and artificial immune recognition system) in experiments on 11 gene expression cancer datasets.

Conclusion: Based on statistical analysis, the proposed method outperforms other popular classifiers on all test datasets and is comparable to SVM on certain specific datasets. Furthermore, housekeeping genes with various expression patterns and tissue-specific genes are identified. These genes provide high discriminative power for cancer classification.
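The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of wrapping a tree-based classifier inside a particle swarm search over gene subsets. It assumes a binary PSO variant (sigmoid-transformed velocities), a one-split decision stump standing in for a full decision tree, training accuracy minus a subset-size penalty as the fitness, and synthetic data. All function names and parameter values here are hypothetical and are not taken from the paper.

```python
import math
import random


def stump_accuracy(X, y, features):
    """Best training accuracy of a one-split decision stump over the chosen features."""
    if not features:
        return 0.0  # an empty gene subset cannot classify anything
    n = len(y)
    best = max(sum(y), n - sum(y)) / n  # majority-class baseline
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            # Rule "predict class 1 when value > t"; 1 - acc covers the flipped rule.
            correct = sum((row[f] > t) == bool(label) for row, label in zip(X, y))
            best = max(best, correct / n, 1 - correct / n)
    return best


def binary_pso_select(X, y, n_particles=10, n_iters=20,
                      w=0.7, c1=1.5, c2=1.5, penalty=0.1, seed=0):
    """Search for a small feature subset with binary PSO (assumed settings)."""
    rng = random.Random(seed)
    d = len(X[0])

    def fitness(bits):
        chosen = [j for j, b in enumerate(bits) if b]
        return stump_accuracy(X, y, chosen) - penalty * len(chosen) / d

    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(d):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][j]
                     + c1 * r1 * (pbest[i][j] - pos[i][j])
                     + c2 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))  # clamp to keep the sigmoid responsive
                # The sigmoid of the velocity is the probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob_one else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return [j for j, b in enumerate(gbest) if b], gbest_fit


def make_toy_data(n=40, d=6, informative=2, seed=1):
    """Synthetic 'expression' matrix: one informative column, the rest noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.random() for _ in range(d)]
        row[informative] = label + 0.1 * rng.random()  # cleanly separated by class
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    selected, score = binary_pso_select(X, y)
    print("selected feature indices:", selected, "fitness:", round(score, 3))
```

On real microarray data one would normally score each candidate subset with cross-validated accuracy of a full decision tree rather than the training accuracy of a stump; the stump is used here only to keep the sketch dependency-free.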

Original language: English
Article number: 49
Journal: BMC Bioinformatics
Volume: 15
Issue number: 1
DOI: https://doi.org/10.1186/1471-2105-15-49
Publication status: Published - Feb 20 2014

Fingerprint

Gene Selection
Decision Trees
Neoplasm Genes
Decision trees
Particle Swarm Optimization Algorithm
Decision tree
Particle swarm optimization (PSO)
Identification (control systems)
Cancer
Genes
Gene
Neoplasms
Gene expression
Classifier
Model
Classifiers
Cancer Classification
Gene Expression
Benchmarking
Naive Bayes

Keywords

  • Cancer
  • Decision tree classifier
  • Gene expression
  • Particle swarm optimization

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
  • Structural Biology

Cite this

Chen, K. H., Wang, K. J., Tsai, M. L., Wang, K. M., Adrian, A. M., Cheng, W. C., ... Chang, K. S. (2014). Gene selection for cancer identification: A decision tree model empowered by particle swarm optimization algorithm. BMC Bioinformatics, 15(1), [49]. https://doi.org/10.1186/1471-2105-15-49

Gene selection for cancer identification: A decision tree model empowered by particle swarm optimization algorithm. / Chen, Kun Huang; Wang, Kung Jeng; Tsai, Min Lung; Wang, Kung Min; Adrian, Angelia M.; Cheng, Wei Chung; Yang, Tzu Sen; Teng, Nai Chia; Tan, Kuo Pin; Chang, Ku Shang.

In: BMC Bioinformatics, Vol. 15, No. 1, 49, 20.02.2014.

Chen, Kun Huang; Wang, Kung Jeng; Tsai, Min Lung; Wang, Kung Min; Adrian, Angelia M.; Cheng, Wei Chung; Yang, Tzu Sen; Teng, Nai Chia; Tan, Kuo Pin; Chang, Ku Shang. / Gene selection for cancer identification: A decision tree model empowered by particle swarm optimization algorithm. In: BMC Bioinformatics. 2014; Vol. 15, No. 1.
@article{ac12e58ff81745118875d8dada1caa87,
title = "Gene selection for cancer identification: A decision tree model empowered by particle swarm optimization algorithm",
abstract = "Background: In microarray data analysis, selecting a small number of informative genes from the thousands that may contribute to the occurrence of cancer is an important problem. Many researchers have applied computational intelligence methods to analyze gene expression data. Results: To select genes efficiently from thousands of candidates that can contribute to identifying cancers, this study develops a novel method that combines particle swarm optimization with a decision tree as the classifier. The study also compares the proposed method with other well-known benchmark classification methods (support vector machine, self-organizing map, back-propagation neural network, C4.5 decision tree, Naive Bayes, CART decision tree, and artificial immune recognition system) in experiments on 11 gene expression cancer datasets. Conclusion: Based on statistical analysis, the proposed method outperforms other popular classifiers on all test datasets and is comparable to SVM on certain specific datasets. Furthermore, housekeeping genes with various expression patterns and tissue-specific genes are identified. These genes provide high discriminative power for cancer classification.",
keywords = "Cancer, Decision tree classifier, Gene expression, Particle swarm optimization",
author = "Chen, {Kun Huang} and Wang, {Kung Jeng} and Tsai, {Min Lung} and Wang, {Kung Min} and Adrian, {Angelia M.} and Cheng, {Wei Chung} and Yang, {Tzu Sen} and Teng, {Nai Chia} and Tan, {Kuo Pin} and Chang, {Ku Shang}",
year = "2014",
month = "2",
day = "20",
doi = "10.1186/1471-2105-15-49",
language = "English",
volume = "15",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Gene selection for cancer identification

T2 - A decision tree model empowered by particle swarm optimization algorithm

AU - Chen, Kun Huang

AU - Wang, Kung Jeng

AU - Tsai, Min Lung

AU - Wang, Kung Min

AU - Adrian, Angelia M.

AU - Cheng, Wei Chung

AU - Yang, Tzu Sen

AU - Teng, Nai Chia

AU - Tan, Kuo Pin

AU - Chang, Ku Shang

PY - 2014/2/20

Y1 - 2014/2/20

AB - Background: In microarray data analysis, selecting a small number of informative genes from the thousands that may contribute to the occurrence of cancer is an important problem. Many researchers have applied computational intelligence methods to analyze gene expression data. Results: To select genes efficiently from thousands of candidates that can contribute to identifying cancers, this study develops a novel method that combines particle swarm optimization with a decision tree as the classifier. The study also compares the proposed method with other well-known benchmark classification methods (support vector machine, self-organizing map, back-propagation neural network, C4.5 decision tree, Naive Bayes, CART decision tree, and artificial immune recognition system) in experiments on 11 gene expression cancer datasets. Conclusion: Based on statistical analysis, the proposed method outperforms other popular classifiers on all test datasets and is comparable to SVM on certain specific datasets. Furthermore, housekeeping genes with various expression patterns and tissue-specific genes are identified. These genes provide high discriminative power for cancer classification.

KW - Cancer

KW - Decision tree classifier

KW - Gene expression

KW - Particle swarm optimization

UR - http://www.scopus.com/inward/record.url?scp=84897645454&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84897645454&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-15-49

DO - 10.1186/1471-2105-15-49

M3 - Article

C2 - 24555567

AN - SCOPUS:84897645454

VL - 15

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - 1

M1 - 49

ER -