Protein subcellular localization prediction based on compartment-specific biological features.

Chia Yu Su, Allan Lo, Hua Sheng Chiu, Ting Yi Sung, Wen Lian Hsu

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Prediction of subcellular localization of proteins is important for genome annotation, protein function prediction, and drug discovery. We present a prediction method for Gram-negative bacteria that uses ten one-versus-one support vector machine (SVM) classifiers, where compartment-specific biological features are selected as input to each SVM classifier. The final prediction of localization sites is determined by integrating the results from ten binary classifiers using a combination of majority votes and a probabilistic method. The overall accuracy reaches 91.4%, which is 1.6% better than the state-of-the-art system, in a ten-fold cross-validation evaluation on a benchmark data set. We demonstrate that feature selection guided by biological knowledge and insights in one-versus-one SVM classifiers can lead to a significant improvement in the prediction performance. Our model is also used to produce highly accurate prediction of 92.8% overall accuracy for proteins of dual localizations.
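The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The five compartment labels are an assumption (five classes yield the ten pairwise classifiers mentioned above), and the tie-breaking rule shown here (summed pairwise probability) is only a hypothetical stand-in for the paper's probabilistic method, which the abstract does not specify. The function name `combine_pairwise` and its input format are likewise illustrative.

```python
from itertools import combinations

# Assumed compartment labels; five classes give C(5, 2) = 10 pairwise classifiers.
COMPARTMENTS = [
    "cytoplasm",
    "inner_membrane",
    "periplasm",
    "outer_membrane",
    "extracellular",
]
PAIRS = list(combinations(COMPARTMENTS, 2))


def combine_pairwise(pair_probs):
    """Pick one compartment from ten one-versus-one outputs.

    pair_probs maps each pair (a, b) in PAIRS to the probability that the
    binary classifier for that pair assigns to class a (so b gets 1 - p).
    """
    votes = {c: 0 for c in COMPARTMENTS}
    support = {c: 0.0 for c in COMPARTMENTS}
    for a, b in PAIRS:
        p = pair_probs[(a, b)]
        # Majority vote: each binary classifier votes for its preferred class.
        votes[a if p >= 0.5 else b] += 1
        # Accumulate probabilities for the tie-break (hypothetical rule).
        support[a] += p
        support[b] += 1.0 - p
    top = max(votes.values())
    tied = [c for c in COMPARTMENTS if votes[c] == top]
    # If several classes share the top vote count, prefer the one with the
    # largest summed pairwise probability.
    return max(tied, key=lambda c: support[c])
```

In this sketch the probabilities would come from probability-calibrated binary SVMs, each trained on its own compartment-specific feature set; that training step is omitted here.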

Original language: English
Pages (from-to): 325-330
Number of pages: 6
Journal: Computational systems bioinformatics / Life Sciences Society. Computational Systems Bioinformatics Conference
Publication status: Published - 2006
Externally published: Yes

Fingerprint

Molecular Sequence Annotation
Benchmarking
Proteins
Drug Discovery
Gram-Negative Bacteria
Genome
Support Vector Machine
Datasets

ASJC Scopus subject areas

  • Medicine(all)

Cite this

@article{20527e6766a2475eb6ac6974f09cc16c,
title = "Protein subcellular localization prediction based on compartment-specific biological features.",
abstract = "Prediction of subcellular localization of proteins is important for genome annotation, protein function prediction, and drug discovery. We present a prediction method for Gram-negative bacteria that uses ten one-versus-one support vector machine (SVM) classifiers, where compartment-specific biological features are selected as input to each SVM classifier. The final prediction of localization sites is determined by integrating the results from ten binary classifiers using a combination of majority votes and a probabilistic method. The overall accuracy reaches 91.4{\%}, which is 1.6{\%} better than the state-of-the-art system, in a ten-fold cross-validation evaluation on a benchmark data set. We demonstrate that feature selection guided by biological knowledge and insights in one-versus-one SVM classifiers can lead to a significant improvement in the prediction performance. Our model is also used to produce highly accurate prediction of 92.8{\%} overall accuracy for proteins of dual localizations.",
author = "Su, {Chia Yu} and Allan Lo and Chiu, {Hua Sheng} and Sung, {Ting Yi} and Hsu, {Wen Lian}",
year = "2006",
language = "English",
pages = "325--330",
journal = "Computational systems bioinformatics / Life Sciences Society. Computational Systems Bioinformatics Conference",
issn = "1752-7791",
publisher = "Imperial College Press",

}

TY - JOUR

T1 - Protein subcellular localization prediction based on compartment-specific biological features.

AU - Su, Chia Yu

AU - Lo, Allan

AU - Chiu, Hua Sheng

AU - Sung, Ting Yi

AU - Hsu, Wen Lian

PY - 2006

Y1 - 2006

AB - Prediction of subcellular localization of proteins is important for genome annotation, protein function prediction, and drug discovery. We present a prediction method for Gram-negative bacteria that uses ten one-versus-one support vector machine (SVM) classifiers, where compartment-specific biological features are selected as input to each SVM classifier. The final prediction of localization sites is determined by integrating the results from ten binary classifiers using a combination of majority votes and a probabilistic method. The overall accuracy reaches 91.4%, which is 1.6% better than the state-of-the-art system, in a ten-fold cross-validation evaluation on a benchmark data set. We demonstrate that feature selection guided by biological knowledge and insights in one-versus-one SVM classifiers can lead to a significant improvement in the prediction performance. Our model is also used to produce highly accurate prediction of 92.8% overall accuracy for proteins of dual localizations.

UR - http://www.scopus.com/inward/record.url?scp=34250876252&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34250876252&partnerID=8YFLogxK

M3 - Article

SP - 325

EP - 330

JO - Computational systems bioinformatics / Life Sciences Society. Computational Systems Bioinformatics Conference

JF - Computational systems bioinformatics / Life Sciences Society. Computational Systems Bioinformatics Conference

SN - 1752-7791

ER -