Abstract
We present a novel method to address multi-labeled protein subcellular localization prediction in Gram-negative bacteria using support vector machines (SVM) as classifiers. For a given protein sequence that may have more than one label, features are extracted from amino acid composition and molecular function related terms in Gene Ontology (GO) as input to SVM. We apply one-against-others SVM to proteins of Gram-negative bacteria in a 5-fold cross-validation. The results of the multi-labeled predictions are evaluated based on two criteria: class number and class category. For the first criterion, our method predicts the number of classes (class number) for each protein at an accuracy rate of 94.1%. For the second criterion, we compare the categories of the actual classes with the predicted classes proportionate to ranks, and obtain an accuracy of 83.2%. Our method is the first approach to predict and evaluate multi-labeled protein subcellular localization for prokaryotic bacteria and we demonstrate that it has a good predictive power.
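As a rough illustration of two pieces mentioned in the abstract, the sketch below computes an amino acid composition feature vector and a class-number accuracy. It is not the authors' implementation. The function names `amino_acid_composition` and `class_number_accuracy` are hypothetical, and the definition of class-number accuracy used here (the fraction of proteins whose predicted label count matches the true label count) is an assumption based on the wording of the abstract. The GO-term features, the SVM training, and the rank-based category criterion are not shown.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (fixed feature order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue.

    Characters outside the 20 standard amino acids are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def class_number_accuracy(true_labels, predicted_labels):
    """Assumed criterion: share of proteins whose predicted number of
    localization labels equals the true number of labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        return 0.0
    hits = sum(
        len(set(t)) == len(set(p))
        for t, p in zip(true_labels, predicted_labels)
    )
    return hits / len(true_labels)
```

In a full pipeline, vectors like the one returned by `amino_acid_composition` would presumably be concatenated with GO-derived features and passed to one binary SVM per localization site.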
Original language | English |
---|---|
Title of host publication | 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts |
Pages | 79-80 |
Number of pages | 2 |
Publication status | Published - 2005 |
Externally published | Yes |
Event | 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts - Stanford, CA, United States. Duration: Aug 8 2005 → Aug 11 2005 |
Other
Other | 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts |
---|---|
Country | United States |
City | Stanford, CA |
Period | 8/8/05 → 8/11/05 |
ASJC Scopus subject areas
- Engineering(all)