Prediction of protein subchloroplast locations using random forests

Chun Wei Tung, Chyn Liaw, Shinn Jang Ho, Shinn Ying Ho

Research output: Contribution to journal (Article)

8 Citations (Scopus)

Abstract

The subchloroplast location of a protein is correlated with its function. Although a large number of protein sequences are available, much less is known about their locations and functions. Experimental identification of protein locations and functions is costly and time-consuming. Accurate prediction of protein subchloroplast locations can therefore accelerate the study of protein function in the chloroplast. This study proposes ChloroRF, a Random Forest-based method that predicts protein subchloroplast locations from interpretable physicochemical properties. In addition to achieving high prediction accuracy, ChloroRF can select important physicochemical properties. These properties are also analyzed to provide insight into the underlying mechanism.
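As an illustration of what "interpretable physicochemical properties" might look like as classifier input, the following minimal sketch encodes a protein sequence as a short vector of averaged property values. It is not the ChloroRF implementation: the choice of properties (the Kyte-Doolittle hydropathy index and a simplified side-chain charge), the averaging scheme, and the names `KD_HYDROPATHY`, `CHARGE`, and `encode` are assumptions made for this example only. The abstract does not state which properties or encoding the authors used.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
# Assumption: chosen here only as a familiar physicochemical scale.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified side-chain charge at neutral pH (assumption: histidine
# and all other residues are treated as neutral).
CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def encode(sequence):
    """Return [mean hydropathy, mean charge] for a protein sequence.

    Characters that are not standard amino acid letters are ignored.
    """
    residues = [aa for aa in sequence.upper() if aa in KD_HYDROPATHY]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    n = len(residues)
    mean_hydropathy = sum(KD_HYDROPATHY[aa] for aa in residues) / n
    mean_charge = sum(CHARGE.get(aa, 0.0) for aa in residues) / n
    return [mean_hydropathy, mean_charge]


# Hypothetical toy sequence, not taken from any dataset.
features = encode("MKVLAAGIDE")
```

Each element of the resulting vector corresponds to one named property, so a feature ranked as important by a random forest trained on such vectors could be read back as a specific physicochemical property.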

Original language: English
Pages (from-to): 903-907
Number of pages: 5
Journal: World Academy of Science, Engineering and Technology
Volume: 65
Publication status: Published - May 1 2010
Externally published: Yes

Keywords

  • Chloroplast
  • Physicochemical properties
  • Protein locations
  • Random Forests

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Prediction of protein subchloroplast locations using random forests. / Tung, Chun Wei; Liaw, Chyn; Ho, Shinn Jang; Ho, Shinn Ying.

In: World Academy of Science, Engineering and Technology, Vol. 65, 01.05.2010, p. 903-907.

@article{417d95d06f7641ac99c0ed200234c693,
title = "Prediction of protein subchloroplast locations using random forests",
abstract = "The subchloroplast location of a protein is correlated with its function. Although a large number of protein sequences are available, much less is known about their locations and functions. Experimental identification of protein locations and functions is costly and time-consuming. Accurate prediction of protein subchloroplast locations can therefore accelerate the study of protein function in the chloroplast. This study proposes ChloroRF, a Random Forest-based method that predicts protein subchloroplast locations from interpretable physicochemical properties. In addition to achieving high prediction accuracy, ChloroRF can select important physicochemical properties. These properties are also analyzed to provide insight into the underlying mechanism.",
keywords = "Chloroplast, Physicochemical properties, Protein locations, Random Forests",
author = "Tung, {Chun Wei} and Liaw, Chyn and Ho, {Shinn Jang} and Ho, {Shinn Ying}",
year = "2010",
month = "5",
day = "1",
language = "English",
volume = "65",
pages = "903--907",
journal = "World Academy of Science, Engineering and Technology",
issn = "2010-376X",
publisher = "World Academy of Science Engineering and Technology",

}

TY - JOUR

T1 - Prediction of protein subchloroplast locations using random forests

AU - Tung, Chun Wei

AU - Liaw, Chyn

AU - Ho, Shinn Jang

AU - Ho, Shinn Ying

PY - 2010/5/1

Y1 - 2010/5/1

AB - The subchloroplast location of a protein is correlated with its function. Although a large number of protein sequences are available, much less is known about their locations and functions. Experimental identification of protein locations and functions is costly and time-consuming. Accurate prediction of protein subchloroplast locations can therefore accelerate the study of protein function in the chloroplast. This study proposes ChloroRF, a Random Forest-based method that predicts protein subchloroplast locations from interpretable physicochemical properties. In addition to achieving high prediction accuracy, ChloroRF can select important physicochemical properties. These properties are also analyzed to provide insight into the underlying mechanism.

KW - Chloroplast

KW - Physicochemical properties

KW - Protein locations

KW - Random Forests

UR - http://www.scopus.com/inward/record.url?scp=78751635401&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78751635401&partnerID=8YFLogxK

M3 - Article

VL - 65

SP - 903

EP - 907

JO - World Academy of Science, Engineering and Technology

JF - World Academy of Science, Engineering and Technology

SN - 2010-376X

ER -