ViralPhos

Incorporating a recursively statistical method to predict phosphorylation sites on virus proteins

Kai Yao Huang, Cheng Tsung Lu, Neil A. Bretaña, Tzong Yi Lee, Tzu Hao Chang

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Background: The phosphorylation of virus proteins by host kinases is linked to viral replication. This leads to an inhibition of normal host-cell functions. Further elucidation of phosphorylation in virus proteins is required to aid drug design and treatment. However, only a few studies have investigated substrate motifs in identifying virus phosphorylation sites. Additionally, existing bioinformatics tools do not consider potential host kinases that may initiate the phosphorylation of a virus protein.

Results: A total of 329 experimentally verified phosphorylation fragments on 111 virus proteins were collected from virPTM. These were clustered into subgroups of significantly conserved motifs using a recursively statistical method. Two-layered Support Vector Machines (SVMs) were then applied to train a predictive model for the identified substrate motifs. The SVM models were evaluated using five-fold cross-validation, which yielded an average accuracy of 0.86 for serine and 0.81 for threonine. Furthermore, the proposed method is shown to perform on par with three other phosphorylation site prediction tools: PPSP, KinasePhos 2.0 and GPS 2.1.

Conclusion: In this study, we propose a computational method, ViralPhos, which aims to investigate virus substrate site motifs and identify potential phosphorylation sites on virus proteins. We identified informative substrate motifs that matched several well-studied kinase groups as potential catalytic kinases for virus protein substrates. The identified substrate motifs were further exploited to identify potential virus phosphorylation sites. The proposed method is shown to be capable of predicting virus phosphorylation sites and has been implemented as a web server at http://csb.cse.yzu.edu.tw/ViralPhos/.
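The five-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the fold assignment (contiguous, unshuffled), the function names `k_fold_indices` and `cross_val_accuracy`, and the toy one-dimensional classifier `fit_midpoint` standing in for the SVMs are all assumptions made for illustration. The sketch only shows how per-fold accuracies are averaged.

```python
from typing import Callable, List, Sequence


def k_fold_indices(n_samples: int, k: int = 5) -> List[List[int]]:
    """Split indices 0..n_samples-1 into k contiguous, near-equal folds."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_accuracy(
    X: Sequence[float],
    y: Sequence[int],
    fit: Callable[[List[float], List[int]], Callable[[float], int]],
    k: int = 5,
) -> float:
    """Average held-out accuracy over k folds (assumes n_samples >= k)."""
    scores = []
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        # Train on the other k-1 folds, then score on the held-out fold.
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k


def fit_midpoint(X_train: List[float], y_train: List[int]) -> Callable[[float], int]:
    """Hypothetical stand-in classifier: threshold halfway between class means."""
    pos = [x for x, label in zip(X_train, y_train) if label == 1]
    neg = [x for x, label in zip(X_train, y_train) if label == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > threshold else 0


if __name__ == "__main__":
    # Made-up, cleanly separable toy data; not taken from the paper.
    X = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.15, 0.85, 0.25, 0.75]
    y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
    print(cross_val_accuracy(X, y, fit_midpoint, k=5))
```

In practice the folds would usually be shuffled or stratified by class before splitting; the abstract does not say which scheme was used.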

Original language: English
Article number: S10
Journal: BMC Bioinformatics
Volume: 14
Issue number: SUPPL16
DOI: 10.1186/1471-2105-14-S16-S10
Publication status: Published - Oct 22 2013

Fingerprint

Phosphorylation
Viruses
Statistical method
Virus
Statistical methods
Proteins
Protein
Predict
Substrate
Phosphotransferases
Substrates
Protein Kinases
Support vector machines
Support Vector Machine
Drug Design
Bioelectric potentials
Web Server
Predictive Model
Threonine
Bioinformatics

Keywords

  • Protein phosphorylation
  • Substrate motif
  • Support vector machine
  • Virus

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics
  • Structural Biology

Cite this

ViralPhos: Incorporating a recursively statistical method to predict phosphorylation sites on virus proteins. / Huang, Kai Yao; Lu, Cheng Tsung; Bretaña, Neil A.; Lee, Tzong Yi; Chang, Tzu Hao.

In: BMC Bioinformatics, Vol. 14, No. SUPPL16, S10, 22.10.2013.

@article{6003acfe53834315a84eb57d5beb2b92,
title = "ViralPhos: Incorporating a recursively statistical method to predict phosphorylation sites on virus proteins",
keywords = "Protein phosphorylation, Substrate motif, Support vector machine, Virus",
author = "Huang, {Kai Yao} and Lu, {Cheng Tsung} and Breta{\~n}a, {Neil A.} and Lee, {Tzong Yi} and Chang, {Tzu Hao}",
year = "2013",
month = "10",
day = "22",
doi = "10.1186/1471-2105-14-S16-S10",
language = "English",
volume = "14",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "SUPPL16",

}

TY - JOUR

T1 - ViralPhos

T2 - Incorporating a recursively statistical method to predict phosphorylation sites on virus proteins

AU - Huang, Kai Yao

AU - Lu, Cheng Tsung

AU - Bretaña, Neil A.

AU - Lee, Tzong Yi

AU - Chang, Tzu Hao

PY - 2013/10/22

Y1 - 2013/10/22

KW - Protein phosphorylation

KW - Substrate motif

KW - Support vector machine

KW - Virus

UR - http://www.scopus.com/inward/record.url?scp=84886789765&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84886789765&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-14-S16-S10

DO - 10.1186/1471-2105-14-S16-S10

M3 - Article

VL - 14

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - SUPPL16

M1 - S10

ER -