A biopathway is a basic representation of how the cell works, and it therefore also serves as a basis for understanding the mechanisms of diseases and drugs. Although the human gene sequences have been decoded, the regulation of and interactions between genes in biological processes are still under investigation. Recently, a large amount of experimental data on protein interactions and gene expression has been published in public databases, providing good material for research into how genes work with each other. In this study, we therefore attempt to develop a biopathway construction system by applying data mining and machine learning algorithms to published protein-protein interaction (PPI) and gene expression data. Thanks to high-throughput PPI experimental techniques, many PPI-related databases and support tools have been produced for further investigation. Because functional regions, e.g. domains and motifs, are key determinants of whether one protein interacts with another, it is practicable to apply computational methods to identify the functional regions involved in PPIs. Hence, in the first year we attempt to apply machine learning to produce a PPI prediction module based on public domain protein interaction data, to be validated against conserved PPI data. Gene expression data from microarray experiments provide good references for collaboration or inhibition between genes; combined with PPI information, biopathways can be derived. In the second year, we will use gene expression data to develop biopathway construction and validation methods. In the third year, networks of interacting protein functional regions will be constructed and mapped to collected signal transduction pathways. Finally, we attempt to apply the prediction model to discover novel biopathways of biological processes and to build probable biopathway networks. The system developed in this project will be a useful tool for implementing the E-cell project.
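As an illustration only, the idea of combining PPI information with gene expression data can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual method: it assumes that co-expression is measured with a Pearson correlation and that a fixed threshold separates "collaboration" (strong positive correlation) from "inhibition" (strong negative correlation). The function names, the threshold value, and the gene names are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def label_ppi_edges(ppi_pairs, expression, threshold=0.8):
    """Keep PPI pairs whose expression profiles are strongly correlated.

    ppi_pairs  : iterable of (gene_a, gene_b) tuples (hypothetical input)
    expression : dict mapping gene name -> list of expression values
    threshold  : assumed cut-off on |r|; not taken from the source
    """
    edges = []
    for a, b in ppi_pairs:
        if a not in expression or b not in expression:
            continue  # no expression data for this pair
        r = pearson(expression[a], expression[b])
        if r >= threshold:
            edges.append((a, b, "collaboration", r))
        elif r <= -threshold:
            edges.append((a, b, "inhibition", r))
    return edges


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    expression = {
        "geneA": [1, 2, 3, 4],
        "geneB": [2, 4, 6, 8],
        "geneC": [8, 6, 4, 2],
        "geneD": [1, 3, 2, 3],
    }
    ppi_pairs = [("geneA", "geneB"), ("geneA", "geneC"),
                 ("geneA", "geneD"), ("geneA", "geneX")]
    for edge in label_ppi_edges(ppi_pairs, expression):
        print(edge)
```

In this toy example, only interacting pairs with strongly correlated or anti-correlated profiles are kept as candidate pathway edges; pairs with weak correlation or missing expression data are dropped. A real system would need a validated correlation measure and cut-off.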
Effective start/end date: 8/1/11 → 7/31/12
- Protein-Protein Interactions
- Data Mining
- Machine Learning
- Gene Expression