Although many genetic risk loci for human diseases have been identified and comprehensively cataloged, strategies to guide clinical research by integrating the extensive results of genetic studies with biological resources are still limited. Moreover, integrative analyses that provide novel insights into disease biology are expected to be especially useful for drug discovery. Herein, we used text mining of genetic studies on colorectal cancer (CRC) and assigned biological annotations to the identified risk genes in order to discover novel drug targets and potential drugs for repurposing. Risk genes for CRC were obtained from PubMed text mining, and six functional and bioinformatic annotations were analyzed for each gene. The annotations were missense mutations, cis-expression quantitative trait loci (cis-eQTL), molecular pathway analyses, protein-protein interactions (PPIs), genetic overlap with knockout mouse phenotypes, and primary immunodeficiency (PID). We then prioritized the candidate genes according to a scoring system based on the six functional annotations. Each functional annotation was assigned one point, and genes with a score ≥2 were designated “biological CRC risk genes”. Using this method, we identified 82 biological CRC risk genes, which were mapped to 128 genes in an expanded PPI network. Further utilizing DrugBank and the Therapeutic Target Database, we found 21 genes in our list that are targeted by 166 candidate drugs. Based on data from ClinicalTrials.gov and a literature review, we found four known target genes with six drugs for clinical treatment of CRC, and three target genes with nine drugs supported by previous preclinical results in CRC. Additionally, 12 genes are targeted by 32 drugs approved for other indications, which could possibly be repurposed for CRC treatment. Finally, analysis with the Connectivity Map (CMap) showed that 18 drugs have high potential for CRC treatment.
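The scoring rule described above (one point per functional annotation, with genes scoring ≥2 retained) can be illustrated with a minimal Python sketch. The annotation labels, the function names, and the example genes (`GENE_A` to `GENE_D`) are hypothetical placeholders chosen for illustration; they are not the study's data or code, and how each annotation is actually determined is outside the scope of this sketch.

```python
from typing import Dict, Set

# Hypothetical labels for the six functional annotations named in the abstract.
ANNOTATIONS = (
    "missense",
    "cis_eqtl",
    "pathway",
    "ppi",
    "knockout_mouse",
    "pid",
)


def score_gene(hits: Set[str]) -> int:
    """Return one point for each annotation a gene satisfies."""
    unknown = hits - set(ANNOTATIONS)
    if unknown:
        raise ValueError(f"unknown annotation labels: {sorted(unknown)}")
    return len(hits)


def biological_risk_genes(
    gene_hits: Dict[str, Set[str]], threshold: int = 2
) -> Dict[str, int]:
    """Keep genes whose annotation score is at least `threshold`."""
    scores = {gene: score_gene(hits) for gene, hits in gene_hits.items()}
    return {gene: s for gene, s in scores.items() if s >= threshold}


# Made-up example input: gene name -> set of annotations it satisfies.
example = {
    "GENE_A": {"missense", "ppi", "pathway"},
    "GENE_B": {"cis_eqtl"},
    "GENE_C": {"knockout_mouse", "pid"},
    "GENE_D": set(),
}

selected = biological_risk_genes(example)
print(selected)
```

In this toy input, only the genes with two or more annotations pass the threshold; genes with a single annotation or none are dropped.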