An efficient computer-aided structural elucidation strategy for mixtures using an iterative dynamic programming algorithm

Bo Han Su, Meng Yu Shen, Yeu Chern Harn, San Yuan Wang, Alioune Schurz, Chieh Lin, Olivia A. Lin, Yufeng J. Tseng

Research output: Contribution to journal (Article)

Abstract

The identification of chemical structures in natural product mixtures is an important task in drug discovery but is still a challenging problem, as structural elucidation is a time-consuming process and is limited by the available mass spectra of known natural products. Computer-aided structure elucidation (CASE) strategies seek to automatically propose a list of possible chemical structures in mixtures by utilizing chromatographic and spectroscopic methods. However, current CASE tools still cannot automatically solve structures for experienced natural product chemists. Here, we formulated the structural elucidation of natural products in a mixture as a computational problem by extending a list of scaffolds using a weighted side chain list after analyzing a collection of 243,130 natural products and designed an efficient algorithm to precisely identify the chemical structures. The complexity of such a problem is NP-complete. A dynamic programming (DP) algorithm can solve this NP-complete problem in pseudo-polynomial time after converting floating point molecular weights into integers. However, the running time of the DP algorithm degrades exponentially as the precision of the mass spectrometry experiment grows. To ideally solve in polynomial time, we proposed a novel iterative DP algorithm that can quickly recognize the chemical structures of natural products. By utilizing this algorithm to elucidate the structures of four natural products that were experimentally and structurally determined, the algorithm can search the exact solutions, and the time performance was shown to be in polynomial time for average cases. The proposed method improved the speed of the structural elucidation of natural products and helped broaden the spectrum of available compounds that could be applied as new drug candidates. A web service built for structural elucidation studies is freely accessible via the following link (http://csccp.cmdm.tw/).
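The pseudo-polynomial step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the problem reduces to a subset-sum style search in which side chains may be reused any number of times, and the function names, the side-chain labels (`R1`, `R2`, `R3`) and all masses are hypothetical toy values. Masses are scaled to integers at a chosen number of decimal places, and a DP table over every integer mass up to the residual (target minus scaffold) records which masses are reachable.

```python
from typing import Dict, List, Optional


def to_int(mass: float, decimals: int) -> int:
    """Scale a floating-point mass to an integer at the given precision."""
    return round(mass * 10 ** decimals)


def side_chain_combination(
    target_mass: float,
    scaffold_mass: float,
    side_chains: Dict[str, float],
    decimals: int = 1,
) -> Optional[List[str]]:
    """Return one multiset of side chains whose integerised masses sum
    exactly to (target - scaffold), or None if no combination exists.

    Assumption (not from the paper): each side chain may be used any
    number of times, and attachment chemistry is ignored.
    """
    residual = to_int(target_mass, decimals) - to_int(scaffold_mass, decimals)
    if residual < 0:
        return None

    weights = {name: to_int(m, decimals) for name, m in side_chains.items()}

    # reachable[w] is True when some multiset of side chains sums to w.
    # choice[w] remembers one side chain that completes such a multiset.
    reachable = [False] * (residual + 1)
    choice: List[Optional[str]] = [None] * (residual + 1)
    reachable[0] = True

    for w in range(1, residual + 1):
        for name, m in weights.items():
            if 0 < m <= w and reachable[w - m]:
                reachable[w] = True
                choice[w] = name
                break

    if not reachable[residual]:
        return None

    # Walk back through the table to recover the chosen side chains.
    picked: List[str] = []
    w = residual
    while w > 0:
        name = choice[w]
        picked.append(name)
        w -= weights[name]
    return sorted(picked)


def table_size(target_mass: float, scaffold_mass: float, decimals: int) -> int:
    """Number of DP cells needed at a given precision."""
    return to_int(target_mass, decimals) - to_int(scaffold_mass, decimals) + 1


if __name__ == "__main__":
    chains = {"R1": 15.0, "R2": 17.0, "R3": 31.0}  # hypothetical masses
    print(side_chain_combination(110.0, 78.0, chains, decimals=1))
    for d in (1, 2, 3):
        print(d, table_size(110.0, 78.0, d))
```

In this sketch the table has one cell per integer mass unit, so each extra decimal digit of precision multiplies its length by about ten. That is the growth the abstract refers to when it says the running time degrades as mass spectrometry precision increases.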

Original language: English
Article number: 57
Journal: Journal of Cheminformatics
Volume: 9
Issue number: 1
DOIs: https://doi.org/10.1186/s13321-017-0244-9
Publication status: Published - Nov 15 2017
Externally published: Yes

Keywords

  • CASE
  • Dynamic programming
  • Natural products
  • Polynomial time

ASJC Scopus subject areas

  • Computer Science Applications
  • Physical and Theoretical Chemistry
  • Computer Graphics and Computer-Aided Design
  • Library and Information Sciences

Cite this

An efficient computer-aided structural elucidation strategy for mixtures using an iterative dynamic programming algorithm. / Su, Bo Han; Shen, Meng Yu; Harn, Yeu Chern; Wang, San Yuan; Schurz, Alioune; Lin, Chieh; Lin, Olivia A.; Tseng, Yufeng J.

In: Journal of Cheminformatics, Vol. 9, No. 1, 57, 15.11.2017.

Su, Bo Han; Shen, Meng Yu; Harn, Yeu Chern; Wang, San Yuan; Schurz, Alioune; Lin, Chieh; Lin, Olivia A.; Tseng, Yufeng J. / An efficient computer-aided structural elucidation strategy for mixtures using an iterative dynamic programming algorithm. In: Journal of Cheminformatics. 2017; Vol. 9, No. 1.

@article{980df4e160164bc58e08782d10463f0a,
title = "An efficient computer-aided structural elucidation strategy for mixtures using an iterative dynamic programming algorithm",
abstract = "The identification of chemical structures in natural product mixtures is an important task in drug discovery but is still a challenging problem, as structural elucidation is a time-consuming process and is limited by the available mass spectra of known natural products. Computer-aided structure elucidation (CASE) strategies seek to automatically propose a list of possible chemical structures in mixtures by utilizing chromatographic and spectroscopic methods. However, current CASE tools still cannot automatically solve structures for experienced natural product chemists. Here, we formulated the structural elucidation of natural products in a mixture as a computational problem by extending a list of scaffolds using a weighted side chain list after analyzing a collection of 243,130 natural products and designed an efficient algorithm to precisely identify the chemical structures. The complexity of such a problem is NP-complete. A dynamic programming (DP) algorithm can solve this NP-complete problem in pseudo-polynomial time after converting floating point molecular weights into integers. However, the running time of the DP algorithm degrades exponentially as the precision of the mass spectrometry experiment grows. To ideally solve in polynomial time, we proposed a novel iterative DP algorithm that can quickly recognize the chemical structures of natural products. By utilizing this algorithm to elucidate the structures of four natural products that were experimentally and structurally determined, the algorithm can search the exact solutions, and the time performance was shown to be in polynomial time for average cases. The proposed method improved the speed of the structural elucidation of natural products and helped broaden the spectrum of available compounds that could be applied as new drug candidates. A web service built for structural elucidation studies is freely accessible via the following link (http://csccp.cmdm.tw/).",
keywords = "CASE, Dynamic programming, Natural products, Polynomial time",
author = "Su, {Bo Han} and Shen, {Meng Yu} and Harn, {Yeu Chern} and Wang, {San Yuan} and Schurz, Alioune and Lin, Chieh and Lin, {Olivia A.} and Tseng, {Yufeng J.}",
year = "2017",
month = "11",
day = "15",
doi = "10.1186/s13321-017-0244-9",
language = "English",
volume = "9",
journal = "Journal of Cheminformatics",
issn = "1758-2946",
publisher = "Chemistry Central",
number = "1",
pages = "57",
}

TY - JOUR

T1 - An efficient computer-aided structural elucidation strategy for mixtures using an iterative dynamic programming algorithm

AU - Su, Bo Han

AU - Shen, Meng Yu

AU - Harn, Yeu Chern

AU - Wang, San Yuan

AU - Schurz, Alioune

AU - Lin, Chieh

AU - Lin, Olivia A.

AU - Tseng, Yufeng J.

PY - 2017/11/15

Y1 - 2017/11/15

AB - The identification of chemical structures in natural product mixtures is an important task in drug discovery but is still a challenging problem, as structural elucidation is a time-consuming process and is limited by the available mass spectra of known natural products. Computer-aided structure elucidation (CASE) strategies seek to automatically propose a list of possible chemical structures in mixtures by utilizing chromatographic and spectroscopic methods. However, current CASE tools still cannot automatically solve structures for experienced natural product chemists. Here, we formulated the structural elucidation of natural products in a mixture as a computational problem by extending a list of scaffolds using a weighted side chain list after analyzing a collection of 243,130 natural products and designed an efficient algorithm to precisely identify the chemical structures. The complexity of such a problem is NP-complete. A dynamic programming (DP) algorithm can solve this NP-complete problem in pseudo-polynomial time after converting floating point molecular weights into integers. However, the running time of the DP algorithm degrades exponentially as the precision of the mass spectrometry experiment grows. To ideally solve in polynomial time, we proposed a novel iterative DP algorithm that can quickly recognize the chemical structures of natural products. By utilizing this algorithm to elucidate the structures of four natural products that were experimentally and structurally determined, the algorithm can search the exact solutions, and the time performance was shown to be in polynomial time for average cases. The proposed method improved the speed of the structural elucidation of natural products and helped broaden the spectrum of available compounds that could be applied as new drug candidates. A web service built for structural elucidation studies is freely accessible via the following link (http://csccp.cmdm.tw/).

KW - CASE

KW - Dynamic programming

KW - Natural products

KW - Polynomial time

UR - http://www.scopus.com/inward/record.url?scp=85034565912&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85034565912&partnerID=8YFLogxK

U2 - 10.1186/s13321-017-0244-9

DO - 10.1186/s13321-017-0244-9

M3 - Article

VL - 9

JO - Journal of Cheminformatics

JF - Journal of Cheminformatics

SN - 1758-2946

IS - 1

M1 - 57

ER -