Correlation of RNA Secondary Structure Statistics with Thermodynamic Stability and Applications to Folding

Johnny C. Wu, David P. Gardner, Stuart Ozer, Robin R. Gutell, Pengyu Ren

Research output: Contribution to journal › Article

17 Citations (Scopus)

Abstract

The accurate prediction of the secondary and tertiary structure of an RNA with different folding algorithms depends on several factors, including the energy functions. However, an RNA's higher-order structure cannot be predicted accurately from its sequence based on a limited set of energy parameters. The inter- and intramolecular forces between this RNA and other small molecules and macromolecules, in addition to other factors in the cell such as pH, ionic strength, and temperature, influence the complex dynamics associated with the transition of a single-stranded RNA to its secondary and tertiary structure. Since not all of the factors that affect the formation of an RNA's 3D structure can be determined experimentally, statistically derived potential energy has been used in the prediction of protein structure. In the current work, we evaluate the statistical free energy of various secondary structure motifs, including base-pair stacks, hairpin loops, and internal loops, using their statistical frequency obtained from the comparative analysis of more than 50,000 RNA sequences stored in the RNA Comparative Analysis Database (rCAD) at the Comparative RNA Web (CRW) Site. Statistical energy was computed from the structural statistics for several datasets. While the statistical energy for a base-pair stack correlates with experimentally derived free energy values, suggesting a Boltzmann-like distribution, variation is observed between different molecules and their location on the phylogenetic tree of life. Our statistical energy values calculated for several structural elements were utilized in the Mfold RNA-folding algorithm. The combined statistical energy values for base-pair stacks, hairpins, and internal loop flanks result in a significant improvement in the accuracy of secondary structure prediction; the hairpin flanks contribute the most.
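The abstract does not give the formula used to turn motif frequencies into statistical energies. As a rough illustration of the general idea behind a Boltzmann-like relationship, the sketch below applies a generic Boltzmann inversion, ΔG = -RT ln(p_obs / p_ref), to motif counts. The function name `statistical_energy`, the uniform reference state, the temperature of 310.15 K, and the example counts are all assumptions made for illustration; they are not taken from the paper or from rCAD.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def statistical_energy(counts, reference=None, temperature_k=310.15):
    """Generic Boltzmann inversion: dG = -RT ln(p_obs / p_ref) per motif.

    counts: dict mapping a motif label to its observed count.
    reference: dict mapping a motif label to its reference probability.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one observation")
    if reference is None:
        # Assumption: uniform reference state over the listed motifs.
        reference = {motif: 1.0 / len(counts) for motif in counts}
    rt = R_KCAL * temperature_k
    energies = {}
    for motif, n in counts.items():
        if n <= 0:
            continue  # energy is undefined for unobserved motifs; skip them
        p_obs = n / total
        energies[motif] = -rt * math.log(p_obs / reference[motif])
    return energies


# Made-up counts for three hypothetical base-pair stack labels.
example_counts = {"GC/CG": 600, "AU/UA": 300, "GU/UG": 100}
energies = statistical_energy(example_counts)
for motif, dg in sorted(energies.items(), key=lambda kv: kv[1]):
    print(f"{motif}: {dg:+.2f} kcal/mol")
```

Under this convention, motifs observed more often than the reference predicts receive negative (favorable) energies, which is the sense in which frequencies could be compared with experimentally derived free energy values. The choice of reference state strongly affects the numbers, so a real analysis would need the reference defined in the paper itself.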

Original language: English
Pages (from-to): 769-783
Number of pages: 15
Journal: Journal of Molecular Biology
Volume: 391
Issue number: 4
DOI: 10.1016/j.jmb.2009.06.036
Publication status: Published - Aug 28 2009
Externally published: Yes

Fingerprint

Thermodynamics
RNA
Base Pairing
RNA Folding
Osmolar Concentration
Databases
Temperature
Proteins

Keywords

  • comparative analysis
  • RNA folding
  • statistical potentials
  • thermodynamic stability

ASJC Scopus subject areas

  • Molecular Biology

Cite this

Correlation of RNA Secondary Structure Statistics with Thermodynamic Stability and Applications to Folding. / Wu, Johnny C.; Gardner, David P.; Ozer, Stuart; Gutell, Robin R.; Ren, Pengyu.

In: Journal of Molecular Biology, Vol. 391, No. 4, 28.08.2009, p. 769-783.

Wu, Johnny C. ; Gardner, David P. ; Ozer, Stuart ; Gutell, Robin R. ; Ren, Pengyu. / Correlation of RNA Secondary Structure Statistics with Thermodynamic Stability and Applications to Folding. In: Journal of Molecular Biology. 2009 ; Vol. 391, No. 4. pp. 769-783.
@article{86bf4e7c6a324d8d9db229785c319c63,
title = "Correlation of RNA Secondary Structure Statistics with Thermodynamic Stability and Applications to Folding",
abstract = "The accurate prediction of the secondary and tertiary structure of an RNA with different folding algorithms depends on several factors, including the energy functions. However, an RNA's higher-order structure cannot be predicted accurately from its sequence based on a limited set of energy parameters. The inter- and intramolecular forces between this RNA and other small molecules and macromolecules, in addition to other factors in the cell such as pH, ionic strength, and temperature, influence the complex dynamics associated with the transition of a single-stranded RNA to its secondary and tertiary structure. Since not all of the factors that affect the formation of an RNA's 3D structure can be determined experimentally, statistically derived potential energy has been used in the prediction of protein structure. In the current work, we evaluate the statistical free energy of various secondary structure motifs, including base-pair stacks, hairpin loops, and internal loops, using their statistical frequency obtained from the comparative analysis of more than 50,000 RNA sequences stored in the RNA Comparative Analysis Database (rCAD) at the Comparative RNA Web (CRW) Site. Statistical energy was computed from the structural statistics for several datasets. While the statistical energy for a base-pair stack correlates with experimentally derived free energy values, suggesting a Boltzmann-like distribution, variation is observed between different molecules and their location on the phylogenetic tree of life. Our statistical energy values calculated for several structural elements were utilized in the Mfold RNA-folding algorithm. The combined statistical energy values for base-pair stacks, hairpins, and internal loop flanks result in a significant improvement in the accuracy of secondary structure prediction; the hairpin flanks contribute the most.",
keywords = "comparative analysis, RNA folding, statistical potentials, thermodynamic stability",
author = "Wu, {Johnny C.} and Gardner, {David P.} and Stuart Ozer and Gutell, {Robin R.} and Pengyu Ren",
year = "2009",
month = "8",
day = "28",
doi = "10.1016/j.jmb.2009.06.036",
language = "English",
volume = "391",
pages = "769--783",
journal = "Journal of Molecular Biology",
issn = "0022-2836",
publisher = "Academic Press Inc.",
number = "4",

}

TY - JOUR

T1 - Correlation of RNA Secondary Structure Statistics with Thermodynamic Stability and Applications to Folding

AU - Wu, Johnny C.

AU - Gardner, David P.

AU - Ozer, Stuart

AU - Gutell, Robin R.

AU - Ren, Pengyu

PY - 2009/8/28

Y1 - 2009/8/28

AB - The accurate prediction of the secondary and tertiary structure of an RNA with different folding algorithms depends on several factors, including the energy functions. However, an RNA's higher-order structure cannot be predicted accurately from its sequence based on a limited set of energy parameters. The inter- and intramolecular forces between this RNA and other small molecules and macromolecules, in addition to other factors in the cell such as pH, ionic strength, and temperature, influence the complex dynamics associated with the transition of a single-stranded RNA to its secondary and tertiary structure. Since not all of the factors that affect the formation of an RNA's 3D structure can be determined experimentally, statistically derived potential energy has been used in the prediction of protein structure. In the current work, we evaluate the statistical free energy of various secondary structure motifs, including base-pair stacks, hairpin loops, and internal loops, using their statistical frequency obtained from the comparative analysis of more than 50,000 RNA sequences stored in the RNA Comparative Analysis Database (rCAD) at the Comparative RNA Web (CRW) Site. Statistical energy was computed from the structural statistics for several datasets. While the statistical energy for a base-pair stack correlates with experimentally derived free energy values, suggesting a Boltzmann-like distribution, variation is observed between different molecules and their location on the phylogenetic tree of life. Our statistical energy values calculated for several structural elements were utilized in the Mfold RNA-folding algorithm. The combined statistical energy values for base-pair stacks, hairpins, and internal loop flanks result in a significant improvement in the accuracy of secondary structure prediction; the hairpin flanks contribute the most.

KW - comparative analysis

KW - RNA folding

KW - statistical potentials

KW - thermodynamic stability

UR - http://www.scopus.com/inward/record.url?scp=68149132469&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=68149132469&partnerID=8YFLogxK

U2 - 10.1016/j.jmb.2009.06.036

DO - 10.1016/j.jmb.2009.06.036

M3 - Article

VL - 391

SP - 769

EP - 783

JO - Journal of Molecular Biology

JF - Journal of Molecular Biology

SN - 0022-2836

IS - 4

ER -