Principles for Understanding the Accuracy of SHAPE-Directed RNA Structure Modeling

Leonard C. W., Hajdin C. E., Karabiber F., Mathews D. H., Favorov O. V., Dokholyan N. V., et al.

BIOCHEMISTRY, vol.52, no.4, pp.588-595, 2013 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 52 Issue: 4
  • Publication Date: 2013
  • Doi Number: 10.1021/bi300755u
  • Journal Name: BIOCHEMISTRY
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.588-595
  • Yıldız Technical University Affiliated: No


Accurate RNA structure modeling is an important, incompletely solved challenge. Single-nucleotide resolution SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) yields an experimental measurement of local nucleotide flexibility that can be incorporated as pseudo-free energy change constraints to direct secondary structure predictions. Prior work from our laboratory has emphasized both the overall accuracy of this approach and the need for nuanced interpretation of modeled structures. Recent studies by Das and colleagues [Kladwang, W., et al. (2011) Biochemistry 50, 8049; Nat. Chem. 3, 954], focused on analyzing six small RNAs, yielded poorer RNA secondary structure predictions than expected on the basis of prior benchmarking efforts. To understand the features that led to these divergent results, we re-examined four RNAs yielding the poorest results in this recent work: tRNA(Phe), the adenine and cyclic-di-GMP riboswitches, and 5S rRNA. Most of the errors reported by Das and colleagues reflected nonstandard experiment and data processing choices, and selective scoring rules. For two RNAs, tRNA(Phe) and the adenine riboswitch, secondary structure predictions are nearly perfect if no experimental information is included but were rendered inaccurate by the SHAPE data of Das and colleagues. When best practices were used, single-sequence SHAPE-directed secondary structure modeling recovered ~93% of individual base pairs and >90% of helices in the four RNAs, essentially indistinguishable from the results of the mutate-and-map approach with the exception of a single helix in the 5S rRNA. The field of experimentally directed RNA secondary structure prediction is entering a phase focused on the most difficult prediction challenges. We outline five constructive principles for guiding this field forward.
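
As an illustration of two quantities the abstract mentions, the sketch below computes a per-nucleotide pseudo-free energy term from a SHAPE reactivity and the fraction of accepted base pairs recovered by a predicted structure. It is a minimal sketch, not the authors' implementation. The abstract gives no formula; the logarithmic form m * ln(reactivity + 1) + b is the one commonly used for SHAPE-directed folding, and the slope and intercept values, the function names, and the handling of missing or negative reactivities are all assumptions made here for illustration.

```python
import math

# Assumed illustrative parameters in kcal/mol; the abstract does not state them.
SLOPE_M = 2.6
INTERCEPT_B = -0.8


def shape_pseudo_energy(reactivity, m=SLOPE_M, b=INTERCEPT_B):
    """Pseudo-free energy term for one nucleotide: m * ln(reactivity + 1) + b."""
    # Assumption of this sketch: a missing measurement contributes nothing.
    if reactivity is None:
        return 0.0
    # Assumption of this sketch: negative reactivities are clamped to zero.
    reactivity = max(reactivity, 0.0)
    return m * math.log(reactivity + 1.0) + b


def base_pair_sensitivity(predicted, accepted):
    """Fraction of accepted base pairs that appear in the predicted structure."""
    # Normalize each pair so that (i, j) and (j, i) compare as equal.
    accepted_set = {tuple(sorted(pair)) for pair in accepted}
    predicted_set = {tuple(sorted(pair)) for pair in predicted}
    if not accepted_set:
        raise ValueError("accepted structure has no base pairs")
    return len(accepted_set & predicted_set) / len(accepted_set)
```

With these assumed parameters, an unreactive nucleotide (reactivity 0) receives a favorable pairing bonus equal to the intercept, while highly reactive nucleotides receive a penalty. The sensitivity function scores individual base pairs only; scoring by helices, as the abstract also reports, would need an additional grouping step not shown here.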