While genome sequencing is now commonplace, genome annotation and the functional characterisation of genome components remain an enormous challenge. Like small regulatory RNAs, short open reading frames (sORFs) often reside in ‘non-coding’ regions of the genome that have long been considered ‘junk DNA’. Translatable sORFs encoding fewer than 100 amino acids are extremely difficult to predict from genome sequences because the number of potential ORFs increases exponentially as the potential peptide length decreases. This challenge is compounded by mounting evidence that these short proteins do not always comply with genetic convention and are frequently encoded by sORFs that use a translation start codon other than AUG.
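The scale of the prediction problem can be illustrated with a minimal sketch. This is not a published sORF predictor. It is a naive forward-strand scan that assumes a DNA alphabet (ATG for AUG), the standard stop codons, and a hypothetical function name `find_orfs`. The non-AUG start codons listed (CTG, ACG, GTG) are illustrative assumptions, not a curated set. The sequence is random, so the counts only show how the number of candidates grows as the minimum length is lowered and as alternative start codons are admitted.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, start_codons=("ATG",), min_codons=10, max_codons=99):
    """Return (start, end, frame) for every start-to-stop ORF on the forward strand.

    Length is counted in codons, including the start codon and excluding the stop.
    Every in-frame start codon is reported, so nested candidates are all listed.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        open_starts = []  # start positions seen since the last in-frame stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                for s in open_starts:
                    n_codons = (i - s) // 3
                    if min_codons <= n_codons <= max_codons:
                        orfs.append((s, i + 3, frame))
                open_starts = []
            elif codon in start_codons:
                open_starts.append(i)
    return orfs

# Illustrative assumption: a few near-cognate start codons alongside ATG.
RELAXED_STARTS = ("ATG", "CTG", "ACG", "GTG")

# Random sequence used purely for illustration, not real genomic data.
rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(30000))

for min_len in (50, 20, 10):
    canonical = len(find_orfs(genome, min_codons=min_len))
    relaxed = len(find_orfs(genome, start_codons=RELAXED_STARTS, min_codons=min_len))
    print(f"min {min_len} codons: {canonical} ATG-initiated, {relaxed} with non-AUG starts")
```

Even in this toy setting, most candidates are expected to arise by chance, which is why sequence scanning alone cannot distinguish translated sORFs from background.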
References:
1. Waterhouse PM, Hellens RP (2015) Plant biology: Coding in non-coding RNAs. Nature 520: 41-42.
2. Hellens RP, Brown CM, Chisnall MA, Waterhouse PM, Macknight RC (2015) The Emerging World of Small ORFs. Trends Plant Sci.
3. Laing WA, Martinez-Sanches M, Wright M, Bulley S, Brewster D, et al. (2015) A non-canonical upstream open reading frame is essential for feedback regulation of ascorbate biosynthesis. The Plant Cell 27: 772-786.