We have performed a computational comparative analysis of six small non-coding RNA (sRNA) families in α-proteobacteria. Members of these families were first identified in the intergenic regions of the nitrogen-fixing endosymbiont S. meliloti by a combined bioinformatics screen followed by experimental verification. Consensus secondary structures inferred from covariance models for each sRNA family revealed, in some cases, conserved motifs putatively relevant to the function of trans-encoded base-pairing sRNAs, i.e., Hfq-binding signatures and exposed anti-Shine-Dalgarno sequences. Two family models, namely αr15 and αr35, shared sub-structural modules with the Rfam model suhB (RF00519) and the uncharacterized sRNA family αr35b, respectively. A third sRNA family, termed αr45, has homology to the cis-acting regulatory element speF (RF00518). However, new experimental data confirmed that the S. meliloti αr45 representative is an Hfq-binding sRNA processed from or expressed independently of speF, thus refining the Rfam speF model annotation. All six families have members in phylogenetically related plant-interacting bacteria and animal pathogens of the order Rhizobiales, some occurring with high levels of paralogy in individual genomes. In silico and experimental evidence points to differential regulation of paralogous sRNAs in S. meliloti 1021. The distribution patterns of these sRNA families suggest major contributions of vertical inheritance and extensive ancestral duplication events to the evolution of sRNAs in plant-interacting bacteria.
Distinct families of cis-acting RNA replication elements epsilon from hepatitis B viruses
The hepadnavirus encapsidation signal, epsilon (ε), is an RNA structure located at the 5′ end of the viral pregenomic RNA. It is essential for viral replication and functions in polymerase protein binding and priming. This structure could also have potential regulatory roles in controlling the expression of viral replicative proteins. In addition to its structure, the primary sequence of this RNA element has crucial functional roles in the viral lifecycle. Although the ε elements in hepadnaviruses share common critical functions, there are some significant differences in mammalian and avian hepadnaviruses, which include both sequence and structural variations.
Here we present several covariance models for ε elements from the Hepadnaviridae. The model building included experimentally determined data from previous studies using chemical probing and NMR analysis. These models have sufficient similarity to comprise a clan. The clan has in common a highly conserved overall structure consisting of a lower-stem, bulge, upper-stem and apical-loop.
The models differ in functionally critical regions: notably, the two types of avian ε elements have a tetra-loop (UGUU) including a non-canonical UU base pair, while the hepatitis B virus (HBV) ε has a tri-loop (UGU). The avian ε elements have a less stable, more dynamic structure in the upper stem. Comparisons between these models and all other Rfam models, and searches of genomes, showed that these structures are specific to the Hepadnaviridae. Two family models and the clan are available from the Rfam database.
Liver-specific microRNA-122: Biogenesis and function
microRNA-122 (miR-122) was one of the first examples of a tissue-specific miRNA. It is highly expressed in liver, where it constitutes 70% of the total miRNA pool. miR-122 expression is specific to the vertebrate lineage, where the sequence of the mature miRNA is completely conserved. miR-122 is a target for extensive study due to its association with cholesterol metabolism and hepatocellular carcinoma, and its important role in promoting hepatitis C virus (HCV) replication. This review will discuss the biogenesis and function of miR-122.
Transcription beyond borders has downstream consequences
The realization that non-coding RNAs and antisense transcription are pervasive in many genomes has emphasized our relatively poor understanding of what limits transcription and how initiation and termination are linked to processing and turnover of the RNA. In genomes where the density of genes is high, it is clearly important to terminate transcription efficiently to prevent read-through into adjacent genes. In a recent paper published in PNAS, we showed that two RNA-binding proteins in Arabidopsis thaliana, FCA and FPA, play important roles in limiting intergenic transcription in the A. thaliana genome. Their absence leads to transcriptional read-through over many kilobases (kb), which influences expression, and in some cases chromatin modifications, of associated genes.
HSP90 and the R2TP co-chaperone complex: Building multi-protein machineries essential for cell growth and gene expression
HSP90 (Heat Shock Protein 90) is an essential chaperone involved in the last folding steps of client proteins. It has many clients, and these are often recognized through specific adaptors. Recently, the conserved R2TP complex was identified as a key HSP90 co-chaperone. Current evidence indicates that the HSP90/R2TP system assembles multi-molecular protein complexes. Strikingly, these comprise basic machineries of gene expression: (1) nuclear RNA polymerases; (2) the snoRNPs, essential to produce ribosomes; and (3) mTOR Complexes 1 and 2, which control translational activity and cell growth. Another important substrate is the telomerase RNP, required for continuous cell proliferation. We discuss here the assembly of RNA polymerases in bacteria and eukaryotes, the role of HSP90/R2TP in this process and in the assembly of snoRNPs and the PIKK family of TORC1 kinase. Finally, we speculate on the roles of R2TP as a master regulator of cell growth under normal or pathological conditions.
The DYW-class PPR protein MEF7 is required for RNA editing at four sites in mitochondria of Arabidopsis thaliana
In plant mitochondria and plastids, RNA editing alters about 400 and about 35 C nucleotides into Us, respectively. Four of these RNA editing events in plant mitochondria specifically require the PPR protein MEF7, characterized by E and DYW extension domains. The gene for MEF7 was identified by genomic mapping of the locus mutated in plants from EMS-treated seeds. The SNaPshot screen of the mutant plant population identified two independent EMS mutants with the same editing defects as a corresponding T-DNA insertion line of the MEF7 gene. Although the amino acid codons introduced by the editing events are conserved throughout flowering plants, even the combined failure of four editing events does not impair the growth efficiency of the mutant plants. Five nucleotides are conserved between the four affected editing sites, but they are not sufficient for specific recognition by MEF7, since they are also present at three other sites which are unaffected in the mutants.
Human RioK3 is a novel component of cytoplasmic pre-40S pre-ribosomal particles
Maturation of the 40S ribosomal subunit precursors in mammals mobilizes several non-ribosomal proteins, including the atypical protein kinase RioK2. Here, we have investigated the involvement of another member of the RIO kinase family, RioK3, in human ribosome biogenesis. RioK3 is a cytoplasmic protein that, unlike RioK2, does not seem to shuttle between nucleus and cytoplasm via a Crm1-dependent mechanism, and that sediments with cytoplasmic 40S ribosomal particles in a sucrose gradient. When the small ribosomal subunit biogenesis is impaired by depletion of either rpS15, rpS19 or RioK2, a concomitant decrease in the amount of RioK3 is observed. Surprisingly, we observed a dramatic and specific increase in the levels of RioK3 when the biogenesis of the large ribosomal subunit is impaired. A fraction of RioK3 is associated with the non-ribosomal pre-40S particle components hLtv1 and hEnp1 as well as with the 18S-E pre-rRNA, indicating that it belongs to a bona fide cytoplasmic pre-40S particle. Finally, RioK3 depletion leads to an increase in the levels of the 21S rRNA precursor in the 18S rRNA production pathway. Altogether, our results strongly suggest that RioK3 is a novel cytoplasmic component of pre-40S pre-ribosomal particle(s) in human cells, required for normal processing of the 21S pre-rRNA.
Nucleolar disruption leads to the spatial separation of key 18S rRNA processing factors
Many chemotherapeutic drugs cause the downregulation of ribosome production and the disruption of nucleolar function. This stabilizes p53 and leads to either cell cycle arrest or apoptosis. It is not clear, however, how these agents cause nucleolar disruption and block ribosome production. The small subunit (SSU) processome, which has been primarily studied in yeast, is responsible for the processing of the 18S rRNA and assembly of the small ribosomal subunit. Here we have characterized the human homologs of seven SSU processome components. Furthermore, we have investigated the effects of three chemotherapeutic drugs, Actinomycin D (ActD), camptothecin (CPT) and 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole (DRB), on the subcellular distribution of key SSU processome components and the formation of this processing complex. Interestingly, ActD and DRB treatment resulted in the majority of U3 small nucleolar RNP (snoRNP) localizing separately from other key components of the SSU processome. All three agents affected RNA polymerase I transcription, primarily at the level of elongation, but only ActD resulted in a clear reduction in SSU processome levels. Taken together, our data indicate that different chemotherapeutic agents, each of which initiates a stress response and causes nucleolar disruption, have different effects on the formation and localization of the SSU processome.
Identifying complete RNA structural ensembles including pseudoknots
The close relationship between RNA structure and function underlines the significance of accurately predicting RNA structures from sequence information. Structural topologies such as pseudoknots are of particular interest due to their ubiquity and direct involvement in RNA function, but identifying pseudoknots is a computationally challenging problem, and existing heuristic approaches usually perform poorly for RNA sequences of even a few hundred bases. We survey the performance of pseudoknot prediction methods on a data set of full-length RNA sequences representing varied sequence lengths and biological RNA classes such as RNase P RNA, group I intron, tmRNA and tRNA. Pseudoknot prediction methods are compared with minimum free energy (MFE) and suboptimal secondary structure prediction methods in terms of correct base-pairs, stems and pseudoknots, and we find that the ensemble of suboptimal structure predictions succeeds in identifying correct structural elements in RNA that are usually missed in MFE and pseudoknot predictions. We propose a strategy to identify a comprehensive set of non-redundant stems in the suboptimal structure space of an RNA molecule by applying heuristics that reduce the structural redundancy of the predicted suboptimal structures by merging slightly varying stems that are predicted to form in local sequence regions. This reduced-redundancy set of structural elements consistently outperforms more specialized approaches in the data sets. Thus, the suboptimal folding space can be used to represent the structural diversity of an RNA molecule more comprehensively than optimal structure prediction approaches alone.
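The idea of merging slightly varying stems can be illustrated with a minimal sketch. It is not the authors' heuristic, which the abstract does not specify. It assumes a hypothetical stem representation `(five_prime_start, three_prime_end, length)`, a hypothetical tolerance `tol` on how far the two ends may shift, and a greedy rule that keeps the longest stem of each group as its representative.

```python
from typing import List, Tuple

# Hypothetical representation: a stem pairs positions (start + k, end - k)
# for k = 0 .. length - 1.
Stem = Tuple[int, int, int]  # (five_prime_start, three_prime_end, length)


def similar(a: Stem, b: Stem, tol: int = 2) -> bool:
    """Treat two stems as variants of each other if both ends shift by at most tol."""
    return abs(a[0] - b[0]) <= tol and abs(a[1] - b[1]) <= tol


def merge_stems(stems: List[Stem], tol: int = 2) -> List[Stem]:
    """Greedily reduce a redundant stem list to one representative per group."""
    representatives: List[Stem] = []
    # Visit longer stems first so each group is represented by its longest member.
    for stem in sorted(set(stems), key=lambda s: (-s[2], s[0], s[1])):
        if not any(similar(stem, rep, tol) for rep in representatives):
            representatives.append(stem)
    return sorted(representatives)


if __name__ == "__main__":
    # Stems pooled from several suboptimal structures (made-up coordinates).
    pooled = [(5, 40, 6), (6, 39, 5), (5, 41, 4), (60, 90, 7), (61, 89, 6)]
    print(merge_stems(pooled))
```

In this toy input, the five pooled stems collapse to two representatives, one per local sequence region. A real implementation would need a criterion grounded in the base pairs themselves and a principled choice of tolerance.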
A domain-based model for predicting large and complex pseudoknotted structures
Pseudoknotted structures play important structural and functional roles in RNA cellular functions at the level of transcription, splicing and translation. However, the problem of computational prediction for large pseudoknotted folds remains. Here we develop a domain-based method for predicting complex and large pseudoknotted structures from RNA sequences. The model is based on the observation that large RNAs can be separated into different structural domains. The basic idea is to first identify the domains and then predict the structures for each domain. Assembly of the domain structures gives the full structure. The use of the domain-based approach leads to a reduction of computational time by a factor of about N² for an N-nt sequence. As applications of the model, we predict structures for a variety of RNA systems, such as regions in human telomerase RNA (hTR), internal ribosome entry site (IRES) and the HIV genome. The lengths of these sequences range from 200 to 400 nt. The results show good agreement with experiments.
Drastic expression change of transposon-derived piRNA-like RNAs and microRNAs in early stages of chicken embryos implies a role in gastrulation
Recent studies have shown that endogenous small RNAs regulate a variety of biological processes during vertebrate development; however, little is known about the role of small RNAs in regulating developmental signaling pathways during early embryogenesis. In this study, we applied Illumina sequencing to characterize an unexpected endogenous small RNA catalog and demonstrated a dramatic transition from transposon-derived piRNA-like small RNAs (pilRNAs) to microRNAs (miRNAs) in pre- and post-gastrula chicken embryos. The comprehensive expression profile of chicken miRNAs at the pre- and post-gastrula stages revealed that most known and new miRNAs were dynamically regulated during development. In addition to embryonic stem cell-related miRNAs, Gene Ontology (GO) analysis showed that miRNAs enriched in early-stage chicken embryos targeted multiple signal transduction pathways associated with the reproductive process and embryogenesis, including Wnt and TGF-β, which specifies the neural fate of blastodermal cells. Intriguingly, a large cohort of pilRNAs primarily derived from the active and most abundant transposable elements (TEs) were enriched in chicken stage X blastoderms. Within stage X blastoderms, pilRNAs were specifically localized to the primordial germ cells (PGCs), indicating their post-zygotic origin. Together, these findings imply a role for small RNAs in gastrulation in early-stage chicken embryos.