IRIS

A method for predicting ensembles of In vivo RNA secondary structures using PARIS data (GitHub Link)

RNA secondary structures play a pivotal role in posttranscriptional regulation and the functions of non-coding RNAs, yet in vivo RNA secondary structures remain enigmatic. PARIS (Psoralen Analysis of RNA Interactions and Structures) is a recently developed high-throughput sequencing-based approach that enables direct capture of RNA duplex structures in vivo. However, the existence of incompatible, fuzzy pairing information obstructs the integration of PARIS data with the existing tools in the construction of RNA secondary structure models at the single base resolution. Here, we introduce IRIS, a method for predicting RNA secondary structure ensembles based on PARIS data. IRIS generates a large set of candidate RNA secondary structure models under the guidance of redistributed PARIS reads and then uses a Bayesian model to identify the optimal ensemble, according to both thermodynamic principles and PARIS data. We verified the predicted ensembles based on the evidence from evolutionary conservation and consistency with other experimental RNA structural data.

Usage

Import the IRIS API into Python scripts

We strongly recommend that users use the IRIS API by writing some simple Python scripts. Here, we prepare three Jupyter notebooks that utilize the IRIS API as examples. If JupyterLab is installed on the system (generally, JupyterLab will be automatically installed when setting up a Python environment using Anaconda), users can make full use of the following notebooks interactively and can freely and flexibly change the codes.

Run IRIS in the command-line interface (IRIS_CLI.py)

We also wrap IRIS as a traditional command-line tool. The usage is shown below.

python IRIS_CLI.py [-h] -i [FASTA file] -r [BAM file] -o [output file] -K [int] [-C [int]] [-F [float]] [--krange [...]]

Optional arguments

An example of executing IRIS CLI using default parameters.

python IRIS_CLI.py -i sample/sample.fa -r sample/sample.bam -o sample/sample.out -K 2

The parameters can be specified by the following command.

python IRIS_CLI.py -i sample/sample.fa -r sample/sample.bam -o sample/sample.out -K 2 -C 100 -F 0.75 --krange 4 5 6 7

Requirements

IRIS is implemented in Python 3. We suggest users to install the Python 3.x environment using Anaconda. Then, the following required Python libraries can be easily installed using conda.

Library Version Installation
viennarna >= 2.4.14 conda install -c bioconda viennarna
pysam >= 0.15.3 conda install -c bioconda pysam
biopython >= 1.74 conda install biopython
numpy >= 1.17.2 conda install numpy
scipy >= 1.3.1 conda install scipy
scikit-learn >= 0.21.3 conda install scikit-learn
matplotlib >= 3.1.1 conda install matplotlib
pillow >= 6.1.0 conda install pillow
tabulate >= 0.8.5 conda install tabulate