Question

Ribo-Seq Analysis

4

21 months ago

phoenix.sum13 ▴ 90

Hi everyone,

What are some of the latest/advanced tools for ribo-seq analysis? If anyone here with experience in ribo-seq could guide me or recommend some tutorials, that would be really helpful. I'm particularly interested in annotating small ORFs (smORFs) in my data.

Thank you

ribo-seq ribosome profiling • 2.5k views

ADD COMMENT • link 20 months ago by phoenix.sum13 ▴ 90

score 1 · Answer 1 · 2023-02-23

1

21 months ago

Jack Tierney ▴ 410

Hey,

Hopefully this is somewhat helpful. This paper explains why ORF calling is still a very variable process that depends heavily on your data quality. Nonetheless, you can still get a lot of information on non-canonical translation from running these tools. RibORF, PRICE, and Ribotricer are all popular. In my opinion, it is important to verify ORF calls manually by looking at the subcodon profiles for the data. Trips-Viz offers ORF calling through a user interface that makes this relatively easy: it provides direct links to the translated ORFs and lets you compare calls with data from publicly available datasets.

I have a roughly curated list of other tools that offer ORF calling here.

ADD COMMENT • link 21 months ago by Jack Tierney ▴ 410

1

Hey, I have one more question. Couldn't we use an RNA-seq pipeline on ribo-seq data to annotate the smORFs? I know this is not recommended, but I'm struggling to summarize what the problems/challenges in doing so would be.

ADD REPLY • link 21 months ago by phoenix.sum13 ▴ 90

1

Although Ribo-Seq and RNA-Seq pipelines are often quite similar, there are specific aspects of Ribo-Seq data, such as 3-nt periodicity, that are key to most ORF prediction algorithms and that you would lose if you naively used an RNA-Seq pipeline. Some parameters in your pipeline might also need to be changed to account for the short read lengths typical of Ribo-Seq.
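
To make the periodicity point concrete, here is a minimal sketch (not taken from any of the tools mentioned) of how the reading-frame signal can be summarised. It assumes you already have inferred P-site positions in transcript coordinates and know the CDS start; the function name `frame_fractions` and the toy numbers are made up for illustration.

```python
from collections import Counter


def frame_fractions(psite_positions, cds_start):
    """Return the fraction of P-sites falling in reading frames 0, 1 and 2.

    psite_positions: 0-based transcript coordinates of inferred P-sites
                     (assumed to be computed upstream; not shown here).
    cds_start:       0-based transcript coordinate of the first base of
                     the start codon.
    """
    # Python's % always returns a non-negative result, so positions
    # upstream of the CDS start are still assigned to frame 0, 1 or 2.
    counts = Counter((pos - cds_start) % 3 for pos in psite_positions)
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts[frame] / total for frame in range(3)]


# Toy example: six P-sites in frame 0 and two in frame 1.
toy_psites = [100, 103, 106, 109, 112, 115, 101, 104]
print(frame_fractions(toy_psites, cds_start=100))  # [0.75, 0.25, 0.0]
```

Translated regions in good Ribo-Seq data tend to show one frame clearly dominating, whereas RNA-Seq reads would be spread roughly evenly across the three frames. A generic RNA-Seq pipeline only produces read counts, so it never computes or uses this signal.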

ADD REPLY • link 21 months ago by Jack Tierney ▴ 410

1

Hi again, thank you for these excellent sources. Please correct me if I'm wrong. Based on the sources you sent, I see that the pipelines for Ribo-Seq and RNA-Seq are quite similar up to the annotation part, with some minor changes.

Since I have a lot of samples to analyze (from several different studies), is it okay to do the mapping and read counting for all of them first, check the annotation for the RNAs whose ORFs I'm interested in (I have a list), and then go on to ORF calling/identification, skipping the samples where I couldn't find the RNAs on my list?

This may sound silly, but I'm afraid of skipping something important in an attempt to save time. Also, would you recommend using HISAT2 for mapping ribo-seq data? This is part of my M.Sc. dissertation, so thank you once again for guiding me through.

ADD REPLY • link 20 months ago by phoenix.sum13 ▴ 90

1

Hey, if I were you I would probably set up a pipeline that handles each study in turn. The main thing to prioritise here is recording the provenance of each file, meaning you know exactly what was carried out to produce each output file.
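
One lightweight way to record provenance, sketched here under my own assumptions rather than as a standard, is to write a small JSON "sidecar" file next to every output. The field names, the `.provenance.json` suffix and the helper names are all hypothetical; a workflow manager would normally track this for you.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256sum(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_provenance(output_path, input_paths, command, study_id):
    """Write <output>.provenance.json describing how the output was made.

    command is stored as free text; it is whatever you actually ran.
    Returns the Path of the sidecar file.
    """
    record = {
        "study": study_id,
        "output": str(output_path),
        "output_sha256": sha256sum(output_path),
        "inputs": {str(p): sha256sum(p) for p in input_paths},
        "command": command,
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }
    sidecar = Path(str(output_path) + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2))
    return sidecar
```

Calling `write_provenance("aligned.bam", ["trimmed.fq"], "<the exact command you ran>", "study_A")` after each step (file names here are placeholders) leaves a record of inputs, checksums and the command alongside every output, which makes it much easier to compare studies later.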

I am not quite sure what you mean by the annotation part changing, but a typical workflow is preprocess > QC > align > QC > downstream analysis (e.g. ORF calling). So that kind of fits what you have described (?).

Either HISAT2 or STAR, really. STAR appears to be better if you have the computational resources to run it. However, I don't have a benchmark that I can point you to.

ADD REPLY • link 20 months ago by Jack Tierney ▴ 410

1

Sorry, there was a mistake. I meant that both pipelines are similar up to the mapping step. I will be doing a benchmark as well. Besides sequencing depth, read length, 3-nt periodicity, and near-cognate start codons, are there any more crucial factors to be careful about in ribo-seq for non-canonical ORFs? Thank you for your guidance once again.

ADD REPLY • link 20 months ago by phoenix.sum13 ▴ 90

1

This all depends on the question you are looking to investigate. For example, a Ribo-Seq pipeline typically includes an rRNA removal step that isn't common in RNA-Seq workflows, as far as I am aware. However, these rRNA reads can be used to investigate ribosome heterogeneity, so you would not discard them in that case.

Although there are many tools that carry out ORF detection using Ribo-Seq data, their results still vary a lot. Depending on the algorithm you use, different factors will have varying degrees of impact. I would recommend carrying out ORF detection using only read lengths with good 3-nt periodicity, but this is a trade-off: you reduce coverage as you discard certain read lengths. To remedy this, it is common to aggregate datasets when detecting novel ORFs.
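
As a rough sketch of that trade-off (the thresholds and the function name are arbitrary choices for illustration, not values from any specific tool), you can filter read lengths by how strongly they favour one frame and check how many reads survive. It assumes per-length frame counts over annotated CDSs have already been computed after P-site offsetting, with frame 0 being the expected dominant frame.

```python
def select_read_lengths(frame_counts_by_length,
                        min_frame0_fraction=0.6,
                        min_reads=100):
    """Pick read lengths with strong periodicity and report reads retained.

    frame_counts_by_length: dict mapping read length -> (frame0, frame1,
        frame2) read counts over annotated CDSs.
    min_frame0_fraction, min_reads: illustrative thresholds only.
    Returns (kept, retained) where kept maps read length -> frame-0
    fraction and retained is the fraction of all reads that is kept.
    """
    kept = {}
    for length, counts in frame_counts_by_length.items():
        total = sum(counts)
        if total < min_reads:
            continue  # too few reads to judge periodicity
        frame0_fraction = counts[0] / total
        if frame0_fraction >= min_frame0_fraction:
            kept[length] = frame0_fraction
    all_reads = sum(sum(c) for c in frame_counts_by_length.values())
    kept_reads = sum(sum(frame_counts_by_length[length]) for length in kept)
    retained = kept_reads / all_reads if all_reads else 0.0
    return kept, retained


# Toy counts: 28 and 29 nt reads are periodic, 27 and 30 nt reads are not.
toy_counts = {
    27: (300, 350, 350),
    28: (800, 100, 100),
    29: (700, 200, 100),
    30: (40, 30, 30),
}
kept, retained = select_read_lengths(toy_counts)
print(sorted(kept), round(retained, 3))  # [28, 29] 0.645
```

In this toy case, keeping only the periodic lengths throws away about a third of the reads, which is why pooling datasets is often needed to recover enough coverage for short ORFs.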

Be sure to visually verify that your detected ORFs do indeed appear to be translated. Best of luck!

ADD REPLY • link 20 months ago by Jack Tierney ▴ 410

0

Thank you so much once again <3. I'll keep these points in mind while doing the analysis.