I have Ribo-seq data (already aligned) and I am trying to isolate the in-frame reads. Do you know how I can do that?
Hello Sara!
At the moment, Ribo-seq tools let you study things like the k-mer (read length) distribution and the periodicity, but no tool is available to work directly on the reads themselves.
I am currently finishing a library to do all of this for people working with Ribo-seq (I have been working almost exclusively with Ribo-seq for 2 years)! :)
So, for now, you will need to implement something yourself. Basically, to extract in-frame reads you have to:
1. Do the global trimming and filtering (adapters, quality, rRNA).
2. Reduce each read to a single position (the P-site of the ribosome), or another site depending on your study. Choose this site according to the enzyme you used to digest the RPFs. Also, depending on your organism, you should count from the 5' or the 3' end of your RPFs. I recommend using an existing tool to detect the P-site position from your filtered reads.
3. Get the in-frame positions from a GFF3 file (CDSs only; I strongly suggest selecting transcripts, depending on the organism you are working with). If you are working with prokaryotes or yeast, this should be easy. With human or other complex genomes, you should look at the most expressed transcripts and/or the APPRIS classification of transcripts (or the transcript support level).
4. Extract only the reads whose reduced position matches the ATG frame of each transcript.
So, this is a lot of work, depending on the organism (I started writing the library 1 year ago). I don't have the library ready to be released (it should be ready in November or December), but if you need help implementing this feature, feel free to ask! With the 4 points I mentioned, you should be able to do it easily using the samtools library.
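To make the reduce-to-one-position and frame-matching points more concrete, here is a minimal sketch in plain Python. It is only an illustration under several assumptions, not the library mentioned above: `Read`, `CDS`, `p_site`, `frame` and `in_frame_reads` are hypothetical names, coordinates are assumed to be 0-based with an exclusive end, each CDS is assumed to be a single exon (as in prokaryotes or most yeast genes), and the 12-nt offset from the 5' end is just a placeholder that you would have to calibrate on your own data. Reads are plain tuples here so the example stays self-contained; with real data the start, end and strand would come from your aligned reads.

```python
from typing import List, NamedTuple, Optional


class Read(NamedTuple):
    start: int   # 0-based leftmost aligned position
    end: int     # exclusive end of the alignment
    strand: str  # "+" or "-"


class CDS(NamedTuple):
    start: int   # 0-based leftmost position of the CDS
    end: int     # exclusive end of the CDS
    strand: str  # "+" or "-"


def p_site(read: Read, offset: int = 12) -> int:
    """Reduce a read to one genomic position, counted from its 5' end.

    The default offset is an assumption; calibrate it for your protocol.
    """
    if read.strand == "+":
        return read.start + offset
    # On the minus strand the 5' end is the rightmost aligned base.
    return read.end - 1 - offset


def frame(pos: int, cds: CDS) -> Optional[int]:
    """Return 0, 1 or 2 relative to the start codon, or None outside the CDS."""
    if not (cds.start <= pos < cds.end):
        return None
    if cds.strand == "+":
        return (pos - cds.start) % 3
    # On the minus strand the first base of the start codon is at cds.end - 1.
    return (cds.end - 1 - pos) % 3


def in_frame_reads(reads: List[Read], cds: CDS, offset: int = 12) -> List[Read]:
    """Keep reads on the CDS strand whose reduced position is in frame 0."""
    return [
        r for r in reads
        if r.strand == cds.strand and frame(p_site(r, offset), cds) == 0
    ]


# Toy usage with made-up coordinates.
toy_cds = CDS(start=100, end=400, strand="+")
toy_reads = [Read(88, 116, "+"), Read(89, 117, "+"), Read(91, 119, "+")]
kept = in_frame_reads(toy_reads, toy_cds)
```

For spliced transcripts the same idea applies, but the position has to be converted to transcript coordinates across exons before taking the modulo.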
Feel free to ask me for more details if you want to implement it.
Hope this helps!
Best, glihm
Just curious: what is an in-frame read? I know what a translation frame is, but I wonder how a read can be 'in-frame'.
When dealing with Ribo-seq data, we don't take the whole read into account, only 1 position, which usually corresponds to the first nucleotide in the P-site of the ribosome. Since you use only 1 position to characterize each read (RPF: ribosome-protected fragment), you can see at nucleotide resolution whether a read falls in F1, F2 or F3 of the CDSs. F1 corresponds to the ATG frame, so the reads in F1 are the in-frame reads.
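As a small illustration of the F1/F2/F3 idea, the sketch below counts how many reduced positions fall in each frame of one CDS. It assumes the positions and the CDS start are in the same forward-strand coordinates and that every position lies inside the CDS; `frame_counts` is a hypothetical helper name, not a function from an existing tool.

```python
from collections import Counter
from typing import Iterable


def frame_counts(positions: Iterable[int], cds_start: int) -> Counter:
    """Count reduced read positions per frame, with F1 being the ATG frame."""
    counts = Counter({"F1": 0, "F2": 0, "F3": 0})
    for pos in positions:
        # Distance to the A of the ATG, modulo 3, gives the frame (0 -> F1).
        counts["F" + str((pos - cds_start) % 3 + 1)] += 1
    return counts


# Toy usage with made-up positions for a CDS starting at position 100.
toy_counts = frame_counts([100, 103, 104, 106, 108], cds_start=100)
```

With good data, most positions are expected to land in one frame, which is the periodicity mentioned above.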
Thanks!