4.7 years ago
O.rka
I've seen a lot of references to removing rRNA with bbduk.sh but I haven't seen specific commands that should be used. Is there a recommended parameter setting that could be used? Would this change with respect to read length?
My plan was to do the following:
- kneaddata to run trimmomatic and remove human contamination
- bbduk.sh to bin out rRNA reads (I might do something with them later)
- rnaspades.py to assemble the transcripts
I'm very familiar with steps 1 and 3, but I haven't experimented much with bbduk.sh. I'm very impressed with the rest of the BBTools suite, so I'm looking forward to adding it to the pipeline.
Perhaps this would help: http://seqanswers.com/forums/showthread.php?t=42776&page=17
Alternatively, you could use SortMeRNA.
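For the bbduk.sh route, here is a minimal sketch of how the rRNA binning step could look for paired-end reads. The file names (`reads_R1.fastq.gz`, `rrna_reference.fa.gz`, and so on) and the helper functions `run` and `split_rrna` are placeholders I made up; the thread does not prescribe them, and the reference FASTA is whatever rRNA collection you decide to use. `k=31` is the largest k-mer bbduk.sh accepts. Reads shorter than k cannot match at all, so with very short reads a smaller k may be needed. By default the script only prints the command (set `DRY_RUN=0` to actually run it).

```shell
#!/bin/sh
set -eu

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Split paired reads into non-rRNA (out/out2) and rRNA (outm/outm2) bins.
# $1, $2: paired read files; $3: rRNA reference FASTA (placeholder names).
split_rrna() {
  run bbduk.sh \
    in="$1" in2="$2" \
    out=nonrrna_R1.fastq.gz out2=nonrrna_R2.fastq.gz \
    outm=rrna_R1.fastq.gz outm2=rrna_R2.fastq.gz \
    ref="$3" k=31 \
    stats=rrna_stats.txt
}

split_rrna reads_R1.fastq.gz reads_R2.fastq.gz rrna_reference.fa.gz
```

Reads that share a k-mer with the reference go to `outm`/`outm2`, everything else goes to `out`/`out2`, so the rRNA is kept rather than discarded. The non-rRNA files are what would be passed on to rnaspades.py.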
@O.rka: In addition, take a look at "C: How to identify 16s sequences from binning data(contigs)?" (specifically the 2nd comment from @Brian in the answer there).
If you are working with a eukaryotic sample, you may want to use the ITS, LSU and SSU rRNA sequences that NCBI makes available in its directory of pre-made BLAST indexes. You will need to extract the FASTA files from those indexes and build k-mers from them.
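A rough sketch of the extraction step, assuming the pre-made databases have already been downloaded and unpacked into the current directory and that BLAST+ (`blastdbcmd`) is installed. The database names below are my assumption of what NCBI calls them, so check them against what you actually downloaded. `run` and `extract_fasta` are hypothetical helpers. As written, the script only prints the commands (set `DRY_RUN=0` to execute them).

```shell
#!/bin/sh
set -eu

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Dump every sequence in a BLAST database to a FASTA file.
# $1: BLAST database name; $2: output FASTA path.
extract_fasta() {
  run blastdbcmd -db "$1" -entry all -outfmt %f -out "$2"
}

# Assumed database names; adjust to match the downloaded files.
for db in SSU_eukaryote_rRNA LSU_eukaryote_rRNA ITS_eukaryote_sequences; do
  extract_fasta "$db" "$db.fasta"
done
```

The resulting FASTA files can then be given to bbduk.sh as a comma-separated list in `ref=`, which builds the k-mers from them at run time.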