Question

Trouble using sortmerna: How to get nonrRNA output from paired fastq files

29 days ago

Ezekiel ▴ 10

I am VERY new to bioinformatics and am attempting to use sortmerna to filter rRNA sequences out of my fastq files. For some reason, regardless of whether I use --paired_out or --paired_in, I cannot manage to get/find the files with the rRNA filtered out. I have SRA####_1 and SRA###_2. Any advice?

my current input:

path/to/bioinformatics/ sortmerna -ref file/path/rRNA_databases_v4/smr_v4.3_default_db.fasta -reads /path/to/SRR####_1.fastq -reads /path/to/SRR####_2.fastq -a 48 -v  --paired_in  --fastx --aligned path/to/sortmernaOutput/alignedrRNA --other path/to/sortmernaOutput/cleanedrRNA
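
For comparison, here is a minimal sketch of the same call written with SortMeRNA 4.x long options, wrapped in a function that only prints the command so it can be checked before running. It adds --out2, which asks SortMeRNA to write forward and reverse reads to separate files, uses --threads in place of -a (the 4.x help lists -a as deprecated), and sets an explicit --workdir. The function name, the output prefixes (rRNA, non_rRNA), the work subdirectory, and the sample file names are placeholders, not values from this post.

```shell
#!/usr/bin/env bash
# Sketch only: builds a SortMeRNA 4.x command line for paired-end reads and prints it.
# Nothing is executed; every path and prefix below is a placeholder (assumption).
set -euo pipefail

build_sortmerna_cmd() {
  local ref=$1 r1=$2 r2=$3 outdir=$4 threads=${5:-8}
  local cmd=(
    sortmerna
    --ref "$ref"
    --reads "$r1" --reads "$r2"      # forward and reverse files of one sample
    --workdir "$outdir/work"         # assumed: a fresh work directory per sample
    --threads "$threads"
    --fastx                          # write reads as FASTA/FASTQ (needed for --other)
    --paired_in                      # if either mate aligns, both go to --aligned
    --out2                           # separate output files for forward and reverse reads
    --aligned "$outdir/rRNA"         # prefix for reads that matched the rRNA database
    --other "$outdir/non_rRNA"       # prefix for reads that did not match
  )
  echo "${cmd[*]}"
}

# Hypothetical example call; inspect the printed command before running it.
build_sortmerna_cmd smr_v4.3_default_db.fasta sample_1.fastq sample_2.fastq sortmerna_out 8
```

With --paired_in, a pair goes to the --aligned (rRNA) output if either mate hits the database, so the --other output holds only pairs in which neither read matched. The filtered reads should be in files whose names start with the --other prefix; the exact suffix and extension vary with the SortMeRNA version and input format, so listing the output directory is safer than guessing the file name.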

I cannot currently provide a trace, as I did not realize I should record one initially. Because I was struggling, I am now using ribopicker, which works great but takes 2-4 hours per file, cannot be multithreaded, and is very memory inefficient. If a trace would be helpful, I would be more than happy to provide one!

Biostars has already been very helpful to me so thank you so much and thank you again in advance!

rnatranscripts sortmerna fastq • 379 views

updated 29 days ago by GenoMax 149k • written 29 days ago by Ezekiel ▴ 10

score 0 · Answer 1 · 2025-01-31

29 days ago

GenoMax 149k

"I cannot manage to get/find the files with the rRNA filtered out."

It is certainly possible that there is no rRNA in the data you have. So not getting anything may actually be a good result.
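
One way to tell "no output file at all" apart from "an output file with no rRNA reads in it" is to list everything that starts with the two output prefixes and count the reads in each file. A minimal sketch, assuming uncompressed FASTQ with four lines per record; the function name count_reads is made up for illustration, and the prefixes are the ones from the command in the question:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Count reads in an uncompressed FASTQ file (assumes exactly 4 lines per record).
# Prints "missing" if the file does not exist.
count_reads() {
  local fq=$1
  if [[ ! -e "$fq" ]]; then
    echo "missing"
    return 0
  fi
  echo $(( $(wc -l < "$fq") / 4 ))
}

# List whatever exists under the --aligned and --other prefixes and report read counts.
# If nothing matches, the unexpanded pattern is reported as "missing".
for f in path/to/sortmernaOutput/alignedrRNA* path/to/sortmernaOutput/cleanedrRNA*; do
  printf '%s\t%s\n' "$f" "$(count_reads "$f")"
done
```

A count of 0 under the aligned prefix would be consistent with there being no detectable rRNA, while "missing" for both prefixes suggests the run did not write its output where expected. Non-FASTQ files under the same prefix (a log file, for example) will also be listed, and their counts can be ignored.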

If you want an alternative method, you could try bbduk.sh (from the BBMap suite) as described in "How well was the rRNA depletion for RNA-Seq experiments".
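
As a rough idea of what a paired-end bbduk.sh call for this could look like, here is a sketch that only prints the command. The parameters used (in, in2, ref, k, out, out2, outm, outm2, stats) are standard BBDuk options, but the k-mer length, the output naming, and the rRNA reference file are assumptions and may differ from the settings in the linked post.

```shell
#!/usr/bin/env bash
# Sketch only: builds a bbduk.sh command line for paired-end rRNA filtering and prints it.
# Nothing is executed; file names and k=31 are assumptions, not settings from the linked post.
set -euo pipefail

build_bbduk_cmd() {
  local r1=$1 r2=$2 ref=$3 prefix=$4
  local cmd=(
    bbduk.sh
    in="$r1" in2="$r2"                       # paired input files
    ref="$ref" k=31                          # rRNA reference FASTA (placeholder) and k-mer length
    out="${prefix}_nonrRNA_1.fastq" out2="${prefix}_nonrRNA_2.fastq"   # reads without a match
    outm="${prefix}_rRNA_1.fastq" outm2="${prefix}_rRNA_2.fastq"       # reads matching the reference
    stats="${prefix}_bbduk_stats.txt"        # per-reference match statistics
  )
  echo "${cmd[*]}"
}

# Hypothetical example call; inspect the printed command before running it.
build_bbduk_cmd sample_1.fastq sample_2.fastq rRNA_reference.fa bbduk_out/sample
```

The out/out2 files are the filtered reads to carry forward, and the outm/outm2 files give a number to compare against what ribopicker or sortmerna removes.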

Thank you for the response! I will try that! I suspect I have at least some rRNA, as ribopicker is consistently removing about 10% of my file. bbduk may very well give me something to compare it to!

29 days ago by Ezekiel ▴ 10

Generally, rRNA counts (if you have those gene models in your GTF) can be ignored when doing RNAseq analysis. These counts are useful only for checking whether ribodepletion worked. So you could move on with the data analysis without worrying about this.

29 days ago by GenoMax 149k