I am VERY new to bioinformatics and am attempting to use sortmerna to filter rRNA sequences out of my fastq files. For some reason, regardless of whether I use --paired_out or --paired_in, I cannot manage to get/find the files with the rRNA filtered out. I have SRA####_1 and SRA###_2. Any advice?
My current input:
path/to/bioinformatics/ sortmerna -ref file/path/rRNA_databases_v4/smr_v4.3_default_db.fasta -reads /path/to/SRR####_1.fastq -reads /path/to/SRR####_2.fastq -a 48 -v --paired_in --fastx --aligned path/to/sortmernaOutput/alignedrRNA --other path/to/sortmernaOutput/cleanedrRNA
I cannot currently provide a trace, as I did not realize I should record one initially. Because I was struggling, I am using ribopicker for now, which works great but takes 2-4 hours per file, cannot be multithreaded, and is very memory inefficient. If it would be helpful, I would be more than happy to provide a trace!
Biostars has already been very helpful to me so thank you so much and thank you again in advance!
Thank you for the response! I will try that! I suspect I have at least some rRNA, as ribopicker is consistently removing about 10% of my file. bbduk may very well give me something to compare it to!
Generally, rRNA counts (if you have those gene models in your GTF) can be ignored when doing RNAseq analysis. The counts are useful only for checking whether ribodepletion has worked. So you could move on with the data analysis without worrying about this.
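If you do want a quick number for how much rRNA a filter removed (for example, to compare ribopicker against another tool), something like the following might help. It is only a rough sketch: the function names `count_reads` and `rrna_percent` are made up for illustration, and it assumes uncompressed FASTQ files with exactly four lines per read.

```shell
# Count reads in an uncompressed FASTQ file.
# Assumption: every record is exactly 4 lines (no wrapped sequences).
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Print the percentage of reads flagged as rRNA.
#   $1 = original FASTQ file
#   $2 = FASTQ file holding the reads the filter identified as rRNA
rrna_percent() {
    total_reads=$(count_reads "$1")
    rrna_reads=$(count_reads "$2")
    awk -v t="$total_reads" -v r="$rrna_reads" \
        'BEGIN { if (t > 0) printf "%.2f\n", 100 * r / t; else print "NA" }'
}
```

You would call it as `rrna_percent original.fastq rrna_reads.fastq`, where both file names are placeholders for your original reads and whatever file your filtering tool wrote the rRNA-matching reads to. For gzipped files you would need to decompress first (e.g. pipe through `zcat`).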