Hi all,
I have paired-end FASTQ files from a single-cell RNA-seq experiment. The forward files consist of barcode reads (no usable sequence information), while the reverse reads contain the transcript sequences. I'm looking for an efficient way to filter out rRNA reads, but it's a little tricky since I can't align the forward reads. Is there an efficient way of doing this, either within an aligner or with a quick downstream script, that will result in rRNA-filtered, orphan-free paired-end (non-interleaved) FASTQ files?
Thanks.
Ah, I see, that is quite simple! So the correct samtools flag would be 9 (read paired (0x1) + mate unmapped (0x8))?
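Edit: Okay, I think it should be 13 (read paired (0x1) + read unmapped (0x4) + mate unmapped (0x8)). The flag is set per record, so for read 1 the "mate" is read 2 and vice versa. Since the barcode reads never map, requiring both unmapped bits keeps exactly the pairs whose transcript read is also unmapped, which are the ones I want.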
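For anyone double-checking the flag arithmetic, here is a small Python sketch. The bit values are the ones from the SAM specification; the helper names (`combine`, `has_all`) are just made up for illustration, and `has_all` only mimics what I understand `samtools view -f` to do (keep a record when every required bit is set).

```python
# SAM flag bits (values from the SAM specification).
PAIRED = 0x1         # read is part of a pair
READ_UNMAPPED = 0x4  # this record's read is unmapped
MATE_UNMAPPED = 0x8  # this record's mate is unmapped


def combine(*bits):
    """OR individual flag bits together into one integer (hypothetical helper)."""
    value = 0
    for bit in bits:
        value |= bit
    return value


def has_all(flag, required):
    """True if every bit in `required` is set in `flag` (mimics `-f required`)."""
    return (flag & required) == required


required = combine(PAIRED, READ_UNMAPPED, MATE_UNMAPPED)
print(required)  # 13

# 77  = paired + read unmapped + mate unmapped + first in pair
# 141 = paired + read unmapped + mate unmapped + second in pair
# 73  = paired + mate unmapped + first in pair (the read itself is mapped)
for flag in (77, 141, 73):
    print(flag, has_all(flag, required))
```

Records with flag 77 or 141 pass the filter, while 73 does not, because its read-unmapped bit is not set. That same record would pass a filter of 9, which is why 9 alone is too permissive here.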
Edit 2: If anyone is interested, here is what I used to convert the SAM output to BAM, keep only the pairs where the reverse read is unmapped, and convert back to two non-interleaved FASTQ files, all in a single command: