Hello,
I am interested in filtering contaminant, adapter, and mitochondrial DNA sequences, which I have in 3 separate FASTA files, out of Illumina paired-end FASTQ files. My question is how to use bowtie2 to align against multiple reference files and produce a single unmapped FASTQ file with all 3 of these filtered out. If the alternative is to map with bowtie2 against these 3 files one at a time, then how do I combine the 3 unmapped files, since they may contain redundant reads?
I would highly appreciate any feedback. If this question has been answered previously, please share the link. Thanks, Indrani
I am not sure what you mean by contaminant, but adapter and mitochondrial DNA should not map to the genome, so those will automatically be filtered out during alignment. Are you trying to remove or keep those?
I want to remove them. I also want to remove rRNA, polyA, etc. So I want a single bowtie2 run that produces one FASTQ file of unmapped reads, with adapter, mitochondrial, polyA, and rRNA sequences filtered out. Thanks for your reply.
Indrani
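For the single-run approach described above, one possible sketch is to build one bowtie2 index from all the reference FASTA files (bowtie2-build accepts a comma-separated list) and then keep only the pairs that fail to align concordantly, using --un-conc. The Python below only assembles and prints the two command lines; it does not run them. All file names, the index name, and the helper function names are hypothetical placeholders, not anything from this thread.

```python
import shlex


def build_index_cmd(fasta_files, index_base):
    """Command to build ONE bowtie2 index from several FASTA files.

    bowtie2-build takes a comma-separated list of reference FASTA files.
    """
    return ["bowtie2-build", ",".join(fasta_files), index_base]


def build_filter_cmd(index_base, fq1, fq2, unmapped_pattern, threads=4):
    """Command to align paired-end reads and save the pairs that do NOT
    align concordantly to the combined index.

    In the --un-conc path, bowtie2 replaces '%' with 1 or 2 (one file per mate).
    """
    return [
        "bowtie2",
        "-p", str(threads),
        "-x", index_base,
        "-1", fq1,
        "-2", fq2,
        "--un-conc", unmapped_pattern,
        "-S", "/dev/null",  # the alignments themselves are not needed here
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    refs = ["contaminant.fa", "adapter.fa", "mito.fa"]
    print(shlex.join(build_index_cmd(refs, "filter_index")))
    print(shlex.join(build_filter_cmd(
        "filter_index", "reads_R1.fastq", "reads_R2.fastq", "filtered.%.fastq")))
```

Note that --un-conc keeps every pair that does not align concordantly, so a pair in which only one mate hits a filter sequence is still kept. If that is too permissive, the SAM output would need to be filtered on flags instead.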
If this is RNA-seq data then you don't need to remove anything except the adapters (strictly speaking you don't need to remove those either, but it is a good idea). See this recent discussion: Removing rRNA and tRNA sequences using GTF files
Thanks for your reply.
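On the original question of combining three separately produced unmapped files: a read has passed all three filters only if it is unmapped in every run, so the files need to be intersected by read name rather than concatenated. Here is a minimal sketch of that idea, assuming plain 4-line FASTQ records and read names that are identical across the three files. The function names are made up for illustration, and for paired-end data each mate file would be handled the same way.

```python
def parse_fastq(lines):
    """Yield (name, header, seq, qual) from an iterable of FASTQ lines.

    Assumes strict 4-line records; the name is the first word after '@'.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line
        qual = next(it).rstrip("\n")
        yield header[1:].split()[0], header, seq, qual


def names_unmapped_in_all(unmapped_streams):
    """Return the read names present in EVERY unmapped FASTQ stream."""
    common = None
    for lines in unmapped_streams:
        names = {rec[0] for rec in parse_fastq(lines)}
        common = names if common is None else common & names
    return common if common is not None else set()


def filter_fastq(lines, keep):
    """Yield FASTQ records (as text) whose read name is in `keep`."""
    for name, header, seq, qual in parse_fastq(lines):
        if name in keep:
            yield f"{header}\n{seq}\n+\n{qual}\n"
```

In practice each stream would be an open file handle for one of the unmapped FASTQ files, and the result of filter_fastq would be written to the final output. A simpler alternative that avoids the intersection entirely is to chain the runs, feeding the unmapped output of one bowtie2 run in as the input of the next.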