Hello community,
I have now filtered out all reads that aligned to mtDNA and to unclassified contigs (those not part of the defined chromosomes). However, I realized that I also have to do indel realignment, and I think I should have done this before filtering. I am not sure, though: would the result differ much depending on whether it is done before or after? I imagine that, in principle, it should not make any difference.
I would be thankful for some explanation.
Thank you!
Good summer!
Hi,
Thank you for your answer. In fact, I have both the before and after BAM files (even though they occupy quite a bit of space). But what cons do you mean? I am removing these reads so that whatever variants I find downstream do not belong to the mtDNA or the unclassified contigs. Do you know of ways to do this later in the pipeline? Do you convert to some other format?
Thank you,
Edit: OK, I have found this: "Should I Remove The Unmapped Reads From My Bam?". I understand your point.
If you keep all the reads, you only need to keep one copy. My general bioinformatics practice is to keep everything and do the filtering as late as possible. That way, when you want to change some steps and redo an analysis, you don't have to redo everything.
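To illustrate what "filtering as late as possible" could look like, here is a minimal sketch in plain Python that drops mtDNA and unclassified-contig calls from VCF-style text after variant calling, instead of removing reads from the BAM. The function name `keep_primary_variants`, the contig naming pattern (chr1 to chr22, chrX, chrY, with or without the "chr" prefix) and the example records are all assumptions made for illustration; adjust the pattern to whatever your reference uses.

```python
import re

# Assumed naming convention (hypothetical, adapt to your reference):
# primary chromosomes are 1-22, X and Y, optionally prefixed with "chr".
PRIMARY_CONTIG = re.compile(r"^(chr)?([1-9]|1[0-9]|2[0-2]|X|Y)$")


def keep_primary_variants(vcf_lines):
    """Yield header lines and records whose CHROM is a primary chromosome."""
    for line in vcf_lines:
        # Header and metadata lines start with '#' and are passed through.
        if line.startswith("#"):
            yield line
            continue
        # CHROM is the first tab-separated column of a VCF record.
        chrom = line.split("\t", 1)[0]
        if PRIMARY_CONTIG.match(chrom):
            yield line


# Made-up records, only to show the behaviour.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t12345\t.\tA\tG\t50\tPASS\t.\n",
    "chrM\t150\t.\tT\tC\t99\tPASS\t.\n",
    "chrUn_gl000220\t1000\t.\tG\tA\t30\tPASS\t.\n",
    "chrX\t5000\t.\tC\tT\t60\tPASS\t.\n",
]

filtered = list(keep_primary_variants(example))
```

With this approach the full BAM stays untouched, so if you later decide you do want the mtDNA calls, you only change the pattern and rerun this last step.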