Hi, I have two BAM files from which I need to extract the unmapped reads and align them against each other, to identify the sequences that do not align. The purpose is to detect contamination in one sample by comparing it with the other.
I now have FASTA files of the unmapped reads, which I am planning to BLAST against each other, using one as the subject and the other as the query. Each file is about 3 GB. However, the job gets killed on my server, which has 64 GB of RAM and 6 TB of disk space.
Can somebody suggest how I can do this efficiently within the available RAM and disk space? I have read about splitting the FASTA files but couldn't find a proper tutorial. I hope someone can guide me on this.
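To make the splitting idea concrete, this is roughly what I understand by it: a minimal Python sketch, not a tested pipeline. The function name `split_fasta`, the chunk size, and the file names in the usage comment are hypothetical placeholders, and it assumes a plain (uncompressed) FASTA file in which every record starts with a `>` header line.

```python
def split_fasta(fasta_path, records_per_chunk, out_prefix):
    """Split a FASTA file into numbered chunk files.

    Each chunk holds at most `records_per_chunk` records. The input is read
    line by line, so memory use stays small regardless of file size.
    Returns the list of chunk file paths, in order.
    """
    if records_per_chunk < 1:
        raise ValueError("records_per_chunk must be at least 1")

    chunk_paths = []
    out = None
    records_in_chunk = 0
    try:
        with open(fasta_path) as handle:
            for line in handle:
                if line.startswith(">"):
                    # A header starts a new record; open a new chunk file
                    # if none is open yet or the current one is full.
                    if out is None or records_in_chunk == records_per_chunk:
                        if out is not None:
                            out.close()
                        chunk_path = f"{out_prefix}.{len(chunk_paths) + 1:04d}.fasta"
                        out = open(chunk_path, "w")
                        chunk_paths.append(chunk_path)
                        records_in_chunk = 0
                    records_in_chunk += 1
                # Lines that appear before the first header are skipped.
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return chunk_paths


# Hypothetical usage (file names and chunk size are placeholders):
# chunks = split_fasta("sampleA_unmapped.fasta", 500000, "sampleA_chunk")
```

Each chunk could then be used as a separate, smaller query against the other file, with the per-chunk results combined afterwards. Whether this is enough to keep the job within 64 GB of RAM is something I have not verified.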
Thank you.