Hello
I think this is a silly question, but I need help with it anyway :p
I have several BAM files from mappings I ran with bwa, and I have extracted the FASTQ reads from each of these files.
So, I need to put all the FASTQ reads into one file, but without repetitions. (If a read was a hit in all mappings, I need it only once in the final file; if another appeared in just two mappings, again I need it only once... you get the picture.)
Thanks in advance.
Grep the read names from fastq_file_1 into file_list, then grep the read names from fastq_file_2 and append them to the same file_list. Do this for all your samples, remove the duplicate names from file_list, and extract the reads with seqtk (its subseq command takes a FASTQ file and a list of read names).
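
If you would rather do it in one pass without grep and seqtk, here is a rough Python sketch of the same idea. It is an alternative to the route above, not the same method: it reads the FASTQ files one after another and keeps only the first record seen for each read name. The function name merge_unique_fastq is made up for illustration. It assumes plain (uncompressed) FASTQ with exactly four lines per record, and that the read name is the header text up to the first whitespace. For paired-end data where both mates share a name you would need to run it per mate or change the key.

```python
from itertools import islice


def merge_unique_fastq(in_paths, out_path):
    """Merge FASTQ files, writing each read name only once.

    Hypothetical helper: assumes 4-line FASTQ records and that the read
    name is the header (minus '@') up to the first whitespace.
    Returns the number of unique reads written.
    """
    seen = set()
    with open(out_path, "w") as out:
        for path in in_paths:
            with open(path) as handle:
                while True:
                    # Take the next four lines as one FASTQ record.
                    record = [line.rstrip("\n") for line in islice(handle, 4)]
                    if not record:
                        break
                    if len(record) < 4:
                        raise ValueError(f"truncated FASTQ record in {path}")
                    name = record[0][1:].split()[0]
                    # Keep the first occurrence of each read name, skip repeats.
                    if name not in seen:
                        seen.add(name)
                        out.write("\n".join(record) + "\n")
    return len(seen)
```

You would call it with something like merge_unique_fastq(["fastq_file_1", "fastq_file_2"], "merged.fastq"). Note that all read names are held in memory, so for very large inputs the name-list approach may be the safer choice.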