7.9 years ago
igor
13k
It's easy to search for a particular sequence in a BAM file with samtools view | grep. This works well for individual reads. However, what if you are trying to find a sequence in paired-end reads (meaning it may be present in either mate)?
Is there a better solution than to retrieve the read IDs that match your sequence of interest and then search for those read IDs to get both mates?
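For reference, the two-step approach described above (collect the read IDs that match, then pull every record with those IDs) can be sketched with a two-pass awk over SAM text. This is only a minimal sketch, not a definitive solution: the function name find_pairs and the file name reads.sam are hypothetical, and it assumes header-less SAM text such as that written by `samtools view in.bam > reads.sam`, with the read name in column 1 and the sequence in column 10.

```shell
# find_pairs QUERY SAMFILE   (hypothetical helper, for illustration only)
# Prints every record whose read name belongs to a pair in which at least
# one mate contains QUERY. Assumes SAMFILE is header-less SAM text.
find_pairs() {
    query="$1"
    sam="$2"
    # The same file is passed twice so awk can make two passes over it.
    awk -F'\t' -v q="$query" '
        # Pass 1 (first copy of the file): remember names of matching reads.
        NR == FNR { if (index($10, q) > 0) hit[$1] = 1; next }
        # Pass 2 (second copy): print every record with a remembered name.
        $1 in hit
    ' "$sam" "$sam"
}
```

Usage would look like `samtools view in.bam > reads.sam` followed by `find_pairs ACGTACGT reads.sam`. Because each record is printed at most once in the second pass, a pair is not duplicated when both mates contain the sequence, and no particular sort order is required. Note that SEQ is stored on the reference strand for reverse-strand alignments, so the reverse complement of the query may need a second search.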
I don't really understand how it could work with 'samtools view -f 64 in.bam | grep -A 1'?
As per the comment below, yes, I wrongly assumed that the BAM file is sorted by read name. I edited my post.
Interesting. I guess that assumes read name sorting and that each read has a mate. However, if the sequence is found in both mates, the pair would get duplicated.
But 1) flag 64 marks the first read in a pair, so with -f 64 you will never find the second read of the pair on the very next line; 2) reads of the same pair are not always on two successive lines.
Thank you Igor and Pierre! I use BAM files for unaligned data and forget that BAM files can hold aligned (sorted) reads as well :-P I edited my post.