bam file: unmapped but matching reads (samtools view -f 4)
3 months ago
Tommy • 0

For example, I have a raw sequence .fastq file that was trimmed, QC-ed, and mapped to a reference genome with bowtie2. To count the reads in the resulting .bam file:

samtools view -c abc.bam (Print only the count of matching records)
12414891
samtools view -c -f 4 abc.bam (unmapped reads)
191787
samtools view -c -F 4 abc.bam (mapped reads)
12223104

So mapped reads + unmapped reads = matching records (12223104 + 191787 = 12414891).
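A minimal sketch of why the two filtered counts sum to the total: `-f 4` keeps records whose FLAG has bit 0x4 set and `-F 4` keeps records whose FLAG has it clear, so every record falls into exactly one group. The FLAG values in the loop are hypothetical examples, not taken from abc.bam, and the function names are illustrative only.

```shell
# Bit 0x4 in the SAM FLAG field means "segment unmapped".
UNMAPPED=4

# Succeeds when a record with this FLAG would be kept by `samtools view -f 4`.
has_unmapped_bit() { [ $(( $1 & $UNMAPPED )) -ne 0 ]; }

# Succeeds when a record with this FLAG would be kept by `samtools view -F 4`.
lacks_unmapped_bit() { [ $(( $1 & $UNMAPPED )) -eq 0 ]; }

# Tally a handful of example FLAG values (hypothetical, for illustration).
total=0; unmapped=0; mapped=0
for flag in 0 4 16 77 99 141 147; do
    total=$(( $total + 1 ))
    if has_unmapped_bit "$flag"; then unmapped=$(( $unmapped + 1 )); fi
    if lacks_unmapped_bit "$flag"; then mapped=$(( $mapped + 1 )); fi
done

echo "total=$total unmapped=$unmapped mapped=$mapped"
```

Because the two tests are exact complements on a single bit, `unmapped + mapped` always equals `total`, whatever the file contains.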

What does it mean if a read is matching but not mapped?

Also, the total read count in the .bam is clearly less than the total read count of the QC-ed fastq, meaning there are a lot more unmapped reads in the .fastq. What is the difference between unmapped reads in the .bam and unmapped reads in the fastq? (If I have a read of (T)150, will it be included by -f 4?)
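For comparing the two totals, one rough way to count reads on the fastq side is to divide the line count by four. This is only a sketch: it assumes an uncompressed FASTQ with exactly four lines per read, and the function name and the tiny temporary file are hypothetical stand-ins for the real data.

```shell
# Count records in an uncompressed FASTQ, assuming four lines per read.
count_fastq_reads() {
    lines=$(wc -l < "$1")
    echo $(( $lines / 4 ))
}

# Tiny hypothetical FASTQ with two reads, written to a temporary file.
tmp_fastq=$(mktemp)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTTT\n+\nIIII\n' > "$tmp_fastq"

fastq_reads=$(count_fastq_reads "$tmp_fastq")
echo "fastq reads: $fastq_reads"

rm -f "$tmp_fastq"
```

The number printed for the real fastq can then be set next to the `samtools view -c` total above.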

samtools bam • 376 views

@Ian Sudbery recently clarified this in: meaning of "primary alignment" in samtools

This is probably one point of confusion: a primary "alignment" isn't necessarily mapped. It's just a line in the SAM file that doesn't have the secondary or supplementary flag set. Since unmapped reads wouldn't have the secondary or supplementary flag set, they also count as primary "alignments", even though they are not actually aligned!
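The point in that quote can be sketched with plain bit tests on the FLAG field. The bit values (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM format; the helper names and the example FLAG values are hypothetical and only for illustration.

```shell
UNMAPPED=4           # 0x4:   segment unmapped
SECONDARY=256        # 0x100: secondary alignment
SUPPLEMENTARY=2048   # 0x800: supplementary alignment

# Primary means neither the secondary nor the supplementary bit is set.
is_primary() { [ $(( $1 & ($SECONDARY | $SUPPLEMENTARY) )) -eq 0 ]; }

# Unmapped means bit 0x4 is set.
is_unmapped() { [ $(( $1 & $UNMAPPED )) -ne 0 ]; }

# Print a two-word description of a FLAG value.
describe_flag() {
    if is_primary "$1"; then kind="primary"; else kind="non-primary"; fi
    if is_unmapped "$1"; then state="unmapped"; else state="mapped"; fi
    echo "$kind $state"
}

describe_flag 4      # unmapped read: still counts as primary
describe_flag 0      # ordinary forward-strand alignment
describe_flag 256    # secondary alignment
describe_flag 2064   # supplementary alignment on the reverse strand
```

To count only records that are both primary and mapped, something like `samtools view -c -F 0x904 abc.bam` should work, since `-F` drops any record with at least one of the listed bits set.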


read is matching but not mapped?

What do you mean by "matching"?

a lot more unmapped reads in .fastq.

Reads in a fastq are not mapped. Maybe you're talking about secondary and supplementary alignments.
