Entering edit mode
5.2 years ago
aboyd003
•
0
I am attempting to count the number of reads in my bam at a list of specific coordinates. I am using samtools. Queryid specifies chromosome and mylookup specifies the coordinate. The coordinate is a single position and not a feature. A GFF file is not available to run standard programs like featureCount. The resulting count is 663504 which is more than 661089. Does anyone have ideas about this?
samtools view -c $FILE "$queryid":"$mylookup"-"$mylookup"
featureCounts
accepts custom position lists in its SAF format, check its manual. From this you can quantify any given intervals in the genome.I'm unsure of precisely what you want, but chr:pos-pos returns all alignments that overlap pos and not all alignments starting at pos.
Your title implies you wanted the latter. If this is the case, I don't think such a tool exists but you could use mpileup to get a single coordinate and then count the '^' symbols. Eg:
The regexp replaces anything not "^" with "" ([^x] is not x, so [^^] is not ^) and then counts the number of remaining symbols. "-Q0" is needed to avoid removal of low quality bases.
That may still remove duplicates unless other mpileup options are used? I forget now how it works.
Alternatively just use normal samtools view and count the number of lines exactly starting at the desired pos.
5