Hi all!
I'm trying to write a python script for counting reads at each position of a genomic region. I wanted to use the pysam pileup method but the output confuses me. In particular I seem to be getting way more pileup columns than I expected. If I understood it correctly I should get one column per position in the supplied region, however when running the code below this doesn't seem to be the case.
itr = bam_file.pileup("chr1",15216562,15216662)
counter = 0
for x in itr:
counter += 1
The region is 100bp but the counter shows a value of 65531 after iterating over all pileup columns. Why does the counter not show 100? I'm probably misunderstanding something so very grateful for help with figuring this out!
Thanks!
If the reads/contigs are long enough (it can be an assembly file), the number can be so large like that.