7.7 years ago
zaidnab
I am trying to write a Python script that, given a BAM file, calculates the error rate of the DNA sequencing relative to the reference genome. I am brand new to bioinformatics, and I am very stuck. This is what I have so far:
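import pysam
samfile = pysam.AlignmentFile("TruQ3_229.sorted.bam", "rb")
# Iterate over pileup columns overlapping chr1:100-120 (requires an indexed BAM)
for pileupcolumn in samfile.pileup("chr1", 100, 120):
    # Print the 0-based reference position and the number of reads covering it
    print(pileupcolumn.reference_pos, pileupcolumn.nsegments)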
I have no idea what to do next or how to continue. Any help is appreciated. Thanks!
What is your definition of error rate? Have you first tried summarizing what you want to achieve? When posting a question, please try to provide as much information as possible.
Hi zaidnab,
While not a full answer, you can find some ideas in my recent blog post: Getting the edit distance from a bam alignment: a journey. Let me know if you need further help.
I see you started to make pileups, but I believe you want a per-read error rate?
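If a per-read error rate is what you are after, here is a minimal sketch of one possible definition. It is an assumption on my part, not the only valid one: edit distance (the NM tag, which counts mismatches plus inserted and deleted bases) divided by the alignment length (the sum of the M, =, X, I and D CIGAR operations). The function names are hypothetical, and the code works on plain NM values and CIGAR strings so you can try it without a BAM file.

```python
import re

# Matches one CIGAR operation, e.g. "90M" -> ("90", "M")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def aligned_length(cigar):
    """Alignment length: sum of M, =, X, I and D operation lengths.

    Soft/hard clips (S, H), skipped regions (N) and padding (P) are
    ignored. Counting I and D here is an assumption chosen so that the
    denominator matches what the NM tag counts.
    """
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "M=XID")

def read_error_rate(nm, cigar):
    """Per-read error rate: edit distance divided by alignment length."""
    length = aligned_length(cigar)
    return nm / length if length else None

def aggregate_error_rate(records):
    """Overall error rate across an iterable of (nm, cigar) pairs."""
    total_nm = 0
    total_len = 0
    for nm, cigar in records:
        total_nm += nm
        total_len += aligned_length(cigar)
    return total_nm / total_len if total_len else None
```

With pysam, the two inputs would come from each mapped read in your loop over samfile, for example read.get_tag("NM") and read.cigarstring, skipping reads where read.is_unmapped is true or where the aligner did not write an NM tag. Keep in mind that this counts true variants as errors too, so it overestimates the sequencing error rate for a sample that differs from the reference.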