Dear all,
I have a BAM
file and I want to count how many reads have the following 4 patterns at chromosome region around 250bp:
AAAAAAAGAAAAA
AAAAAAAGAAAAAA
AAAAAAAAGAAAAAA
AAAAAAAAGAAAAA
I have downloaded ASCIIgenome
software and tried something with samtools mpileup
. But I don't know how to do what I am looking for.
I would do it in this way:
1) extract your reads from this region Extract Reads From A Bam File That Fall Within A Given Region
2) transform BAM to FASTA
3) write a python script that mechanically counts these patterns from FASTA, should be 20 lines.
i don't understand that sentence.
I edited the question.
that's still not clear. What is the 'pattern' ? how is it different from running
grep -E '(AAAAAAAGAAAAA|AAAAAAAGAAAAAA|AAAAAAAAGAAAAAA|AAAAAAAAGAAAAA)'
I am looking for A indels near 250 bp of the chromosome region. So I want to count how many reads support each of the 4 patterns in order to identify how many do have indesl and so on.
Would something like this work?