Nucleotide distribution at each position in a .sam/.bam file?
10.3 years ago

Hi,

I'm trying to extract the nucleotide distribution at each position from a .sam/.bam file.

I'm not looking for the total depth of coverage (that can be done with GATK or samtools), but for the depth of coverage of each nucleotide at each position in my alignment file (bam/sam/...).

How can I do that?

Do you know a tool that can do that?

Thanks in advance for any answer/suggestion,

Rémi

sam bam next-gen nucleotide distribution • 5.0k views
10.3 years ago

Have a look at pysamstats, executed as:

pysamstats -f ref.fa --type variation_strand aln.bam > aln.var.txt

It will give the count of A, C, G, T, insertions and deletions at each position in the reference (is this what you are after?).

If you want to parse the 5th column of samtools mpileup yourself, take care: besides the bases, it also contains read-start markers (each followed by a character encoding the mapping quality), read-end markers, and the sequences of insertions and deletions. So just counting the occurrences of A, C, G and T will give slightly incorrect results (I think the answer Pierre links to has this problem).
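To illustrate the pitfall, here is a minimal Python sketch of how the base string of one pileup line could be counted while skipping the non-base symbols. The function name `count_pileup_bases` and the `ins`/`del` keys are made up for this example, and it assumes the usual mpileup encoding (`.`/`,` for a match to the reference, `^` plus one mapping-quality character at a read start, `$` at a read end, `+<n><seq>`/`-<n><seq>` for indels, `*` for a deleted base). Treat it as a starting point, not a replacement for pysamstats.

```python
from collections import Counter


def count_pileup_bases(ref_base, bases):
    """Count A/C/G/T/N, insertions and deletions in one mpileup base string.

    ref_base: reference base at this position (3rd mpileup column).
    bases:    read bases string (5th mpileup column).
    """
    counts = Counter()
    i, n = 0, len(bases)
    while i < n:
        c = bases[i]
        if c == "^":
            # Read start: the next character encodes the mapping quality,
            # so skip both (it may look like a base or a digit).
            i += 2
        elif c in "+-":
            # Indel: sign, decimal length, then that many sequence characters.
            j = i + 1
            while j < n and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j])
            counts["ins" if c == "+" else "del"] += 1
            i = j + length
        elif c in ".,":
            # Match to the reference on the forward (.) or reverse (,) strand.
            counts[ref_base.upper()] += 1
            i += 1
        elif c.upper() in "ACGTN":
            # Mismatch: upper case = forward strand, lower case = reverse.
            counts[c.upper()] += 1
            i += 1
        else:
            # '$' (read end), '*' (deleted base), '<'/'>' (reference skip), etc.
            i += 1
    return counts


if __name__ == "__main__":
    # Three reference matches, an insertion of AG, a T and a G mismatch,
    # a 1-base deletion and a deletion placeholder.
    print(count_pileup_bases("A", ".,^I.$+2AGt-1cG*"))
```

Note that in this sketch insertions and deletions are counted once per read at the position where mpileup reports them (the base before the indel), and strand information is discarded; both choices are simplifications.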
