Allele count for SNPs
8.3 years ago
apocalyps52 ▴ 40

Hello,

I'm trying to call SNPs from a .SAM file.

I've checked the output format of .vcf files but I couldn't find what I need.

I need to extract the counts of alleles for each SNP ID.

Something like this:

rs99999999 A:123, T:0, C:345, G:0

I'm a beginner in the field, so sorry if my question looks simple, but I cannot find the answer.

Can anyone tell me how to do it?

Thanks in advance.

SNP snp sequencing next-gen • 5.5k views

Could you provide some more information? For example, which program did you use, and with which parameters?


Thanks Guillaume and Noushin for the detailed responses. Now I think I can find what I need.

8.3 years ago
guillaume.rbt ★ 1.0k

You won't find this kind of count directly in a VCF file. However, the file usually reports the allelic depth (AD) for the reference and alternate alleles.

For example for this SNP:

chr_1 21682 . T C 150.0 . AC=1;AF=1.00;AN=1;DP=4;FS=0.000;MLEAC=1;MLEAF=1.00;MQ=56.44;QD=31.06;SOR=3.258 GT:AD:DP:GQ:PL 1:0,4:4:99:180,0

The reference allele is T and the alternate allele is C. If you look at the AD (allelic depth) field, you will find that there are 0 reads supporting the reference allele and 4 reads supporting the alternate allele.
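To make the AD lookup concrete, here is a minimal Python sketch that pairs the REF/ALT alleles of a single VCF data line with the AD values of one sample. The function name `allele_depths` and the `sample_index` parameter are made up for illustration. The sketch assumes a tab-separated line whose FORMAT column contains AD, which depends on the variant caller.

```python
def allele_depths(vcf_line, sample_index=0):
    """Map each allele (REF first, then ALTs) to its AD read count.

    Assumes a tab-separated VCF data line whose FORMAT column lists AD.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    alleles = [ref] + alt.split(",")

    # FORMAT keys (column 9) line up with the per-sample values (column 10+).
    format_keys = fields[8].split(":")
    sample_values = fields[9 + sample_index].split(":")
    sample = dict(zip(format_keys, sample_values))

    # AD is a comma-separated list: reference depth first, then one per ALT.
    depths = [0 if d == "." else int(d) for d in sample["AD"].split(",")]
    return dict(zip(alleles, depths))


# The example record from this answer, rebuilt with tab separators.
example = "\t".join([
    "chr_1", "21682", ".", "T", "C", "150.0", ".",
    "AC=1;AF=1.00;AN=1;DP=4;FS=0.000;MLEAC=1;MLEAF=1.00;MQ=56.44;QD=31.06;SOR=3.258",
    "GT:AD:DP:GQ:PL", "1:0,4:4:99:180,0",
])
print(allele_depths(example))  # {'T': 0, 'C': 4}
```

Note that AD only covers the alleles listed in REF and ALT, so bases that were not called as alleles do not appear in the result.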

You can use a tool like SnpSift extractFields to get this field.


Adding to Guillaume's response, you can also use samtools mpileup to get the full set of alleles from all the reads covering a genomic position (http://samtools.sourceforge.net/mpileup.shtml).

I have written a Python script that converts the mpileup output into a table of read counts.
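The script mentioned above is not reproduced in this thread, so the following is only an independent sketch of the idea, not that script. It counts A/C/G/T in the read-bases column of one `samtools mpileup` line, assuming the standard pileup columns (chromosome, position, reference base, depth, read bases, qualities) and that a reference FASTA was supplied so that `.` and `,` denote reference matches. The function names are hypothetical.

```python
def count_pileup_bases(ref_base, read_bases):
    """Count A/C/G/T calls in the read-bases column of one pileup line."""
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    i, n = 0, len(read_bases)
    while i < n:
        ch = read_bases[i]
        if ch == "^":
            # Read start marker, followed by one mapping-quality character.
            i += 2
            continue
        if ch in "+-":
            # Indel: sign, length digits, then that many inserted/deleted bases.
            j = i + 1
            while j < n and read_bases[j].isdigit():
                j += 1
            i = j + int(read_bases[i + 1:j])
            continue
        # '.' and ',' mean "matches the reference" on either strand.
        base = ref_base.upper() if ch in ".," else ch.upper()
        if base in counts:  # skips '$', '*', '<', '>', 'N'
            counts[base] += 1
        i += 1
    return counts


def pileup_line_counts(line):
    """Return (chrom, pos, counts) for one tab-separated mpileup line."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[1]), count_pileup_bases(fields[2], fields[4])


chrom, pos, counts = pileup_line_counts("chr_1\t21682\tT\t4\tCCcc\tIIII")
print(chrom, pos, ", ".join(f"{b}:{counts[b]}" for b in "ATCG"))
```

The output is keyed by genomic position rather than by rs ID; attaching SNP IDs would require joining the positions against an annotation such as a dbSNP VCF.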


Thanks Noushin, your Python script seems to simplify the process. I will definitely try it.
