N as ALT allele
10.1 years ago
hellbio ▴ 520
Hi,

I have performed variant calling with samtools. For some reason, some of the variants have N in the ALT column, as shown below:

chr21    29989187    .    A    AN    96.50    60    0    46    46 1

Could someone help me understand what it means?
SNP • 2.9k views

My first guess is that it's a 1 bp insertion, but samtools couldn't display the inserted base because all the supporting reads have an uncalled base (N) at that position. This could be a sequencing error. Can you explain what the other values mean? 96.50 seems to be the variant QUAL. What are the values after that? Also, did you process this file through varFilter?
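To make the reading of the REF/ALT pair concrete, here is a minimal Python sketch that classifies a record by comparing the two alleles. The function name `describe_alt` is hypothetical and not part of samtools or bcftools. It only illustrates why `A` to `AN` reads as a 1 bp insertion of an uncalled base.

```python
def describe_alt(ref: str, alt: str) -> str:
    """Classify a single REF/ALT pair from a VCF record (illustrative only)."""
    ref, alt = ref.upper(), alt.upper()
    if len(ref) == 1 and len(alt) == 1:
        kind = "SNV"
    elif len(alt) > len(ref) and alt.startswith(ref):
        # ALT extends REF: the extra bases are the inserted sequence
        inserted = alt[len(ref):]
        kind = f"{len(inserted)} bp insertion of '{inserted}'"
    elif len(ref) > len(alt) and ref.startswith(alt):
        kind = f"{len(ref) - len(alt)} bp deletion"
    else:
        kind = "complex variant"
    if "N" in alt:
        kind += " (contains uncalled base N)"
    return kind


# REF and ALT taken from the record posted in the question
print(describe_alt("A", "AN"))
```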


Thanks for responding.

Yes, it is a 1 bp insertion and below are the commands used to generate this call.

samtools mpileup -ABugf ref.fa -d 1000000 input | bcftools view -bvcg - > raw.vcf
bcftools view raw.vcf | vcfutils.pl varFilter -D 1000000 > out.vcf

Chromosome  Position  SNPid  Reference  Alternate  QUAL   MQ  Ref_depth  Alt_depth  Total_depth
chr21       29989187  .      A          AN         96.50  60  0          46         46

So, could it be due to a sequencing error? However, this variant was annotated as lying in a UTR_3_PRIME region (dog genome).


This is interesting. This is a homozygous mutation supported by 46 reads with an MQ of 60. Can you tell me whether the reads supporting this variant come from both strands? Or can you paste the information for the following tag:

PV4 P-values for 1) strand bias (exact test); 2) baseQ bias (t-test); 3) mapQ bias (t); 4) tail distance bias (t)
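If it helps, a small Python sketch like the one below could pull the four PV4 values out of a record's INFO column. The function name `parse_pv4` and the INFO strings are hypothetical examples, not values from this dataset. The function returns `None` when the tag is missing from a record.

```python
def parse_pv4(info: str):
    """Return the four PV4 p-values as a dict, or None if the tag is absent."""
    names = ("strand_bias", "baseq_bias", "mapq_bias", "tail_distance_bias")
    for field in info.split(";"):
        key, _, value = field.partition("=")
        if key == "PV4":
            return dict(zip(names, map(float, value.split(","))))
    return None


# Hypothetical INFO strings, made up for illustration
print(parse_pv4("DP=46;PV4=0.5,1,0.2,0.8;MQ=60"))
print(parse_pv4("DP=46;INDEL;MQ=60"))
```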


Surprisingly, PV4 values are not present for this variant.


Can you please show us the output of:

samtools mpileup -AB -f ref.fa -r chr21:29989186-29989188 input


Thanks for taking the time. Below is the output:

chr21    29989186    A    49    .$....,,,,..,,,,,,...,,,,..,,,,,...,,,...,..,,.,.,    B<BF<BIIFF<FFFBIIIIIII<IIIBIFI7IIIFFF7IIFII<BIBIB
chr21    29989187    A    49    ....,,,,..,,,,+1n,,...,,,,..,,,,,...,,,...,..,,.,.,^].    7BF<<FFFB7FF<0FIFIIIFBFIIBFFF<IIIFFB<IIFII7BIBIBB
chr21    29989188    A    48    ....,,,,..,,,,,,...,,,..,,,,,...,,,...,..,,.,.,.    7BF0<BFFB<BFBFFFFFIFFFII<FBF0IIIB<<FIIBIF0<I<IBB

Can anything further be inferred from the output above?


Dear Pierre,

I have posted the output below. Could you infer the reason from it? Thanks.


Based on the pileup output you posted, only one read supports an insertion, and the inserted base is "n". This means that your

chr21    29989187    .    A    AN    96.50    60    0    46    46 1

output is wrong. I think there is a bug in the code that printed "AN" instead of "A". All the "." and "," characters in the pileup output indicate that the reads at that position match the reference base, which is "A". To conclude, this position is not a variant, and somehow "AN" was printed instead of "A" in the VCF file.
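The count can be checked mechanically. Below is a rough Python sketch that tallies reference matches and indels in the read-bases column of an mpileup line. The function name `summarize_pileup_bases` is hypothetical, and the parser only covers the standard pileup symbols (`.`/`,` matches, `^` plus a mapping-quality character, `$`, and `+n`/`-n` indels); everything else is lumped into "other".

```python
import re
from collections import Counter


def summarize_pileup_bases(bases: str) -> Counter:
    """Tally reference matches and indels in an mpileup read-bases string."""
    counts = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            i += 2  # read start marker; the next character encodes mapping quality
        elif c == "$":
            i += 1  # read end marker
        elif c in "+-":
            m = re.match(r"\d+", bases[i + 1:])
            if m is None:
                raise ValueError(f"malformed indel at offset {i}")
            length = int(m.group())
            start = i + 1 + len(m.group())
            counts[c + bases[start:start + length].upper()] += 1
            i = start + length
        elif c in ".,":
            counts["match"] += 1  # forward- or reverse-strand reference match
            i += 1
        else:
            counts["other"] += 1  # mismatch, '*' placeholder, or '<'/'>' skip
            i += 1
    return counts


# Read-bases column for chr21:29989187, copied from the pileup posted above
bases = "....,,,,..,,,,+1n,,...,,,,..,,,,,...,,,...,..,,.,.,^]."
print(summarize_pileup_bases(bases))
```

Insertions are keyed as `+SEQ` and deletions as `-SEQ`, so the single `+1n` read shows up under `+N`, next to the count of reads that match the reference.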
