GATK HaplotypeCaller: <NON_REF>
9.8 years ago
iraun 6.2k

Hi there,

Could anyone explain these two lines of my GVCF file?

Why am I getting a "NON_REF" alternative allele? And why does the second line show only the END information and none of the other INFO fields?

GL000192.1    546636    .    G    A,<NON_REF>    49.77    .    BaseQRankSum=1.026;ClippingRankSum=1.026;DP=4;MLEAC=1,0;MLEAF=0.500,0.00;MQ=49.02;MQ0=0;MQRankSum=-1.026;ReadPosRankSum=0.000    GT:AD:DP:GQ:PGT:PID:PL:SB    0/1:2,2,0:4:78:0|1:546636_G_A:78,0,87,84,93,177:1,1,1,1
GL000192.1    546637    .    G    <NON_REF>    .    .    END=546645    GT:DP:GQ:MIN_DP:PL    0/0:4:11:4:0,12,99

I know that NON_REF represents any possible alternative allele at this location, but if my genotype is 0/0 (homozygous for the reference), this line makes no sense to me.

Thanks in advance.

GATK • 11k views

Indeed it is a very interesting question, my lady.


Dear Iraun, I am facing the same issue. I am using RNA-seq data and I am confused by the HaplotypeCaller output.

After reading your post I have some idea, but it is still not clear to me. Should I re-run HaplotypeCaller, or should I exclude these "NON_REF" records from further analysis? I used this output bam file for Funcotator, but I failed to generate a functionally annotated file. It would be helpful if you could help me with this issue. Thanks and regards.

9.8 years ago
iraun 6.2k

OK, I found the solution in the GATK manual (https://www.broadinstitute.org/gatk/guide/article?id=4017). I'll answer my own question in case someone else has the same issue.

"The first thing you'll notice, hopefully, is the <NON_REF> symbolic allele listed in every record's ALT field. This provides us with a way to represent the possibility of having a non-reference allele at this site, and to indicate our confidence either way.

The second thing to look for is the END tag in the INFO field of non-variant block records. This tells you at what position the block ends."
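To make the quoted explanation concrete, here is a minimal Python sketch that reads the two records from the question and reports whether each one is a variant record or a non-variant block, using END to work out how many positions a block covers. The function name `parse_gvcf_record` and the dictionary keys are made up for illustration; the sketch splits on whitespace because that is how the lines appear in this post (real VCF files are tab-delimited), and it is not a replacement for a proper VCF parser.

```python
# Illustrative only: parse_gvcf_record and its keys are hypothetical names.
VARIANT_LINE = (
    "GL000192.1    546636    .    G    A,<NON_REF>    49.77    .    "
    "BaseQRankSum=1.026;ClippingRankSum=1.026;DP=4;MLEAC=1,0;MLEAF=0.500,0.00;"
    "MQ=49.02;MQ0=0;MQRankSum=-1.026;ReadPosRankSum=0.000    "
    "GT:AD:DP:GQ:PGT:PID:PL:SB    0/1:2,2,0:4:78:0|1:546636_G_A:78,0,87,84,93,177:1,1,1,1"
)
BLOCK_LINE = (
    "GL000192.1    546637    .    G    <NON_REF>    .    .    END=546645    "
    "GT:DP:GQ:MIN_DP:PL    0/0:4:11:4:0,12,99"
)


def parse_gvcf_record(line):
    """Split one GVCF data line into the fields discussed in this thread."""
    fields = line.split()  # assumption: no whitespace inside any field
    chrom, pos, ref, alt, info = (
        fields[0], int(fields[1]), fields[3], fields[4], fields[7]
    )
    alts = alt.split(",")

    # INFO is a ';'-separated list of KEY=VALUE pairs (or '.' when empty).
    info_map = {}
    for item in info.split(";"):
        if item != ".":
            key, _, value = item.partition("=")
            info_map[key] = value

    # A non-variant block lists <NON_REF> as its only ALT allele.
    is_block = alts == ["<NON_REF>"]
    # Without an END tag the record covers just its own position.
    end = int(info_map["END"]) if "END" in info_map else pos

    return {
        "chrom": chrom,
        "start": pos,
        "end": end,
        "ref": ref,
        "real_alts": [a for a in alts if a != "<NON_REF>"],
        "is_reference_block": is_block,
        "positions_covered": end - pos + 1,
    }


for record_line in (VARIANT_LINE, BLOCK_LINE):
    print(parse_gvcf_record(record_line))
```

Read this way, the second record is not a variant call at all: it is a single line summarising positions 546637 through 546645, where the sample looks homozygous reference.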


Thanks for following up!

19 months ago
Moe ▴ 10

In the GT field, 0 is the reference allele, 1 is the first alternative allele, 2 is the second alternative allele, and so on.

So a GT of 0/1 is heterozygous ref/alt, 0/0 is homozygous ref/ref, 1/1 is homozygous alt/alt, and 0/2 is heterozygous ref/second alt allele. In that last case you would find at least two alleles in the ALT column.

In the example you shared, that gives:

pos: 546636, GT: G/A
pos: 546637, GT: G/G
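The index-to-allele lookup described in this answer can be sketched in a few lines of Python. The helper name `decode_genotype` is hypothetical, and the sketch assumes a well-formed GT string whose indices are separated by `/` (unphased) or `|` (phased).

```python
import re


def decode_genotype(ref, alt, gt):
    """Translate GT allele indices (0 = REF, 1.. = ALT entries) into bases."""
    alleles = [ref] + alt.split(",")      # index 0 is always the REF allele
    separator = "|" if "|" in gt else "/"  # keep the phasing symbol as given
    called = []
    for index in re.split(r"[/|]", gt):
        # '.' marks a missing call and is passed through unchanged
        called.append("." if index == "." else alleles[int(index)])
    return separator.join(called)


print(decode_genotype("G", "A,<NON_REF>", "0/1"))  # G/A
print(decode_genotype("G", "<NON_REF>", "0/0"))    # G/G
```

Note that `<NON_REF>` still occupies an index in the ALT list (index 2 in the first record), even though neither genotype in the example refers to it.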
