Is this variant probably an artifact?
6.7 years ago
Sharon ▴ 610

Hi Everyone

I have an interesting, very rare variant that is found in affected individuals and absent from unaffected individuals and non-carriers. The variant is in a gene strongly associated with a very rare phenotype.
ANNOVAR annotates the variant with gnomAD = 0 and ExAC = ".". I looked it up in ExAC and it is not found there. However, when I look it up in gnomAD, it is listed as a filtered variant flagged AC0, so it is not a high-confidence genotype. As follows:

Filtered: RF and AC0
Allele Count: 0
Allele Number: 239250
Allele Frequency: 0

In my samples the variant is found with DP between 15 and 21, and all the other scores, such as FS and MQ, seem okay. As follows:

 GT:AD:DP:GQ:PL  0/1:7,14:21:99:383,0,13
 MQ=58.73;MQRankSum=0.871;QD=13.15;ReadPosRankSum=-0.989;SOR=1.630;
 ABHet=0.505;ABHom=1.00;AC=1;AF=0.00;AN=2;BaseQRankSum=1.875;DP=1987;Dels=0.00;FS=3.500
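
As a rough sanity check on the genotype fields above, the AD values can be turned into an ALT read fraction and compared with the 0.5 expected for a clean heterozygous call. This is only a minimal sketch: the helper names (`parse_sample`, `alt_fraction`, `binom_two_sided_p`) are made up for illustration, and an exact binomial test against 0.5 is just one simple, assumed way to judge allele balance at this depth.

```python
from math import comb

def parse_sample(format_str, sample_str):
    """Map FORMAT keys (e.g. GT:AD:DP:GQ:PL) onto one sample's values."""
    return dict(zip(format_str.split(":"), sample_str.split(":")))

def alt_fraction(ad_field):
    """Fraction of reads supporting ALT, from an AD value such as '7,14'."""
    ref_reads, alt_reads = (int(x) for x in ad_field.split(","))
    return alt_reads / (ref_reads + alt_reads)

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: sum of outcomes no more likely than k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    return min(1.0, sum(x for x in pmf if x <= pmf[k] * (1 + 1e-9)))

# Values copied from the sample column shown above.
sample = parse_sample("GT:AD:DP:GQ:PL", "0/1:7,14:21:99:383,0,13")
ref_reads, alt_reads = (int(x) for x in sample["AD"].split(","))

vaf = alt_fraction(sample["AD"])
p_value = binom_two_sided_p(alt_reads, ref_reads + alt_reads)
print(f"ALT fraction = {vaf:.3f}, binomial p vs 0.5 = {p_value:.3f}")
```

With 7 REF and 14 ALT reads the ALT fraction is 14/21, about 0.67. At a depth of 21 that skew away from 0.5 is not unusual under this test, so allele balance alone neither confirms nor rules out an artifact.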

Should I take the variant into account, or should I assume it is probably an artifact? Or could it be an interesting variant in a hard region, which would explain why it is not found in ExAC at all and is found in gnomAD only with low confidence? Thanks

VCF Exome Sequencing • 2.8k views

Did you look at the BAM file?


No. What should I look for to confirm it or rule it out?


In the BAM alignments (on IGV), you can gauge the general alignment depth of coverage around the variant, whether or not any of these alignments have low MAPQ or mis-matched mates, and whether it's being called in a repeat region, for example.
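
The same checks can be roughed out in code once the reads over the variant have been exported as SAM text (for example with `samtools view` on the region). The sketch below is an assumption-laden illustration, not a replacement for looking in IGV: `summarize_sam` is a made-up helper, the MAPQ cutoff of 20 is an arbitrary choice, and the three records are invented purely to exercise the function.

```python
def summarize_sam(lines, min_mapq=20):
    """Count low-MAPQ reads and suspicious mate pairing in SAM text records."""
    total = low_mapq = improper = mate_elsewhere = 0
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag, rname, mapq, rnext = int(fields[1]), fields[2], int(fields[4]), fields[6]
        total += 1
        if mapq < min_mapq:
            low_mapq += 1
        paired = bool(flag & 0x1)          # 0x1: read is paired
        if paired and not flag & 0x2:      # 0x2: mapped in a proper pair
            improper += 1
        if paired and rnext not in ("=", "*", rname):
            mate_elsewhere += 1            # mate aligned to another contig
    return {
        "reads": total,
        "low_mapq": low_mapq,
        "improper_pairs": improper,
        "mate_on_other_contig": mate_elsewhere,
    }

# Invented example records (QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL).
example = [
    "@SQ\tSN:chr1\tLN:1000000",
    "r1\t99\tchr1\t100\t60\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII",
    "r2\t147\tchr1\t200\t60\t10M\t=\t100\t-110\tACGTACGTAC\tIIIIIIIIII",
    "r3\t65\tchr1\t105\t3\t10M\tchr5\t5000\t0\tACGTACGTAC\tIIIIIIIIII",
]
summary = summarize_sam(example)
print(summary)
```

A high share of low-MAPQ reads, improper pairs, or mates on other contigs among the reads carrying the ALT allele would be a reason to distrust the call.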


So if it is in a repeat region, or everything around it has high coverage, I should discard it? And if everything around it has low coverage, then it is probably a hard region, so I should take it into account? Is this right? Thanks :)


A repeat region just makes it more difficult for the aligner to correctly position reads, which should reduce their MAPQ and, ultimately, the reliability of the variant call. Also, if it's a region that exhibits high sequence similarity to another region (or regions) of the genome, then this can also bias the variant call.

There are many genomic regions that remain difficult, if not impossible, to reliably sequence with current short-read technology.

I'm guessing that your region is indeed a repeat of some sort? If you look it up on UCSC Genome Browser, you can see if it's repeat-masked or not.
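
If you have the flanking reference sequence as text, the repeat-masking check can also be done without the browser, because UCSC soft-masked FASTA shows repeat-masked bases in lowercase. A minimal sketch, assuming a soft-masked sequence string and a 0-based offset of the variant within it; the function names and the example sequence are made up for illustration.

```python
def mask_status(sequence, offset):
    """Classify one base of a soft-masked sequence by its case."""
    base = sequence[offset]
    if base in "Nn":
        return "N"
    return "soft-masked repeat" if base.islower() else "unmasked"

def masked_fraction(sequence):
    """Fraction of the sequence that is lowercase (soft-masked)."""
    if not sequence:
        return 0.0
    return sum(1 for b in sequence if b.islower()) / len(sequence)

# Invented flank: uppercase = unmasked, lowercase = repeat-masked.
flank = "ACGTGGCCacacacacacGGCTA"
print(mask_status(flank, 10), mask_status(flank, 2))
print(f"masked fraction = {masked_fraction(flank):.2f}")
```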


Thanks so much, Kevin. I can't tell whether it is a repeat or not. And if it is, for some reason, a hard region with an unreliable call, should we discard it or keep it? Here it is: https://ibb.co/j2aJYS


Hmmm... all of the bases are N masked but that UCSC session is for hg38. Are your variant positions hg38 or hg19?


Oh, no. I added one for hg19: https://ibb.co/b1gLxn


If your variant is the middle base in that screenshot, i.e., C, then it's on a highly conserved base. High conservation is the best single predictor of pathogenicity.

Aside from that, the broader region appears to have a few short repeats of G and generally a high GC content, which would affect DNA melting and coverage over the region, but not necessarily affect the alignment that much. Here's my current UCSC session: https://genome-euro.ucsc.edu/cgi-bin/hgTracks?db=hg19&lastVi...

If you could additionally look at the alignments in IGV, that would help. If they are all flagged with a particular colour, then alarm bells should ring.
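
The GC content and short G runs mentioned for the broader region can be put into numbers with a few lines of code. This is only a sketch under assumptions: `gc_fraction` and `longest_run` are hypothetical helpers, and the example sequence is invented, not the actual flank of this variant.

```python
from itertools import groupby

def gc_fraction(sequence):
    """Fraction of called bases (A, C, G, T) that are G or C; N is ignored."""
    called = [b for b in sequence.upper() if b in "ACGT"]
    if not called:
        return 0.0
    return sum(b in "GC" for b in called) / len(called)

def longest_run(sequence, base):
    """Length of the longest uninterrupted run of `base` in the sequence."""
    target = base.upper()
    runs = (len(list(group)) for b, group in groupby(sequence.upper()) if b == target)
    return max(runs, default=0)

# Invented GC-rich example window.
window = "GGGGCCGCGGGACGGGGGCT"
gc = gc_fraction(window)
g_run = longest_run(window, "G")
print(f"GC fraction = {gc:.2f}, longest G run = {g_run}")
```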


Thanks Kevin so much. You always teach me something new. Much appreciated.


Thanks Pierre so much


I think that Pierre became busy! I was twiddling my thumbs...
