Hi community!
I'm annotating variants with VEP and I'm finding some unexpected transcript identifiers such as:
- NM_014938.4_dupl16
- NM_001170637.2_dupl3
1 206516261 . C T 47 PASS CSQ=T|non_coding_transcript_exon_variant|MODIFIER|SRGAP2|23380|Transcript|NM_001170637.2_dupl3|mRNA|1/20||NM_001170637.2_dupl3.1:n.65C>T||65|||||||1||SNV|EntrezGene|||||||||||C|C||||||||||||||||||||||||||||||||||||||||||||||||||||||||||1:206516261-206516261|0.4996565||||,
T|missense_variant|MODERATE|SRGAP2|23380|Transcript|NM_001170637.3|protein_coding|1/20||NM_001170637.3:c.65C>T|NP_001164108.1:p.Arg289Trp|864|865|289|R/W|Cgg/Tgg|||1||SNV|EntrezGene||||||NP_001164108.1|||||C|C|OK|||||||||||||||||||||||||||||||||||0.63580||||T|T||||||||||2|||||||1:206516261-206516261|0.4996565||||,
T|missense_variant|MODERATE|SRGAP2|23380|Transcript|NM_001300952.1|protein_coding|1/18||NM_001300952.1:c.65C>T|NP_001287881.1:p.Arg289Trp|864|865|289|R/W|Cgg/Tgg|||1||SNV|EntrezGene||||||NP_001287881.1|||||C|C|OK|||||||||||||||||||||||||||||||||||0.63580||||T|T||||||||||2|||||||1:206516261-206516261|0.4996565||||,
T|non_coding_transcript_exon_variant|MODIFIER|SRGAP2|23380|Transcript|NM_015326.3_dupl3|mRNA|1/20||NM_015326.3_dupl3.1:n.65C>T||65|||||||1||SNV|EntrezGene||YES|||||||||C|C||||||||||||||||||||||||||||||||||||||||||||||||||||||||||1:206516261-206516261|0.4996565||||,
T|missense_variant|MODERATE|SRGAP2|23380|Transcript|NM_015326.4|protein_coding|1/20||NM_015326.4:c.65C>T|NP_056141.2:p.Arg289Trp|864|865|289|R/W|Cgg/Tgg|||1||SNV|EntrezGene||YES||||NP_056141.2|||||C|C|OK|||||||||||||||||||||||||||||||||||0.63580||||T|T||||||||||2|||||||1:206516261-206516261|0.4996565|||| GT:DP:VD:AD:AF:RD:ALD 0/1:9:3:6,3:0.3333:6,0:3,0
Searching the VEP website and the internet, I can't find any reference to this kind of "dupl" suffix. Has anyone run into this? I don't know whether they are alternative versions of the transcript, or why they are not reported as transcripts in their own right.
Thanks in advance!
Cristian.
Edit: Added an example variant with the VEP "dupl" annotation (NM_015326.3_dupl3).
Edit2: Using Ensembl VEP version 91.1 with cache v91.
Could you post the variants (VCF records) that cause this annotation?
Also, which column of your VEP output are you finding this notation in?
Hi Emily, it's the parameter that references the transcript, the "Feature" column (I'm actually outputting in a VCF format).
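In case it helps anyone trace these, here is a rough Python sketch for pulling the Feature value out of each CSQ entry and splitting off a "_duplN" suffix. It assumes Feature is the 7th pipe-separated field (index 6), which matches the records above; in a real file the order should be read from the CSQ "Format:" description in the VCF header. The helper names (csq_features, split_dupl) are made up for illustration, and the sample INFO string is a shortened copy of the record in the question.

```python
import re

# Assumption: Feature is the 7th pipe-separated CSQ field (index 6),
# as in the records posted above. Check the CSQ header line of your VCF.
FEATURE_INDEX = 6

# Matches identifiers such as "NM_015326.3_dupl3" (base name + "_dupl" + number).
DUPL_SUFFIX = re.compile(r"^(?P<base>.+)_dupl(?P<n>\d+)$")


def csq_features(info, feature_index=FEATURE_INDEX):
    """Return the Feature value of every CSQ entry in a VCF INFO string."""
    for item in info.split(";"):
        if item.startswith("CSQ="):
            features = []
            for entry in item[len("CSQ="):].split(","):
                fields = entry.strip().split("|")
                # Skip malformed entries that are too short to hold a Feature.
                if len(fields) > feature_index:
                    features.append(fields[feature_index])
            return features
    return []


def split_dupl(feature):
    """Split "X_duplN" into (X, N); return (feature, None) if no suffix."""
    match = DUPL_SUFFIX.match(feature)
    if match:
        return match.group("base"), int(match.group("n"))
    return feature, None


# Shortened copy of the INFO column from the record in the question.
info = (
    "CSQ=T|non_coding_transcript_exon_variant|MODIFIER|SRGAP2|23380|"
    "Transcript|NM_015326.3_dupl3|mRNA|1/20,"
    "T|missense_variant|MODERATE|SRGAP2|23380|"
    "Transcript|NM_015326.4|protein_coding|1/20"
)

for feature in csq_features(info):
    print(split_dupl(feature))
```

This only flags and splits the suffix; it says nothing about what "dupl" means in the cache.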
Thanks, will try to trace.
Are you using GRCh37?
Yes, version 91 of GRCh37
I think NM_015326.3_dupl3.1 and the other entries mentioned in the OP are feature (transcript) names in that build. Variation Reporter for NC_000001.10:g.206516261C>T on GRCh37.p13 (AR-105, dbSNP v149) doesn't list a coding variant at position 65; it lists one at position 322 instead (NM_015326.4:c.322C>T), and it gives only one annotation instead of the two mentioned above.
I supposed it was something like that. What intrigues us is why they are named "duplXX". We thought they might be duplicates of other transcripts, or reference transcripts with duplicated exons, but seeing "dupl16" was really strange.