I am working on WGS annotation with VEP and am outputting MutationAssessor and Polyphen2 values from the dbNSFP plugin (among others). I've noticed that it gives me more than one output for several alleles. As an example, here is a VEP output line:
chr1 1535637 . C T 534.27 PASS AC=1;AF=0.5;AN=2;BaseQRankSum=-1.089;ClippingRankSum=0.876;DP=27;ExcessHet=3.0103;FS=13.64;MQ=60;MQ0=0;MQRankSum=1.83;QD=19.79;ReadPosRankSum=0.451;SOR=0.61;VQSLOD=15.92;culprit=MQ;CSQ=T|downstream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000339113|protein_coding||||||||||rs187039783|952|1|cds_start_NF|SNV|HGNC|HGNC:25567||2|||ENSP00000339421||H0Y2W2|UPI000059CF52|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|missense_variant|MODERATE|TMEM240|ENSG00000205090|Transcript|ENST00000378733|protein_coding|3/4||ENST00000378733.8:c.325G>A|ENSP00000368007.4:p.Gly109Ser|336|325|109|G/S|Ggc/Agc|rs187039783||-1||SNV|HGNC|HGNC:25186|YES|2|P1|CCDS44040.1|ENSP00000368007|Q5SV17||UPI0000418FB6|1|tolerated_low_confidence(0.11)|possibly_damaging(0.886)|hmmpanther:PTHR28666&Pfam_domain:PF15207||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||7|0.0013977635782747603||0.0|1|0.001440922190201729|2|0.001984126984126984|3|0.002982107355864811|1|0.0010224948875255625|24.4|0.001445086705202312|0.0021998742928975488|20|5.940e-04|1|5.453e-04|||14|7.533e-04|1|1.953e-03|||10|1.427e-03|2|2.555e-04|.&.|.&.|3.36||||||L&L|0.805&0.805|D&D|0.999951&0.999951|D&D|0.999&0.999|P&P|0.886&0.886|0.74867|0.443|||||||||||1.785837e-03||3.830671e-03|,T|downstream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000378755|protein_coding||||||||||rs187039783|952|1||SNV|HGNC|HGNC:25567|YES|2||CCDS31.1|ENSP00000368030|Q9NVI7||UPI000013D456|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|downst
ream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000378756|protein_coding||||||||||rs187039783|952|1||SNV|HGNC|HGNC:25567||1|P1|CCDS53259.1|ENSP00000368031|Q9NVI7||UPI000006CE90|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|missense_variant|MODERATE|TMEM240|ENSG00000205090|Transcript|ENST00000425828|protein_coding|4/5||ENST00000425828.1:c.325G>A|ENSP00000400311.1:p.Gly109Ser|368|325|109|G/S|Ggc/Agc|rs187039783||-1||SNV|HGNC|HGNC:25186||2|P1|CCDS44040.1|ENSP00000400311|Q5SV17||UPI0000418FB6|1|tolerated_low_confidence(0.11)|possibly_damaging(0.886)|Pfam_domain:PF15207&hmmpanther:PTHR28666||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||7|0.0013977635782747603||0.0|1|0.001440922190201729|2|0.001984126984126984|3|0.002982107355864811|1|0.0010224948875255625|24.4|0.001445086705202312|0.0021998742928975488|20|5.940e-04|1|5.453e-04|||14|7.533e-04|1|1.953e-03|||10|1.427e-03|2|2.555e-04|.&.|.&.|3.36||||||L&L|0.805&0.805|D&D|0.999951&0.999951|D&D|0.999&0.999|P&P|0.886&0.886|0.74867|0.443|||||||||||1.785837e-03||3.830671e-03|,T|downstream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000536055|protein_coding||||||||||rs187039783|950|1||SNV|HGNC|HGNC:25567||2||CCDS53260.1|ENSP00000439290|Q9NVI7||UPI000048B049|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|downstream_gene_variant|MODIFIER|TMEM240|ENSG00000205090|Transcript|ENST00000624426|protein_coding||||||||||rs187039783|3419|-1||SNV|HGNC|HGNC:25186||3|||ENSP00000485135||A0A096LNN7|UPI0004F2365B|1|||||0.0014|0|0.0014|0.002|0.003|0.
001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00000000199|open_chromatin_region||||||||||rs187039783||||SNV||||||||||||||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| GT:AB:AD:DP:GQ:PL 0/1:0.3:8,19:27:99:568,0,197
As you can see, for the second CSQ entry (the missense variant in ENST00000378733), MutationAssessor_pred=L&L and MutationAssessor_score=0.805&0.805. In the same way, Polyphen2_HDIV_pred=D&D, and the same happens for all the other MutationAssessor and Polyphen2 fields.
Does anybody know why?
Thank you very much in advance for any help!
Thank you so much @Ben_Ensembl for your answer. I don't understand which transcripts you're referring to. In the entry with multiple values for MutationAssessor, HumDiv etc., shouldn't the values refer only to ENSP00000368007.4?
No problem, very happy to help. I'd need to see the options you included in your command-line query and the full VEP output for this variant to give a more detailed explanation. However, this variant falls within the CDS of 2 different transcripts of the TMEM240 gene (TMEM240-201 and TMEM240-202): http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000205090;r=1:1535207-1540453;t=ENST00000425828;tl=rdX3s6Fq5SiFCEcH-149700
I suspect that the two values for MutationAssessor_score and Polyphen2_HDIV_pred are the predicted effects on the proteins encoded by these two transcripts.
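To check which transcripts carry the doubled values, it can help to split the CSQ string into one record per transcript. The following is a minimal sketch, not VEP's own parser. It assumes the field names are taken from the "Format:" part of the CSQ header line in your VCF, and that no field contains a literal comma or pipe. The function name parse_csq and the four-field layout in the example are hypothetical and shortened for illustration; the real CSQ string has many more fields.

```python
def parse_csq(csq_value, field_names):
    """Turn a CSQ value into a list of dicts, one per annotated feature.

    csq_value   -- the text after "CSQ=" in the INFO column
    field_names -- names from the "Format:" part of the CSQ header line
    """
    entries = []
    for entry in csq_value.split(","):      # one entry per transcript/feature
        values = entry.split("|")           # fields within an entry
        entries.append(dict(zip(field_names, values)))
    return entries


# Hypothetical, shortened field list and CSQ string (illustration only).
fields = ["Allele", "Consequence", "Feature", "MutationAssessor_score"]
csq = (
    "T|downstream_gene_variant|ENST00000339113|,"
    "T|missense_variant|ENST00000378733|0.805&0.805,"
    "T|missense_variant|ENST00000425828|0.805&0.805"
)

for record in parse_csq(csq, fields):
    if record["MutationAssessor_score"]:    # skip entries with no score
        print(record["Feature"], record["MutationAssessor_score"])
```

In this toy input, both TMEM240 transcripts report the same pair of values, which matches what is seen in the output line above.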
Best wishes
Ben Ensembl Helpdesk
Thank you again, Ben. Here is the code:
Hi cocchi.e89,
The dbNSFP plugin only returns the data stored in the requested column of the dbNSFP file. For example, when you ask for MutationAssessor_score, the plugin returns, for a given variant, the value stored in the MutationAssessor_score column of the dbNSFP file.
The readme file describes which values are expected for each column. If you search for MutationAssessor_score in https://drive.google.com/file/d/1HPcIEE1USiwCPzRFFwZhTZagka1uyUV1/view
you get: (MutationAssessor functional impact combined score (MAori). The score ranges from -5.17 to 6.49 in dbNSFP. Multiple entries are separated by ";", corresponding to Uniprot_entry.)
VEP replaces ";" with "&" because ";" is a special character in the VEP output (it separates the INFO fields in a VCF line).
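If you need the individual entries downstream, you can split the field on "&" yourself. Below is a minimal sketch, assuming that "." marks a missing entry (as in the ".&." values in the output line above); the helper names split_dbnsfp_values and to_floats are hypothetical and are not part of VEP or the plugin.

```python
def split_dbnsfp_values(raw):
    """Split a dbNSFP field as written by VEP into its individual entries.

    Entries are separated by "&" (";" in the dbNSFP file itself).
    A "." entry is treated as missing and returned as None (assumption).
    """
    if raw == "":
        return []
    return [None if value == "." else value for value in raw.split("&")]


def to_floats(values):
    """Convert score strings to floats, keeping None for missing entries."""
    return [None if value is None else float(value) for value in values]


scores = to_floats(split_dbnsfp_values("0.805&0.805"))
preds = split_dbnsfp_values("L&L")
print(scores)  # [0.805, 0.805]
print(preds)   # ['L', 'L']
```

Whether to collapse identical entries into one value, or keep them aligned with the corresponding Uniprot_entry entries, depends on your analysis; the sketch keeps them all.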
Best wishes
Ben Ensembl Helpdesk