VEP MutationAssessor output
4.7 years ago
cocchi.e89 ▴ 290

I am working on some WGS annotation with VEP, I am outputting MutationAssessor and Polyphen2 values from dbNSFP plugin (among others) and I've noticed it gives me more than a output for several alleles. As example, VEP output line: chr1 1535637 . C T 534.27 PASS AC=1;AF=0.5;AN=2;BaseQRankSum=-1.089;ClippingRankSum=0.876;DP=27;ExcessHet=3.0103;FS=13.64;MQ=60;MQ0=0;MQRankSum=1.83;QD=19.79;ReadPosRankSum=0.451;SOR=0.61;VQSLOD=15.92;culprit=MQ;CSQ=T|downstream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000339113|protein_coding||||||||||rs187039783|952|1|cds_start_NF|SNV|HGNC|HGNC:25567||2|||ENSP00000339421||H0Y2W2|UPI000059CF52|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|missense_variant|MODERATE|TMEM240|ENSG00000205090|Transcript|ENST00000378733|protein_coding|3/4||ENST00000378733.8:c.325G>A|ENSP00000368007.4:p.Gly109Ser|336|325|109|G/S|Ggc/Agc|rs187039783||-1||SNV|HGNC|HGNC:25186|YES|2|P1|CCDS44040.1|ENSP00000368007|Q5SV17||UPI0000418FB6|1|tolerated_low_confidence(0.11)|possibly_damaging(0.886)|hmmpanther:PTHR28666&Pfam_domain:PF15207||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||7|0.0013977635782747603||0.0|1|0.001440922190201729|2|0.001984126984126984|3|0.002982107355864811|1|0.0010224948875255625|24.4|0.001445086705202312|0.0021998742928975488|20|5.940e-04|1|5.453e-04|||14|7.533e-04|1|1.953e-03|||10|1.427e-03|2|2.555e-04|.&.|.&.|3.36||||||L&L|0.805&0.805|D&D|0.999951&0.999951|D&D|0.999&0.999|P&P|0.886&0.886|0.74867|0.443|||||||||||1.785837e-03||3.830671e-03|,T|downstream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000378755|protein_coding||||||||||rs187039783|952|1||SNV|HGNC|HGNC:25567|YES|2||CCDS31.1|ENSP00000368030|Q9NVI7||UPI000013D456|1
|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|downstream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000378756|protein_coding||||||||||rs187039783|952|1||SNV|HGNC|HGNC:25567||1|P1|CCDS53259.1|ENSP00000368031|Q9NVI7||UPI000006CE90|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|missense_variant|MODERATE|TMEM240|ENSG00000205090|Transcript|ENST00000425828|protein_coding|4/5||ENST00000425828.1:c.325G>A|ENSP00000400311.1:p.Gly109Ser|368|325|109|G/S|Ggc/Agc|rs187039783||-1||SNV|HGNC|HGNC:25186||2|P1|CCDS44040.1|ENSP00000400311|Q5SV17||UPI0000418FB6|1|tolerated_low_confidence(0.11)|possibly_damaging(0.886)|Pfam_domain:PF15207&hmmpanther:PTHR28666||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||7|0.0013977635782747603||0.0|1|0.001440922190201729|2|0.001984126984126984|3|0.002982107355864811|1|0.0010224948875255625|24.4|0.001445086705202312|0.0021998742928975488|20|5.940e-04|1|5.453e-04|||14|7.533e-04|1|1.953e-03|||10|1.427e-03|2|2.555e-04|.&.|.&.|3.36||||||L&L|0.805&0.805|D&D|0.999951&0.999951|D&D|0.999&0.999|P&P|0.886&0.886|0.74867|0.443|||||||||||1.785837e-03||3.830671e-03|,T|downstream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000536055|protein_coding||||||||||rs187039783|950|1||SNV|HGNC|HGNC:25567||2||CCDS53260.1|ENSP00000439290|Q9NVI7||UPI000048B049|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|
downstream_gene_variant|MODIFIER|TMEM240|ENSG00000205090|Transcript|ENST00000624426|protein_coding||||||||||rs187039783|3419|-1||SNV|HGNC|HGNC:25186||3|||ENSP00000485135||A0A096LNN7|UPI0004F2365B|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00000000199|open_chromatin_region||||||||||rs187039783||||SNV||||||||||||||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| GT:AB:AD:DP:GQ:PL 0/1:0.3:8,19:27:99:568,0,197 as you can see for the second allele the MutationAssessor_pred=L&L, MutationAssessor_score=0.805&0.805, at the same manner Polyphen2_HDIV_pred=D&D and the same for all the rest of MutationAssessor and Polyphen2 fields

Does anybody know why?

Thank you very much in advance for any help!

vep MutationAssessor variant human • 1.4k views
4.7 years ago
Ben Moore ★ 2.4k

Hi cocchi.e89,

It would help if you could also share the command-line options you used, but I think the two MutationAssessor_score values and the two Polyphen2_HDIV_pred values refer to two specific transcripts of the TMEM240 gene, in which this variant is a missense variant.

You can see this from the query I ran with your example variant through the VEP web tool: http://www.ensembl.org/Homo_sapiens/Tools/VEP/Results?tl=HApTReGZ2Rrj3bEe-148783

Best wishes

Ben Ensembl Helpdesk


Thank you so much @Ben_Ensembl for your answer. I don't follow which transcripts you're referring to. In the entry with multiple values for MutationAssessor, HumDiv, etc., shouldn't the transcript refer only to ENSP00000368007.4?


No problem, very happy to help. I'd need to see the options you included in your command-line query and the full VEP output for this variant to give a more detailed explanation, but this variant falls within the CDS of two different transcripts of the TMEM240 gene (TMEM240-201 and TMEM240-202): http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000205090;r=1:1535207-1540453;t=ENST00000425828;tl=rdX3s6Fq5SiFCEcH-149700

I suspect that the two values for MutationAssessor_score and Polyphen2_HDIV_pred are the predicted effects on the proteins encoded by these two transcripts.
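One way to check which transcripts carry the missense annotation is to split the CSQ value yourself: commas separate the per-feature blocks, and "|" separates the fields inside each block. Below is a minimal Python sketch, not an official VEP tool. The field list CSQ_FIELDS is a shortened, hypothetical one used for illustration; the real order is given by the "Format:" string in the CSQ header line of your own output. The CSQ string is cut down to the first seven fields of three blocks from the record in the question.

```python
# Hypothetical, shortened field list (assumption for illustration only).
# The real order is in the "##INFO=<ID=CSQ,...Format: ...>" header line
# of your VEP output.
CSQ_FIELDS = [
    "Allele", "Consequence", "IMPACT", "SYMBOL",
    "Gene", "Feature_type", "Feature",
]


def parse_csq(csq_value, fields=CSQ_FIELDS):
    """Split a CSQ value into one dict per annotated feature."""
    records = []
    for block in csq_value.split(","):   # one block per transcript/feature
        values = block.split("|")        # fields within a block
        records.append(dict(zip(fields, values)))
    return records


# First seven fields of three blocks from the record in the question.
csq = (
    "T|downstream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000339113,"
    "T|missense_variant|MODERATE|TMEM240|ENSG00000205090|Transcript|ENST00000378733,"
    "T|missense_variant|MODERATE|TMEM240|ENSG00000205090|Transcript|ENST00000425828"
)

# Transcripts in which the variant is annotated as missense.
missense = [
    r["Feature"] for r in parse_csq(csq)
    if r["Consequence"] == "missense_variant"
]
print(missense)
```

With the real header-derived field list, the same loop would let you print the dbNSFP columns next to each transcript ID and see which blocks they are attached to.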

Best wishes

Ben Ensembl Helpdesk


Thank you again, Ben. Here is the command:

tar xfz /sbgenomics/Projects/4fa81929-f890-4a8f-b6e8-bfb73afd7229/annotation_files/homo_sapiens_vep_90_GRCh38.tar.gz -C /opt/ensembl-vep/ && \
perl -I /root/.vep/Plugins/ /opt/ensembl-vep/vep \
  --offline \
  --dir /opt/ensembl-vep \
  --buffer_size 20000 \
  --plugin dbNSFP,/sbgenomics/Projects/4fa81929-f890-4a8f-b6e8-bfb73afd7229/annotation_files/dbNSFPv4.0a_custombuild.gz,LRT_score,LRT_pred,GERP++_RS,MutationTaster_pred,MutationTaster_score,MutationAssessor_pred,MutationAssessor_score,FATHMM_score,FATHMM_pred,1000Gp3_AC,1000Gp3_AF,1000Gp3_AFR_AC,1000Gp3_AFR_AF,1000Gp3_EUR_AC,1000Gp3_EUR_AF,1000Gp3_AMR_AC,1000Gp3_AMR_AF,1000Gp3_EAS_AC,1000Gp3_EAS_AF,1000Gp3_SAS_AC,1000Gp3_SAS_AF,UK10K_AF,ESP6500_AA_AF,ESP6500_EA_AF,gnomAD_exomes_POPMAX_AF,gnomAD_exomes_POPMAX_nhomalt,gnomAD_genomes_POPMAX_AF,gnomAD_genomes_POPMAX_nhomalt,GTEx_V7_gene,GTEx_V7_tissue,Geuvadis_eQTL_target_gene,Polyphen2_HDIV_score,Polyphen2_HDIV_pred,Polyphen2_HVAR_score,Polyphen2_HVAR_pred,ExAC_AC,ExAC_AF,ExAC_Adj_AF,ExAC_AFR_AC,ExAC_AFR_AF,ExAC_AMR_AC,ExAC_AMR_AF,ExAC_EAS_AC,ExAC_EAS_AF,ExAC_FIN_AC,ExAC_FIN_AF,ExAC_NFE_AC,ExAC_NFE_AF,ExAC_SAS_AC,ExAC_SAS_AF,REVEL_score,REVEL_rankscore,clinvar_id,clinvar_clnsig,clinvar_trait,clinvar_review,clinvar_hgvs,clinvar_var_source,clinvar_MedGen_id,clinvar_OMIM_id,clinvar_Orphanet_id,CADD_phred,ExAC_Adj_AC \
  --plugin dbscSNV,/sbgenomics/Projects/4fa81929-f890-4a8f-b6e8-bfb73afd7229/annotation_files/dbscSNV.txt.gz \
  --fasta /sbgenomics/Projects/4fa81929-f890-4a8f-b6e8-bfb73afd7229/annotation_files/GRCh38_full_analysis_set_plus_decoy_hla.phiX174.fa \
  --fork 8 \
  --input_file /sbgenomics/workspaces/4fa81929-f890-4a8f-b6e8-bfb73afd7229/tasks/781f7770-0059-4623-886d-2de6b0fd6673/VT_preprocessing/RPTEC.recalibrated.haplotypeCalls.processed.vcf.gz \
  --output_file RPTEC.recalibrated.haplotypeCalls.processed.vep.vcf \
  --everything \
  --no_stats \
  --keep_csq \
  --hgvs \
  --vcf \
  --format vcf

Hi cocchi.e89,

The dbNSFP plugin simply returns the data stored in the requested columns of the dbNSFP file. For example, when you ask for MutationAssessor_score, it returns, for a given variant, the value stored in the MutationAssessor_score column of the dbNSFP file.

The readme file describes which values to expect in each column. If you search for MutationAssessor_score in https://drive.google.com/file/d/1HPcIEE1USiwCPzRFFwZhTZagka1uyUV1/view you get:

"MutationAssessor functional impact combined score (MAori). The score ranges from -5.17 to 6.49 in dbNSFP. Multiple entries are separated by ";", corresponding to Uniprot_entry."

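VEP replaces ; with & because ; is a special character in the VEP output (in VCF it separates the INFO fields). The two values you see, e.g. 0.805&0.805, are therefore the multiple dbNSFP entries for this variant.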
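If you want to work with these values downstream, you can split them again on "&". Here is a minimal Python sketch, assuming the prediction and score fields have the same number of entries in the same order. The helper names split_multi and pair_predictions are made up for this example and are not part of VEP or the dbNSFP plugin.

```python
def split_multi(value):
    """Split a VEP-encoded multi-entry value ('&' stands in for dbNSFP's ';')."""
    return value.split("&") if value else []


def pair_predictions(pred, score):
    """Pair each prediction with its score; '.' (missing) becomes None."""
    preds = split_multi(pred)
    scores = [
        float(s) if s not in (".", "") else None
        for s in split_multi(score)
    ]
    # Assumption: both fields list their entries in the same order.
    return list(zip(preds, scores))


# A dbNSFP value with two entries, and the same value as it appears in VEP output.
raw = "0.805;0.805"
encoded = raw.replace(";", "&")

# MutationAssessor_pred / MutationAssessor_score from the record in the question.
pairs = pair_predictions("L&L", encoded)
print(pairs)  # [('L', 0.805), ('L', 0.805)]
```

Note that this only undoes the separator change. Working out which entry belongs to which protein still requires the Uniprot_entry column mentioned in the readme.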

Best wishes

Ben Ensembl Helpdesk
