Hi,
How can I annotate or compare the variants I have got in my vcf file that include the gene name to known mutations in the same gene but different chr location for instance from OMIM? For example I have searched manually on OMIM for MED25 gene, this is the output:
Gene-Phenotype Relationships Location Phenotype Phenotype MIM number Inheritance Phenotype mapping key 19q13.33 ?Charcot-Marie-Tooth disease, type 2B2 605589 AR 3 Basel-Vanagait-Smirin-Yosef syndrome 616449 AR 3
And my file is a multi vcf :
chr19 50321587 rs114843375;138201 A G 6630.07 PASS AC=10;A_F=0.052;AN=192;BaseQRankSum=1.74;ClippingRankSum=0.069;DP=4747;FS=1.147;GQ_MEAN=141.68;GQ_STDDEV=124.09;InbreedingCoeff=0.1560;MLEAC=10;MLEA_F=0.052;MQ=57.41;MQ0=0;MQRankSum=0.789;NCC=0;POSITIVE_TRAIN_SITE;QD=15.07;ReadPosRankSum=0.548;SOR=0.593;VQSLOD=5.04;culprit=MQ;EFF=UTR_5_PRIME(MODIFIER||12||MED25|protein_coding|CODING|NM_030973.3|1);AF=4.645e-03;COMMON=1;AF_ESP=0.0112;AF_EXAC=0.00869;AF_TGP=0.0102;ALLELEID=141904;CLNDISDB=MedGen:CN169374;CLNDN=not_specified;CLNHGVS=NC_000019.9:g.50321587A>G;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Benign;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=MED25:81857;MC=SO:0001623|5_prime_UTR_variant;ORIGIN=1;RS=114843375 GT:AD:DP:GQ:PL 0/0:42,0:42:99:0,103,1320 0/0:48,0:48:99:0,120,1800
The output I want is to add the Phenotype MIM number in the end of the row:
chr19 50321587 rs114843375;138201 A G 6630.07 PASS AC=10;A_F=0.052;AN=192;BaseQRankSum=1.74;ClippingRankSum=0.069;DP=4747;FS=1.147;GQ_MEAN=141.68;GQ_STDDEV=124.09;InbreedingCoeff=0.1560;MLEAC=10;MLEA_F=0.052;MQ=57.41;MQ0=0;MQRankSum=0.789;NCC=0;POSITIVE_TRAIN_SITE;QD=15.07;ReadPosRankSum=0.548;SOR=0.593;VQSLOD=5.04;culprit=MQ;EFF=UTR_5_PRIME(MODIFIER||12||MED25|protein_coding|CODING|NM_030973.3|1);AF=4.645e-03;COMMON=1;AF_ESP=0.0112;AF_EXAC=0.00869;AF_TGP=0.0102;ALLELEID=141904;CLNDISDB=MedGen:CN169374;CLNDN=not_specified;CLNHGVS=NC_000019.9:g.50321587A>G;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Benign;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=MED25:81857;MC=SO:0001623|5_prime_UTR_variant;ORIGIN=1;RS=114843375 GT:AD:DP:GQ:PL 0/0:42,0:42:99:0,103,1320 0/0:48,0:48:99:0,120,1800 605589, 616449
How can I approach this?
Thank you,
Pola
Hi @Pierre Lindenbaum,
I have generated a file through http://www.ensembl.org/biomart (like you suggested)
The file is :
My file:
I wanted to get this output:
I have tried to run the following command:
This is the output that I get:
I am missing this omimID 605589 Can you please help me to overcome this? Thank you very much, Pola