Dear all,
I'm struggling to find COSMIC database information after annotating a VCF with VEP. The command I used is:
vep -i Mutect2_unfiltered_10643_vs_2434.vcf.gz \
  -o Mutect2_unfiltered_10643_vs_2434_VEP.ann.vcf \
  --assembly GRCh38 \
  --species homo_sapiens \
  --offline --cache --cache_version 99 \
  --dir_cache /.vep --everything --filter_common \
  --fork 4 --format vcf --per_gene --stats_file Mutect2_unfiltered_10643_vs_2434_VEP.summary.html \
  --total_length --vcf
I get back an annotated VCF file, but I can't find any COSMIC information in it. Is there a way to filter for COSMIC entries only?
Thank you in advance,
Youssef
Thank you Emily, I'll try and let you know how it worked! :)
Dear Emily, I filtered my vcf file using this command:
I obtained the result in the table below, but I'm still confused about how to interpret this VCF file.
**filtering_status=These calls have been filtered by FilterMutectCalls to label false positives with a list of failed filters and true positives with PASS.
**normal_sample=2682
**source=FilterMutectCalls
**source=Mutect2
**tumor_sample=10643
**VEP="v99" time="2021-02-10 15:29:22" cache="/hpcshare/genomics/sarek_analyses/work/b6/54109e400c87375cc7a5d169ec2bc2/vep_cache/homo_sapiens/99_GRCh38" ensembl-funcgen=99.0832337 ensembl=99.d3e7d31 ensembl-variation=99.642e1cd ensembl-io=99.441b05b 1000genomes="phase3" COSMIC="90" ClinVar="201909" ESP="V2-SSA137" HGMD-PUBLIC="20184" assembly="GRCh38.p13" dbSNP="153" gencode="GENCODE 33" genebuild="2014-07" gnomAD="r2.1" polyphen="2.2.2" regbuild="1.0" sift="sift5.2.2"
**INFO=-ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene|Feature_type|Feature|BIOTYPE|EXON|INTRON|HGVSc|HGVSp|cDNA_position|CDS_position|Protein_position|Amino_acids|Codons|Existing_variation|DISTANCE|STRAND|FLAGS|VARIANT_CLASS|SYMBOL_SOURCE|HGNC_ID|CANONICAL|MANE|TSL|APPRIS|CCDS|ENSP|SWISSPROT|TREMBL|UNIPARC|GENE_PHENO|SIFT|PolyPhen|DOMAINS|miRNA|HGVS_OFFSET|AF|AFR_AF|AMR_AF|EAS_AF|EUR_AF|SAS_AF|AA_AF|EA_AF|gnomAD_AF|gnomAD_AFR_AF|gnomAD_AMR_AF|gnomAD_ASJ_AF|gnomAD_EAS_AF|gnomAD_FIN_AF|gnomAD_NFE_AF|gnomAD_OTH_AF|gnomAD_SAS_AF|MAX_AF|MAX_AF_POPS|CLIN_SIG|SOMATIC|PHENO|PUBMED|MOTIF_NAME|MOTIF_POS|HIGH_INF_POS|MOTIF_SCORE_CHANGE"-
*CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 10643 2682
chr1 16534 . C T . map_qual;normal_artifact;panel_of_normals AS_FilterStatus=map_qual;AS_SB_TABLE=38,71|1,12;DP=126;ECNT=2;GERMQ=29;MBQ=32,30;MFRL=305,438;MMQ=23,23;MPOS=29;NALOD=-9.266e+00;NLOD=7.26;PON;POPAF=0.305;TLOD=9.60;CSQ=T|downstream_gene_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000450305|transcribed_unprocessed_pseudogene||||||||||rs15642|2864|1||SNV|HGNC|HGNC:37102||||||||||||||||||||||||||||||||||||||||||,T|downstream_gene_variant|MODIFIER|DDX11L1|ENSG00000223972|Transcript|ENST00000456328|processed_transcript||||||||||rs15642|2125|1||SNV|HGNC|HGNC:37102|YES||1|||||||||||||||||||||||||||||||||||||||,T|intron_variant&non_coding_transcript_variant|MODIFIER|WASH7P|ENSG00000227232|Transcript|ENST00000488147|unprocessed_pseudogene||8/10|ENST00000488147.1:n.1067+73G>A|||||||rs15642||-1||SNV|HGNC|HGNC:38034|YES|||||||||||||||||||||||||||||||||||||||||,T|downstream_gene_variant|MODIFIER|MIR6859-1|ENSG00000278267|Transcript|ENST00000619216|miRNA||||||||||rs15642|835|1||SNV|HGNC|HGNC:50039|YES|||||||||||||||||||||||||||||||||||||||||,T|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00000344266|CTCF_binding_site||||||||||rs15642||||SNV|||||||||||||||||||||||||||||||||||||||||||| GT:AD:AF:DP:F1R2:F2R1:SB 0/1:8,4:0.416:12:5,3:3,1:3,5,0,4 0/0:101,9:0.090:110:53,5:46,3:35,66,1,8
(I replaced # with * and < > with - to avoid formatting issues.)
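For reading the CSQ field shown above, here is a minimal Python sketch, not an official VEP tool. It takes the sub-field names from the "Format:" part of the CSQ header line, splits each CSQ entry on "|", and collects identifiers from the Existing_variation sub-field that look like COSMIC IDs. It assumes that COSMIC identifiers start with COSM, COSV or COSN and that multiple identifiers within one sub-field are joined with "&", as in VEP's VCF output. The function names and the shortened example header and record are made up for illustration.

```python
# Sketch only: the function names and the example data are hypothetical.
# Assumption: COSMIC identifiers start with COSM, COSV or COSN.
COSMIC_PREFIXES = ("COSM", "COSV", "COSN")


def csq_fields(header_line):
    """Return the CSQ sub-field names listed after 'Format: ' in the header."""
    fmt = header_line.split("Format: ", 1)[1]
    # Cut at the closing quote of the Description string.
    fmt = fmt.split('"', 1)[0]
    return fmt.strip().split("|")


def parse_csq(info, fields):
    """Yield one dict per comma-separated CSQ entry in a VCF INFO string."""
    for item in info.split(";"):
        if item.startswith("CSQ="):
            for entry in item[len("CSQ="):].split(","):
                yield dict(zip(fields, entry.split("|")))


def cosmic_ids(info, fields):
    """Return the sorted COSMIC-like IDs found in Existing_variation."""
    found = set()
    for record in parse_csq(info, fields):
        for ident in record.get("Existing_variation", "").split("&"):
            if ident.startswith(COSMIC_PREFIXES):
                found.add(ident)
    return sorted(found)


# Shortened, made-up header and INFO string, for illustration only.
example_header = (
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
    'from Ensembl VEP. Format: Allele|Consequence|Existing_variation|SOMATIC">'
)
example_info = (
    "DP=126;CSQ=T|missense_variant|rs1&COSV123|0&1,"
    "T|downstream_gene_variant|rs1|"
)

fields = csq_fields(example_header)
print(cosmic_ids(example_info, fields))
```

In the record pasted above, Existing_variation contains only rs15642, so a scan like this would find no COSMIC identifier for that site.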
I don't know whether this has changed, but for COSMIC intellectual-property reasons VEP was not able to check alleles: it matched on locus only, without allele matching. If allele matching is what you want, it is probably better to write your own annotator.
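As a rough idea of what such an annotator could look like, the sketch below indexes a COSMIC VCF by chromosome, position, REF and ALT, so that a lookup only hits when the allele matches too. It assumes you have obtained a COSMIC VCF under its licence, that a leading "chr" is the only difference in chromosome naming between the two files, and that both files represent variants the same way (no indel normalisation is attempted). All names and the example lines are hypothetical.

```python
# Sketch only: names and example data are hypothetical, not a COSMIC or VEP API.

def strip_chr(chrom):
    """Drop a leading 'chr' so that 'chr1' and '1' compare equal."""
    return chrom[3:] if chrom.startswith("chr") else chrom


def build_index(vcf_lines):
    """Map (chrom, pos, ref, alt) to the list of IDs from the ID column."""
    index = {}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, vid, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):  # one key per ALT allele
            key = (strip_chr(chrom), int(pos), ref, alt)
            index.setdefault(key, []).append(vid)
    return index


def lookup(index, chrom, pos, ref, alt):
    """Return the IDs whose position AND alleles match, else an empty list."""
    return index.get((strip_chr(chrom), int(pos), ref, alt), [])


# Made-up COSMIC-style lines, for illustration only.
cosmic_lines = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\tCOSV1\tC\tT\t.\t.\t.",
    "1\t100\tCOSV2\tC\tG\t.\t.\t.",
]
index = build_index(cosmic_lines)
print(lookup(index, "chr1", 100, "C", "T"))  # same locus and same allele
print(lookup(index, "chr1", 100, "C", "A"))  # same locus, different allele
```

With real data you would read the COSMIC file line by line (for example via gzip.open in text mode) instead of using a list.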