I am currently working on some Ion Torrent exome data, and I realized that the VCF produced by the in-house Ion Reporter software contains CNVs, SNVs and MNVs in a single file. I do not yet have an Ion Torrent account, so I cannot get hold of the raw files or work with the tool itself; I have only received the (somatic) VCFs. I tried to separate the variant types with vcftools, but it does not seem to work. Below is my command:
vcftools --vcf IonXpress_001_somatic_v5.0.vcf --remove-filtered CNV --out IonXpress_001_somatic_SNVs
Log details
VCFtools - v0.1.9.0
(C) Adam Auton 2009
Parameters as interpreted:
--vcf IonXpress_001_somatic_v5.0.vcf
--out IonXpress_001_somatic_SNVs
--remove-filtered CNV
VCF index is older than VCF file. Will regenerate.
Building new index file.
Scanning Chromosome: chr1
Scanning Chromosome: chr2
Scanning Chromosome: chr3
Scanning Chromosome: chr4
Scanning Chromosome: chr5
Scanning Chromosome: chr6
Scanning Chromosome: chr7
Scanning Chromosome: chr8
Scanning Chromosome: chr9
Scanning Chromosome: chr10
Scanning Chromosome: chr11
Scanning Chromosome: chr12
Scanning Chromosome: chr13
Scanning Chromosome: chr14
Scanning Chromosome: chr15
Scanning Chromosome: chr16
Scanning Chromosome: chr17
Scanning Chromosome: chr18
Scanning Chromosome: chr19
Scanning Chromosome: chr20
Scanning Chromosome: chr21
Scanning Chromosome: chr22
Scanning Chromosome: chrX
Writing Index file.
File contains 7789 entries and 2 individuals.
Applying Required Filters.
Filtering sites by FILTER Status.
After filtering, kept 2 out of 2 Individuals
After filtering, kept 7789 out of a possible 7789 Sites
Run Time = 1.00 seconds
A CNV record from the input VCF:
chr1 68928 . T <CNV> 100.0 PASS PRECISE=FALSE;SVTYPE=CNV;END=10684538;LEN=10615610;NUMTILES=3786;CONFIDENCE=0;PRECISION=1908.24;FUNC=[{'gene':'OR4F5'},{'gene':'LOC729737'},{'gene':'LOC100133331'},{'gene':'RP4-669L17.10'},{'gene':'OR4F16'},{'gene':'OR4F3'},{'gene':'OR4F29'},{'gene':'MIR6723'},{'gene':'LOC100288069'},{'gene':'FAM87B'},{'gene':'LINC00115'},{'gene':'LINC01128'},{'gene':'FAM41C'},{'gene':'LOC100130417'},{'gene':'SAMD11'},{'gene':'NOC2L'},{'gene':'KLHL17'},{'gene':'PLEKHN1'},{'gene':'PERM1'},{'gene':'HES4'},{'gene':'ISG15'},{'gene':'AGRN'},{'gene':'RNF223'},{'gene':'C1orf159'},{'gene':'RP11-465B22.5'},{'gene':'MIR200B'},{'gene':'MIR200A'},{'gene':'MIR429'},{'gene':'TTLL10'},{'gene':'TNFRSF18'},{'gene':'TNFRSF4'},{'gene':'SDF4'},{'gene':'B3GALT6'},{'gene':'FAM132A'},{'gene':'UBE2J2'},{'gene':'SCNN1D'},{'gene':'ACAP3'},{'gene':'MIR6726'},{'gene':'PUSL1'},{'gene':'CPSF3L'},{'gene':'MIR6727'},{'gene':'GLTPD1'},{'gene':'TAS1R3'},{'gene':'DVL1'},{'gene':'MIR6808'},{'gene':'MXRA8'},{'gene':'AURKAIP1'},{'gene':'CCNL2'},{'gene':'LOC148413'},{'gene':'MRPL20'},{'gene':'ANKRD65'},{'gene':'TMEM88B'},{'gene':'VWA1'},{'gene':'ATAD3C'},{'gene':'ATAD3B'},{'gene':'ATAD3A'},{'gene':'TMEM240'},{'gene':'SSU72'},{'gene':'C1orf233'},{'gene':'MIB2'},{'gene':'MMP23B'},{'gene':'MMP23A'},{'gene':'CDK11B'},{'gene':'SLC35E2B'},{'gene':'CDK11A'},{'gene':'SLC35E2'},{'gene':'NADK'},{'gene':'GNB1'},{'gene':'CALML6'},{'gene':'TMEM52'},{'gene':'KIAA1751'},{'gene':'GABRD'},{'gene':'PRKCZ'},{'gene':'C1orf86'},{'gene':'SKI'},{'gene':'MORN1'},{'gene':'LOC100129534'},{'gene':'RER1'},{'gene':'PEX10'},{'gene':'PLCH2'},{'gene':'PANK4'},{'gene':'HES5'},{'gene':'LOC115110'},{'gene':'LOC100133445'},{'gene':'TNFRSF14'},{'gene':'FAM213B'},{'gene':'MMEL1'},{'gene':'TTC34'},{'gene':'ACTRT2'},{'gene':'LINC00982'},{'gene':'PRDM16'},{'gene':'MIR4251'},{'gene':'ARHGEF16'},{'gene':'MEGF6'},{'gene':'MIR551A'},{'gene':'TPRG1L'},{'gene':'WRAP73'},{'gene':'TP73'},{'gene':'TP73-AS1'},{'gene':'CCDC27'},{'gene':'SMI
M1'},{'gene':'LRRC47'},{'gene':'CEP104'},{'gene':'DFFB'},{'gene':'C1orf174'},{'gene':'LINC01134'},{'gene':'RP13-614K11.1'},{'gene':'RP5-1166F10.1'},{'gene':'AJAP1'},{'gene':'MIR4417'},{'gene':'MIR4689'},{'gene':'NPHP4'},{'gene':'KCNAB2'},{'gene':'CHD5'},{'gene':'RPL22'},{'gene':'RNF207'},{'gene':'ICMT'},{'gene':'LINC00337'},{'gene':'HES3'},{'gene':'GPR153'},{'gene':'ACOT7'},{'gene':'HES2'},{'gene':'ESPN'},{'gene':'MIR4252'},{'gene':'TNFRSF25'},{'gene':'PLEKHG5'},{'gene':'NOL9'},{'gene':'TAS1R1'},{'gene':'ZBTB48'},{'gene':'KLHL21'},{'gene':'PHF13'},{'gene':'THAP3'},{'gene':'DNAJC11'},{'gene':'LOC100505887'},{'gene':'CAMTA1'},{'gene':'VAMP3'},{'gene':'PER3'},{'gene':'UTS2'},{'gene':'TNFRSF9'},{'gene':'PARK7'},{'gene':'ERRFI1'},{'gene':'SLC45A1'},{'gene':'RERE'},{'gene':'ENO1'},{'gene':'MIR6728'},{'gene':'ENO1-AS1'},{'gene':'CA6'},{'gene':'SLC2A7'},{'gene':'SLC2A5'},{'gene':'GPR157'},{'gene':'MIR34A'},{'gene':'H6PD'},{'gene':'SPSB1'},{'gene':'LOC100506022'},{'gene':'SLC25A33'},{'gene':'TMEM201'},{'gene':'PIK3CD'},{'gene':'C1orf200'},{'gene':'CLSTN1'},{'gene':'CTNNBIP1'},{'gene':'LZIC'},{'gene':'NMNAT1'},{'gene':'RBP7'},{'gene':'UBE4B'},{'gene':'KIF1B'},{'gene':'PGD'},{'gene':'APITD1-CORT'},{'gene':'APITD1'},{'gene':'CORT'},{'gene':'DFFA'},{'gene':'PEX14'}] GT:GQ:CN ./.:0:2 ./.:.:.
I have shown just one CNV record. It carries a lot of information, but how can I remove such records from the VCF and keep only the SNVs and indels? I do not want to use awk and grep here, because I would then have to recreate the VCF format myself. Is there a tool that can do this and write a VCF as output? vcftools does not seem to work here. Any help will be appreciated. Thanks.
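From http://vcftools.sourceforge.net/man_latest.html: --remove-filtered excludes sites by their FILTER flag.
There is no FILTER named "CNV" in your VCF file; the FILTER column of that record is PASS (but there is a symbolic ALT allele named <CNV>). That is why all 7789 of 7789 sites were kept.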
Yes, my mistake. I have to filter on column 5 (ALT) or on the INFO field. I will look at other options now. Thanks for pointing out the mistake.
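For anyone landing here later, the idea of filtering on column 5 (ALT) or the INFO field can be sketched in a few lines of Python. This is only a minimal illustration, not a replacement for a dedicated VCF tool. It assumes an uncompressed, tab-delimited VCF (the record pasted above shows spaces only because of how the forum rendered it). The function name drop_cnv_records is hypothetical, and the SNV line in the sample is made up for the demo. Header lines are passed through untouched, so the output is still a VCF and nothing has to be recreated by hand.

```python
import io


def drop_cnv_records(lines):
    """Yield VCF lines, skipping records that look like CNV calls.

    A record is treated as a CNV if its ALT column (column 5) contains the
    symbolic allele <CNV>, or if its INFO column (column 8) has SVTYPE=CNV.
    """
    for line in lines:
        # Keep every header line ("##..." meta lines and the "#CHROM" line).
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            # Not a complete record; pass it through unchanged.
            yield line
            continue
        alt_alleles = fields[4].split(",")
        info_entries = fields[7].split(";")
        if "<CNV>" in alt_alleles or "SVTYPE=CNV" in info_entries:
            continue  # drop the CNV record
        yield line


# Tiny in-memory demo. The CNV line is abbreviated from the record above;
# the SNV line is invented purely for illustration.
sample = (
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t68928\t.\tT\t<CNV>\t100.0\tPASS\tPRECISE=FALSE;SVTYPE=CNV;END=10684538\n"
    "chr1\t69511\t.\tA\tG\t50.0\tPASS\tDP=100\n"
)
kept = list(drop_cnv_records(io.StringIO(sample)))
print("".join(kept), end="")
```

On a real file, the same generator can be fed an open file handle and its output written with writelines() to a new .vcf file. Filtering on SVTYPE as well as ALT is a belt-and-braces choice here; either test alone would catch the record shown above.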