Hi, I have recently been introduced to bioinformatics, variant calling, and the analysis that goes with it. I have obtained several VCF files from FreeBayes and wanted to run them through snpEff as well, so that I could annotate the variants between two genomes of the genus Shewanella. Sadly, I have not been able to do this because I keep getting the same error over and over. Here is the command and its output, including the error:
(base) binso@LAPTOP-P73O7IPS:~/Alignment/VariantCalling$ java -jar ../../miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/snpEff.jar Shewanella_putrefaciens_cn_32 ../../miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/snpEff.config -v -s VC_SPt_ST2-3D_6.vcf>VC_snpEff.vcf
00:00:00 SnpEff version SnpEff 4.3t (build 2017-11-24 10:18), by Pablo Cingolani
00:00:00 Command: 'ann'
00:00:00 Reading configuration file 'snpEff.config'. Genome: 'Shewanella_putrefaciens_cn_32'
00:00:00 Reading config file: /home/binso/Alignment/VariantCalling/snpEff.config
00:00:00 Reading config file: /home/binso/miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/snpEff.config
00:00:00 done
00:00:00 Reading database for genome version 'Shewanella_putrefaciens_cn_32' from file '/home/binso/miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/./data/Shewanella_putrefaciens_cn_32/snpEffectPredictor.bin' (this might take a while)
00:00:01 done
00:00:01 Loading Motifs and PWMs
00:00:01 Building interval forest
00:00:02 done.
00:00:02 Genome stats :
#-----------------------------------------------
# Genome name : 'Shewanella_putrefaciens_cn_32'
# Genome version : 'Shewanella_putrefaciens_cn_32'
# Genome ID: 'Shewanella_putrefaciens_cn_32[0]'
# Has protein coding info: true
# Has Tr. Support Level info : true
# Genes : 4331
# Protein coding gene: 4171
#-----------------------------------------------
# Transcripts: 4331
# Avg. transcripts per gene: 1.00
# TSL transcripts: 0
#-----------------------------------------------
# Checked transcripts:
# AA sequences : 3972 ( 95.23% )
# DNA sequences : 0 ( 0.00% )
#-----------------------------------------------
# Protein coding transcripts : 4171
#Length errors : 182 ( 4.36% )
#STOP codons in CDS errors : 29 ( 0.70% )
#START codon errors : 535 ( 12.83% )
#STOP codon warnings : 17 ( 0.41% )
#UTR sequences : 0 ( 0.00% )
#Total Errors : 540 ( 12.95% )
# WARNING: No protein coding transcript has UTR
#-----------------------------------------------
# Cds : 3972
#Exons: 4331
# Exons with sequence : 4331
# Exons without sequence : 0
# Avg. exons per transcript : 1.00
#-----------------------------------------------
# Number of chromosomes : 1
# Chromosomes : Format 'chromo_name size codon_table'
# 'Chromosome' 4659220 Standard
#-----------------------------------------------
00:00:02 Predicting variants
VcfFileIterator.parseVcfLine(132): Fatal error reading file '../../miniconda3/pkgs/snpeff-4.3.1t-3/share/snpeff-4.3.1t-3/snpEff.config' (line: 17):
data.dir = ./data/
java.lang.RuntimeException: java.lang.RuntimeException: Impropper VCF entry: Not enough fields (missing tab separators?).
data.dir = ./data/
at org.snpeff.fileIterator.VcfFileIterator.parseVcfLine(VcfFileIterator.java:133)
at org.snpeff.fileIterator.VcfFileIterator.readNext(VcfFileIterator.java:184)
at org.snpeff.fileIterator.VcfFileIterator.readNext(VcfFileIterator.java:57)
at org.snpeff.fileIterator.FileIterator.hasNext(FileIterator.java:123)
at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.annotateVcf(SnpEffCmdEff.java:467)
at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.annotate(SnpEffCmdEff.java:142)
at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:1029)
at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:984)
at org.snpeff.SnpEff.run(SnpEff.java:1183)
at org.snpeff.SnpEff.main(SnpEff.java:162)
Caused by: java.lang.RuntimeException: Impropper VCF entry: Not enough fields (missing tab separators?).
data.dir = ./data/
at org.snpeff.vcf.VcfEntry.parse(VcfEntry.java:1007)
at org.snpeff.vcf.VcfEntry.<init>(VcfEntry.java:219)
at org.snpeff.fileIterator.VcfFileIterator.parseVcfLine(VcfFileIterator.java:130)
... 9 more
00:00:02 Logging
00:00:03 Checking for updates...
00:00:05 Done.
I'm not sure if this is due to the snpEff.config file or my VCF file. Any possible tip or recommendation would be highly appreciated.
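One way to narrow this down: the error message quotes line 17 of snpEff.config (`data.dir = ./data/`) as the "VCF entry" it could not parse, which suggests snpEff was reading the config file as its variant input rather than choking on the VCF itself. The sketch below is only a rough check, not part of snpEff: `check_vcf_fields` is a hypothetical helper that lists data lines with fewer than the 8 tab-separated fields a VCF record needs. The commented command at the end assumes snpEff 4.3's usual options (`-c` for the config file, `-s` for the HTML summary); `$SNPEFF` and `VC_snpEff_summary.html` are placeholder names.

```shell
# Hypothetical helper: report data lines with fewer than 8 tab-separated fields.
# A VCF record needs at least CHROM POS ID REF ALT QUAL FILTER INFO.
check_vcf_fields() {
    # $1: path to the file that will be handed to snpEff as the VCF input
    awk -F'\t' '
        /^#/ || /^[[:space:]]*$/ { next }   # skip header and blank lines
        NF < 8 { printf "line %d: %d field(s)\n", NR, NF; bad = 1 }
        END    { exit (bad ? 1 : 0) }       # non-zero exit if any line was short
    ' "$1"
}

# Usage with the file from the question:
#   check_vcf_fields VC_SPt_ST2-3D_6.vcf && echo "field counts look fine"
#
# If the VCF passes, the argument order is the more likely culprit.
# Assumed corrected call (SNPEFF = the snpeff-4.3.1t-3 share directory):
#   java -jar "$SNPEFF/snpEff.jar" ann -v -c "$SNPEFF/snpEff.config" \
#       -s VC_snpEff_summary.html Shewanella_putrefaciens_cn_32 \
#       VC_SPt_ST2-3D_6.vcf > VC_snpEff.vcf
```

Run against snpEff.config instead of the VCF, the helper flags the same `data.dir` line that appears in the stack trace, which is consistent with the config path having been taken as the input file.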
Thank you for replying. I have tried that with the following command and it seems to work alright:
It keeps going for a while, so I just pasted some of the first lines.
Can you please try _without_ --header-only?
...It prints the whole input file like this: