Dear All,
I have been given many VCF files generated with GATK, and I am trying to filter them using the filter option of SnpSift.
Before filtering, I validated the VCF file with vcf_validator. It reported the following lines:
INFO field at chr1:25639411 .. Could not validate the float [NaN]
INFO field at chr5:84948114 .. Could not validate the float [NaN]
....
INFO field at chrX:119015903 .. Could not validate the float [NaN]
I learnt from others that this is not a big issue for downstream analysis of the VCF files. Do I need to take any steps to fix it?
I am filtering on the following conditions:
- Filter : PASS
- Genotype: NonRef (0/1, 1/1, 1/2, and so on)
command: cat inputvcffile.vcf.gz | java -jar SnpSift.jar filter "FILTER = 'PASS'" | java -jar SnpSift.jar filter "isVariant( GEN[0] ) | isHet( GEN[0] )" | bgzip -c > outputvcf.vcf.gz
Error:
VcfFileIterator.parseVcfLine(114): Fatal error reading file '-' (line: 1):
^_<8B>^H^D^@^@^@^@^@<ff>^F^@BC^B^@06<ed>}<eb>S#ɱ<ef>g<fc>W(<ec>^O<d7>ESCGh<ea><fd>И{C^C^LK<98>^A^F<d8><f1><f5>u8^T^Z^P<a0><b3>B<u+0092><d8><d9><f1>_^?2<eb>Ѫ<96><aa><d5>
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Impropper VCF entry: Not enough fields (missing tab separators?).
^_<8B>^H^D^@^@^@^@^@<ff>^F^@BC^B^@06<ed>}<eb>S#ɱ<ef>g<fc>W(<ec>^O<d7>ESCGh<ea><fd>И{C^C^LK<98>^A^F<d8><f1><f5>u8^T^Z^P<a0><b3>B<u+0092><d8><d9><f1>_^?2<eb>Ѫ<96><aa><d5>
    at ca.mcgill.mcb.pcingola.fileIterator.VcfFileIterator.parseVcfLine(VcfFileIterator.java:115)
    at ca.mcgill.mcb.pcingola.fileIterator.VcfFileIterator.readNext(VcfFileIterator.java:166)
    at ca.mcgill.mcb.pcingola.fileIterator.VcfFileIterator.readNext(VcfFileIterator.java:56)
    at ca.mcgill.mcb.pcingola.fileIterator.FileIterator.hasNext(FileIterator.java:123)
    at ca.mcgill.mcb.pcingola.snpSift.SnpSiftCmdFilter.run(SnpSiftCmdFilter.java:295)
    at ca.mcgill.mcb.pcingola.snpSift.SnpSiftCmdFilter.run(SnpSiftCmdFilter.java:269)
    at ca.mcgill.mcb.pcingola.snpSift.SnpSift.run(SnpSift.java:372)
    at ca.mcgill.mcb.pcingola.snpSift.SnpSift.main(SnpSift.java:70)
Caused by: java.lang.RuntimeException: Impropper VCF entry: Not enough fields (missing tab separators?).
^_<8B>^H^D^@^@^@^@^@<ff>^F^@BC^B^@06<ed>}<eb>S#ɱ<ef>g<fc>W(<ec>^O<d7>ESCGh<ea><fd>И{C^C^LK<98>^A^F<d8><f1><f5>u8^T^Z^P<a0><b3>B<u+0092><d8><d9><f1>_^?2<eb>Ѫ<96><aa><d5>
    at ca.mcgill.mcb.pcingola.vcf.VcfEntry.parse(VcfEntry.java:888)
    at ca.mcgill.mcb.pcingola.vcf.VcfEntry.<init>(VcfEntry.java:165)
    at ca.mcgill.mcb.pcingola.fileIterator.VcfFileIterator.parseVcfLine(VcfFileIterator.java:112)
    ... 7 more
cat: write error: Broken pipe
/var/spool/torque/mom_priv/jobs/5800932.SC: line 19: vcf: command not found
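A note on the error above: the unreadable text reported for "line: 1" begins with ^_<8B>, which is how the gzip magic bytes 0x1f 0x8b are displayed. That suggests SnpSift is being handed the still-compressed stream, since cat passes the .vcf.gz through unchanged. The following is only a minimal sketch of one way to guard against that, assuming standard head, od, tr and gzip are available; is_gzip and vcf_cat are hypothetical helper names, not part of SnpSift or GATK.

```shell
# Hypothetical helper: succeeds if the file starts with the gzip magic bytes 1f 8b.
# bgzip output (BGZF) is gzip-compatible, so it starts with the same two bytes.
is_gzip() {
    [ "$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')" = "1f8b" ]
}

# Hypothetical helper: write the VCF to stdout as plain text,
# decompressing only when the input is actually compressed.
vcf_cat() {
    if is_gzip "$1"; then
        gzip -dc "$1"
    else
        cat "$1"
    fi
}

# Intended use with the original pipeline (not run here):
# vcf_cat inputvcffile.vcf.gz \
#   | java -jar SnpSift.jar filter "FILTER = 'PASS'" \
#   | java -jar SnpSift.jar filter "isVariant( GEN[0] ) | isHet( GEN[0] )" \
#   | bgzip -c > outputvcf.vcf.gz
```

With this in place the first SnpSift process should see tab-separated text rather than compressed bytes. Whether that resolves the whole error is not confirmed here; the final "line 19: vcf: command not found" message looks like a separate problem in the job script itself.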