Hi all, I'm trying to use the SnpSift extractFields command (SnpSift 4.1g, build 2015-05-17) to convert an mpileup gVCF file into a tab-delimited text file. I have been able to do this with previous files, but those were not gVCF (GATK HaplotypeCaller) files.
I have also tried writing the table with the GATK VariantsToTable tool, but it does not accept SnpEff fields. I did not add the SnpEff annotations myself; they come from a colleague.
The SnpSift command I use:
java -jar ~/void/tools/snpEff/SnpSift.jar extractFields -s "," -e "." grep.vcf CHROM POS ID REF ALT "ANN[*].GENE" "ANN[*].IMPACT" "ANN[*].EFFECT" >out.txt
I get this error:
Cannot find 'Description' in info line: '##FORMAT=<ID=AD,Number=.,Type=Integer,Description=Allelic depths for the ref and alt alleles in the order listed>
Oddly, I do get output if I extract only CHROM POS ID REF ALT. But I also want the extra annotations that were added with SnpEff.
My VCF header is too big to post on Biostars, but I can email it directly to anyone who thinks they can help fix the output table problem. I emailed the developer about it 4 days ago but haven't had a response.
Thanks very much, Tesa
Hi Pierre, thanks for this. I think you are on to something. This didn't fix it, but now I get a new error that apparently points to the next offending line.
I hope there aren't a bazillion of these lines. Any idea how to find offenders more efficiently than one-by-one?
Cheers, Tesa
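One possible way to list every offender in a single pass, as a rough sketch rather than a tested fix: the error above shows a header line whose Description value is not wrapped in double quotes, which the VCF format expects. Assuming that missing quote is what SnpSift is rejecting, the script below prints each `##` header line where `Description=` is not followed by `"`. The function name and regular expression are made up for illustration and are not part of SnpSift or GATK.

```python
import re
import sys

# Assumption: the parser fails on header lines where the Description value
# is not enclosed in double quotes. This pattern matches "Description="
# that is NOT immediately followed by a double quote.
UNQUOTED_DESCRIPTION = re.compile(r'Description=(?!")')


def find_unquoted_descriptions(lines):
    """Return (line_number, text) for each '##' header line with an unquoted Description."""
    offenders = []
    for number, line in enumerate(lines, start=1):
        if not line.startswith("##"):
            # Meta-information lines come first and all start with "##";
            # stop at the #CHROM line so variant records are never scanned.
            break
        if UNQUOTED_DESCRIPTION.search(line):
            offenders.append((number, line.rstrip("\n")))
    return offenders


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_offenders.py grep.vcf
    with open(sys.argv[1]) as handle:
        for number, text in find_unquoted_descriptions(handle):
            print(f"{number}\t{text}")
```

Each printed line number could then be fixed by hand or with a search-and-replace that adds the quotes. If the header has other kinds of problems, this check will not catch them.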