10.5 years ago
devenvyas
I have a list of SNPs from an old Illumina array. I've concatenated and filtered two sets of VCF files (http://cdna.eva.mpg.de/neandertal/altai/AltaiNeandertal/VCF/ and http://cdna.eva.mpg.de/denisova/VCF/hg19_1000g/) from two diploid genomes I am trying to work with. I am now trying to merge the two VCFs into a single file using
vcf-merge AltaiNea.recode.vcf.gz Denisovan.recode.vcf.gz | bgzip -c > isec.vcf.gz
but I am getting the errors below. I need help understanding 1) what this text actually means and 2) how to fix the error. I have tried using vcf-isec instead, but I get the same or similar errors.
gzip: stdout: Broken pipe
Using column name 'DenisovaPinky' for Denisovan.recode.vcf.gz:DenisovaPinky
Could not determine the ploidy (nals=1, nvals=3). (TODO: ploidy bigger than 2)
3
at /apps/vcftools/0.1.11/lib/perl5/site_perl/Vcf.pm line 177, <__ANONIO__> line 2.
Vcf::throw('Vcf4_1=HASH(0x1a9a1e0)', 'Could not determine the ploidy (nals=1, nvals=3). (TODO: ploi...', 3) called at /apps/vcftools/0.1.11/lib/perl5/site_perl/Vcf.pm line 2408
VcfReader::guess_ploidy('Vcf4_1=HASH(0x1a9a1e0)', 1, 3) called at /apps/vcftools/0.1.11/lib/perl5/site_perl/Vcf.pm line 1764
VcfReader::parse_AGtags('Vcf4_1=HASH(0x1a9a1e0)', 'HASH(0x18c3528)') called at /apps/vcftools/0.1.11/bin/vcf-merge line 461
main::merge_vcf_files('HASH(0x1598108)') called at /apps/vcftools/0.1.11/bin/vcf-merge line 12
Thanks!
-Deven
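One possible reading of the "Could not determine the ploidy (nals=1, nvals=3)" message, offered as an assumption rather than a confirmed description of Vcf.pm: the parser seems to be comparing the number of alleles at a site (nals) with the number of comma-separated values in a per-genotype FORMAT field (nvals). For a haploid call one would expect nvals = nals, and for a diploid call nvals = nals(nals+1)/2, so a site with only the REF allele and three values fits neither. The sketch below applies that arithmetic to find such records. The tag name `PL` and the helper names `expected_ploidy` and `find_suspect_records` are hypothetical choices for illustration; the thread does not say which tag triggers the error.

```python
from math import comb  # requires Python 3.8+


def expected_ploidy(n_alleles, n_values):
    """Return 1 or 2 if n_values fits a haploid or diploid per-genotype field, else None."""
    if n_values == n_alleles:
        return 1
    if n_values == comb(n_alleles + 1, 2):  # n_alleles * (n_alleles + 1) / 2
        return 2
    return None


def find_suspect_records(lines, tag="PL"):
    """Yield (chrom, pos, sample_index, n_alleles, n_values) where the counts fit neither ploidy.

    `tag` is an assumed per-genotype FORMAT tag; change it to match the files.
    """
    for line in lines:
        if line.startswith("#"):
            continue  # skip header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # no sample columns
        alt = cols[4]
        # REF always counts as one allele; ALT "." means no alternate allele.
        n_alleles = 1 + (0 if alt == "." else len(alt.split(",")))
        fmt = cols[8].split(":")
        if tag not in fmt:
            continue
        idx = fmt.index(tag)
        for sample_index, sample in enumerate(cols[9:]):
            fields = sample.split(":")
            if idx >= len(fields) or fields[idx] == ".":
                continue  # missing value for this sample
            n_values = len(fields[idx].split(","))
            if expected_ploidy(n_alleles, n_values) is None:
                yield cols[0], cols[1], sample_index, n_alleles, n_values
```

To try it on one of the compressed files, something like `find_suspect_records(gzip.open("AltaiNea.recode.vcf.gz", "rt"))` should work, since bgzip output is readable as gzip. If the reported positions all have `.` in the ALT column, that would be consistent with the reading above, but it would not prove it.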
Do the VCF files individually pass vcf-validator? My guess would be not.
Do you know of any fix? I did not generate the VCFs on my own. I just downloaded them from the Max Planck.
Can you post a link to them? That'd make it easier to determine exactly what's causing this (though I have my guesses).
Here and here
I downloaded them, filtered them down based on the list of SNPs on my array, and concatenated them. I then ran one last step to filter out both non-biallelic sites and sites labeled LowQual.
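For readers following along, the last filtering step could be sketched roughly as below. This is not the command that was actually run, which the post does not show. It assumes "non-biallelic" means more than one ALT allele and that LowQual appears in the FILTER column; the `drop_invariant` switch and the function names are hypothetical.

```python
def is_kept(line, drop_invariant=False):
    """Return True if a VCF data line has at most one ALT allele and is not LowQual."""
    cols = line.rstrip("\n").split("\t")
    alt, filt = cols[4], cols[6]
    n_alt = 0 if alt == "." else len(alt.split(","))
    if n_alt > 1:
        return False  # multi-allelic site
    if n_alt == 0 and drop_invariant:
        return False  # optional: also drop sites with no ALT allele
    if "LowQual" in filt.split(";"):
        return False  # FILTER may hold several ';'-separated labels
    return True


def filter_vcf(lines, drop_invariant=False):
    """Yield header lines unchanged and only those data lines that pass is_kept."""
    for line in lines:
        if line.startswith("#") or is_kept(line, drop_invariant):
            yield line
```

Whether sites with no ALT allele should count as "biallelic" is a judgment call, which is why it is left as a switch here.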
I ran vcf-validator before and after that last step.
I got error lines like this back
and this
The list is much longer for the Neanderthal than the Denisovan.
I am currently validating all the original files to see whether the errors were introduced by one of my steps or were present from the beginning.
UPDATE: The original files are starting to come out of vcf-validator, and they are all failing. What do I do?