Hi
I am using vcf-merge to join to vcf files, one contain the data of 56 individuals and the other of only one. One of my files have 524754 SNPs and the other 105878 however when I join them I obtain just a few SNPs ~10000 SNPs and I obtain the next message error:
Wrong number of values in LANMLR17B064:250550525/PL at 1:21709201 .. nAlleles=4, nValues=3.
Expected 10 values for diploid genotypes or 4 for haploid genotypes.
at /LUSTRE/storage/data/software/vcftools/lib/perl5/site_perl/Vcf.pm line 172, <$__ANONIO__> line 10372.
Vcf::throw(Vcf4_0=HASH(0x7b12a8), "Wrong number of values in LANMLR17B064:250550525/PL at 1:2170"...) called at /LUSTRE/storage/data/software/vcftools/lib/perl5/site_perl/Vcf.pm line 1767
VcfReader::parse_AGtags(Vcf4_0=HASH(0x7b12a8), HASH(0xb4d080)) called at /LUSTRE/storage/data/software/vcftools/bin/vcf-merge line 464
main::merge_vcf_files(HASH(0xa4ea10)) called at /LUSTRE/storage/data/software/vcftools/bin/vcf-merge line 12
Do you know what is happening here? Why I obtain just a few SNPs in the merged file?
Posting the lines containing '1:21709201' (for example, search for it using grep) could help - sounds like that SNP has some weird alleles in your two files
Ok thanks, I removed some individuals that I suspect have sequencing errors but the problem is the same, here are the first lines of my two files that I am merging:
1 10045 S1_10045 G C . PASS DP=36 GT:AD:DP:GQ:PL ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 0/0:1,0:1:66:0,3,36 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 0/0:2,0:2:79:0,6,72 ./.:0,0:0 ./.:0,0:0 0/0:1,0:1:66:0,3,36 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 0/0:1,0:1:66:0,3,36 0/0:2,0:2:79:0,6,72 0/0:2,0:2:79:0,6,72 ./.:0,0:0 0/0:2,0:2:79:0,6,72 ./.:0,0:0 ./.:0,0:0 0/0:2,0:2:79:0,6,72 0/0:3,0:3:88:0,9,108 0/0:1,0:1:66:0,3,36 ./.:0,0:0 0/0:1,0:1:66:0,3,36 0/0:1,0:1:66:0,3,36 1/1:0,1:1:66:36,3,0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 0/0:1,0:1:66:0,3,36 0/0:3,0:3:88:0,9,108 0/0:2,0:2:79:0,6,72 1/1:0,2:2:79:72,6,0 ./.:0,0:0 ./.:0,0:0 1/1:0,1:1:66:36,3,0 1/1:0,1:1:66:36,3,0 0/0:1,0:1:66:0,3,36 ./.:0,0:0 ./.:0,0:0 0/0:1,0:1:66:0,3,36 0/0:2,0:2:79:0,6,72 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 0/0:1,0:1:66:0,3,36 0/0:1,0:1:66:0,3,36 ./.:0,0:0
1 157465 S1_157465 C T . PASS DP=42 GT:AD:DP:GQ:PL ./.:0,0:0 0/0:1,0:1:66:0,3,36 ./.:0,0:0 0/0:1,0:1:66:0,3,36 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 0/0:1,0:1:66:0,3,36 0/0:2,0:2:79:0,6,72 ./.:0,0:0 0/0:1,0:1:66:0,3,36 0/0:1,0:1:66:0,3,36 1/1:0,1:1:66:36,3,0 ./.:0,0:0 ./.:0,0:0 1/1:0,1:1:66:36,3,0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 0/0:1,0:1:66:0,3,36 ./.:0,0:0 0/0:1,0:1:66:0,3,36 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 0/0:2,0:2:79:0,6,72 0/0:1,0:1:66:0,3,36 0/0:1,0:1:66:0,3,36 0/0:2,0:2:79:0,6,72 ./.:0,0:0 ./.:0,0:0 1/1:0,1:1:66:36,3,0 ./.:0,0:0 0/0:1,0:1:66:0,3,36 0/0:2,0:2:79:0,6,72 0/0:1,0:1:66:0,3,36 ./.:0,0:0 0/0:3,0:3:88:0,9,108 0/0:1,0:1:66:0,3,36 1/1:0,3:3:88:108,9,0 ./.:0,0:0 ./.:0,0:0 1/1:0,1:1:66:36,3,0 1/1:0,1:1:66:36,3,0 0/0:3,0:3:88:0,9,108 0/0:1,0:1:66:0,3,36 0/0:1,0:1:66:0,3,36 0/0:1,0:1:66:0,3,36 ./.:0,0:0 1/1:0,3:3:88:108,9,0 0/0:1,0:1:66:0,3,36 ./.:0,0:0 ./.:0,0:0 0/0:1,0:1:66:0,3,36 ./.:0,0:0
1 2064 . C CTTT 622.73 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=-0.347;ClippingRankSum=-1.807;DP=94;ExcessHet=3.0103;FS=4.769;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.619;QD=10.04;ReadPosRankSum=1.882;SOR=0.450 GT:AD:DP:GQ:PL 0/1:35,27:62:99:660,0,1325
1 2179 . T C 2109.77 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=1.590;ClippingRankSum=-1.191;DP=119;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=-0.548;QD=17.73;ReadPosRankSum=3.280;SOR=0.633 GT:AD:DP:GQ:PL 0/1:49,70:119:99:2138,0,1187
Do you still have the same error involving that particular SNP at 21709201? What do the files print there?
Ok, I got it,
I have this:
1 21709201 S1_21710020 CA NT,NG,N . PASS DP=341 GT:AD:DP:GQ:PL 0/0:5,0,0,0:5:96:0,15,180 0/1:3,2,0,0:5:99:57,0,93 0/0:4,0,0,0:4:94:0,12,144 0/1:2,2,0,0:4:99:60,0,60 0/0:9,0,0,0:9:99:0,27,255 0/0:4,0,0,0:4:94:0,12,144 ./.:0,0,0,0:0 0/0:1,0,0,0:1:66:0,3,36 0/0:3,0,0,0:3:88:0,9,108 0/1:5,3,0,0:8:99:84,0,156 0/0:4,0,0,0:4:94:0,12,144 ./.:0,0,0,0:0 0/1:3,1,0,0:4:99:24,0,96 0/0:5,0,0,0:5:96:0,15,180 0/0:6,0,0,0:6:98:0,18,216 0/0:4,0,0,0:4:94:0,12,144 0/0:7,0,0,0:7:99:0,21,252 0/0:4,0,0,0:4:94:0,12,144 0/0:2,0,0,0:2:79:0,6,72 0/0:3,0,0,0:3:88:0,9,1080/0:2,0,0,0:2:79:0,6,72 0/0:2,0,0,0:2:79:0,6,72 0/0:6,0,0,1:7:98:0,18,216 0/0:4,0,0,0:4:94:0,12,144 0/1:3,1,0,0:4:99:24,0,960/0:3,0,0,0:3:88:0,9,108 0/0:4,0,0,0:4:94:0,12,144 0/3:3,0,0,1:4:88:0,9,108 ./.:0,0,0,0:0 1/1:0,4,0,0:4:94:144,12,0 0/0:1,0,0,0:1:66:0,3,36 0/0:3,0,0,0:3:88:0,9,108 0/0:2,0,0,0:2:79:0,6,72 0/0:4,0,0,0:4:94:0,12,144 0/0:3,0,0,0:3:88:0,9,108 1/0:3,6,0,0:9:99:189,0,81 0/3:1,0,0,1:2:66:0,3,36 1/0:1,2,0,0:3:99:63,0,27 ./.:0,0,0,0:0 0/0:5,0,0,0:5:96:0,15,180 0/0:3,0,0,0:3:88:0,9,108 0/1:1,1,0,0:2:99:30,0,30 0/0:1,0,0,0:1:66:0,3,36 0/0:5,0,0,0:5:96:0,15,180 0/1:3,1,0,0:4:99:24,0,96 1/1:1,6,0,0:7:96:195,0,15 ./.:0,0,0,0:0 1/0:1,2,0,0:3:99:63,0,27 ./.:0,0,0,0:0 2/2:0,0,1,0:1:33:0,0,0 0/0:6,0,0,0:6:98:0,18,216 0/1:1,1,0,0:2:99:30,0,30 0/0:4,0,0,0:4:94:0,12,144 0/0:3,0,0,0:3:88:0,9,108 1/1:0,1,0,0:1:66:36,3,0 0/0:3,0,0,0:3:88:0,9,108 1/2:0,3,2,0:5:88:108,9,0 0/0:2,0,0,0:2:79:0,6,72 0/0:2,0,0,0:2:79:0,6,720/0:6,0,0,0:6:98:0,18,216 0/0:2,0,0,0:2:79:0,6,72 0/1:8,4,0,0:12:99:108,0,252 0/0:3,0,0,0:3:88:0,9,108 0/0:5,0,0,0:5:96:0,15,180 0/0:2,0,0,0:2:79:0,6,72 1/0:2,3,0,0:5:99:93,0,57 0/0:2,0,0,0:2:79:0,6,72 0/3:2,0,0,2:4:79:0,6,72 0/0:9,0,0,0:9:99:0,27,255 0/0:3,0,0,0:3:88:0,9,108 0/0:9,0,0,0:9:99:0,27,255 0/0:5,0,0,0:5:96:0,15,180 1/0:1,5,0,0:6:98:162,0,18 1/0:1,2,0,0:3:99:63,0,27 0/0:8,0,0,0:8:99:0,24,255 0/0:4,0,0,0:4:94:0,12,144 0/0:3,0,0,0:3:88:0,9,108 0/0:10,0,0,0:10:99:0,30,255 0/0:3,0,0,0:3:88:0,9,108 0/0:7,0,0,1:8:99:0,21,252 0/0:4,0,0,0:4:94:0,12,144 ./.:0,0,0,0:0 0/1:2,2,0,0:4:99:60,0,60 0/0:5,0,0,0:5:96:0,15,180 0/0:5,0,0,0:5:96:0,15,180 0/0:3,0,0,0:3:88:0,9,108 0/0:3,0,0,0:3:88:0,9,108 0/0:3,0,0,0:3:88:0,9,108
and for LANMLR17B064:250550525 in this line I have
0/3:1,0,0,1:2:66:0,3,36
Also this problems could be because my vcf files have different versions one is VCFv4.0 and the other VCFv4.2. What do you think? I also know that in the GT filed we could have 0/0, 0/1 and 1/1 but in this case we have 0/3 that is weird right?
Is the 0/3 in the second file for a different alternative allele? In the first file, 0/3 means CA/N, what does it mean in the second file?
Ok, thanks.
In V4.0:
And in v4.2 the definition is the same:
So I guess that I am not going to have problems with that.
At the end I use bcftools and I removed the PL and GQ fields. For the rest It looks that the data is the same and can be combined. What do you think?