Hi! I want to use Beagle version 5.1 only to impute my missing genotypes (no reference panel needed). I used .ped and .map files to create my .vcf file with plink1.09b. My file 394.RMV-UNKNOW-SEXUAL-Filtered.vcf looks like this:
##contig=<ID=29,length=51502869>
##INFO=<ID=PR,Number=0,Type=Flag,Description="Provisional reference allele, may not be based on real refere$
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 1_V_UK561500501863_plate1_A01 2_V$
1 16947 BovineHD0100000005 C T . . PR GT 0/1 0/0 0/0
1 135098 Hapmap43437-BTA-101 T C . . PR GT 0/1 0/1 0/0
1 149772 BovineHD0100000042 G A . . PR GT 0/0 0/0 0/1
1 158820 BovineHD0100000048 G T . . PR GT 0/1 1/1 0/0
1 163995 BovineHD0100000051 T C . . PR GT 0/0 0/0 0/0
1 183040 BovineHD0100000057 T G . . PR GT 0/1 0/1 ./.
1 267940 ARS-BFGL-NGS-16466 T C . . PR GT 0/1 0/0 1/1
1 290690 BovineHD0100000082 A C . . PR GT 0/0 0/1 0/1
When I run the command:
java -Xss5m -Xmx4g -jar /exports/eddie/scratch/v1mmart8/Test/beagle.18May20.d20.jar gt=394.RMV-UNKNOW-SEXUAL-Filtered.vcf out=394_imputed impute=false
The output file that is created, 394_imputed.vcf.gz, is empty. The error file says: ERROR: REF field is not a sequence of A, C, T, G, or N characters at 1:78986953 [D]
I tried to include missing=./. in the command, but Beagle does not recognize it.
Any idea why it is not running? Many thanks!
I might be wrong, but I did not see a missing option in Beagle. Also, should your REF field contain missing genotypes? I think you have a malformed VCF file, or the ID field contains some special character that is causing that error. Can you find 1:78986953 in the VCF file and post it? The other issue I noticed is that your missing data is coded as . when I think it should be ./.
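If it helps to track down the offending record, here is a minimal Python sketch that lists every VCF record whose REF allele is not made up only of A, C, G, T, or N. The function names (open_vcf, find_bad_ref) are made up for illustration, and the case-insensitive pattern is an assumption based on the wording of the error message, not on Beagle's actual check. The [D] at the end of the message may be the offending REF value itself, but that is a guess until the record at 1:78986953 is inspected.

```python
import gzip
import re

# Assumption: a valid REF is one or more of A, C, G, T, N (case-insensitive).
VALID_REF = re.compile(r"^[ACGTN]+$", re.IGNORECASE)


def open_vcf(path):
    """Open a plain-text or gzip-compressed VCF in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_bad_ref(path):
    """Return a list of (chrom, pos, id, ref) for records with a suspicious REF."""
    bad = []
    with open_vcf(path) as handle:
        for line in handle:
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            # Only the first four columns (CHROM, POS, ID, REF) are needed.
            fields = line.split(None, 4)
            if len(fields) < 4:
                continue
            chrom, pos, var_id, ref = fields[:4]
            if not VALID_REF.match(ref):
                bad.append((chrom, int(pos), var_id, ref))
    return bad
```

Calling find_bad_ref("394.RMV-UNKNOW-SEXUAL-Filtered.vcf") should return the record at 1:78986953 along with any others that have the same problem, which can then be posted here or filtered out before running Beagle.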
Are you trying to impute 3 samples without a reference panel? I do not think this will work well at all. I would think something like 100-1000 samples might be okay. Without a reference panel, the more samples the better.
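To check both points quickly (how many samples the file really contains and how missing genotypes are encoded), a small Python sketch along these lines could be used. The function name summarize_genotypes is hypothetical, and the sketch assumes an uncompressed, tab-delimited VCF with a standard #CHROM header line and a GT entry in the FORMAT column.

```python
from collections import Counter


def summarize_genotypes(path):
    """Return (number of samples, Counter of raw GT strings) for a VCF file.

    Assumes an uncompressed, tab-delimited VCF with nine fixed columns
    before the sample columns.
    """
    n_samples = 0
    gt_counts = Counter()
    with open(path, "r") as handle:
        for line in handle:
            # Skip meta-information lines and blank lines.
            if line.startswith("##") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            if line.startswith("#CHROM"):
                # Everything after the nine fixed columns is a sample.
                n_samples = len(fields) - 9
                continue
            if len(fields) < 10:
                continue
            format_keys = fields[8].split(":")
            if "GT" not in format_keys:
                continue
            gt_index = format_keys.index("GT")
            for sample in fields[9:]:
                parts = sample.split(":")
                if gt_index < len(parts):
                    gt_counts[parts[gt_index]] += 1
    return n_samples, gt_counts
```

The returned counter shows at a glance whether missing calls appear as ./. or as a bare ., and the sample count tells you whether the file holds 3 samples or a few hundred.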