Hi! I have problems generating the kinship matrix from my vcf with emmax. I used this line of code, which worked with a different vcf before just fine:
emmax-kin -v -d 10 ${emmaxInput}
However, with my current vcf, all I get is a kinship matrix full of -nan entries. The vcf has a proper header and has been filtered to maf=1% (I've also tried 5, 10 and 20%) and a call rate of 100%. It also contains proper SNP names (e.g. chr;12345). The tped and tfam from the vcf look no different, and I've already compared the structure of my pipeline output from the working vcf with that of the vcf that causes this problem. I can't figure out what's happening here.
I have >1200 samples from RAD-Seq. Both vcfs (working and not-working) have been generated from the same raw data, but were aligned to different reference genomes (can't imagine this is causing the problem).
Do you know what can cause NaN entries in the kinship matrix besides monomorphic SNPs and a low call rate? Many thanks!
EDIT: It looks like the problem arises when the chromosome names contain non-numeric characters. I changed the names to numbers only and the -nan entries disappeared. I get a proper-looking kinship matrix now. But I can't tell why the names cause such problems in the first place (aren't they just names??)
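For anyone hitting the same thing, here is a rough sketch of how the renaming could be done directly on the tped. It is not taken from the EMMAX documentation. It assumes the chromosome name is the first whitespace-separated column of the tped (the usual PLINK layout) and that only this column needs to change; the SNP names in column 2 are left as they are. The function name renumber_tped_chroms and the mapping file are made up for illustration.

```shell
# Hypothetical helper (not part of EMMAX or PLINK): replace every chromosome
# name in column 1 of a tped with an integer, in order of first appearance.
#   $1 = input .tped
#   $2 = output .tped
#   $3 = mapping file to write (old name <TAB> new number)
renumber_tped_chroms() {
    awk -v map="$3" '
        # First time a chromosome name is seen: assign the next integer
        # and record the pair in the mapping file.
        !($1 in id) { id[$1] = ++n; print $1 "\t" n > (map) }
        # Overwrite column 1 with the integer; awk rewrites the line
        # with single spaces between fields.
        { $1 = id[$1]; print }
    ' "$1" > "$2"
}
```

Usage would be something like renumber_tped_chroms old.tped new.tped chrom_map.tsv, then copying old.tfam to new.tfam and rerunning emmax-kin -v -d 10 new (assuming ${emmaxInput} is the tped/tfam prefix). The mapping file lets you translate the numbers back to the original names afterwards.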
Hi, thank you very much for your last solution. It works. No information about this in the documentation, and no idea why it happens.
Hi! Thanks for sharing your solution! It works! :)