Hi all,
I am trying to run an association study using SNPs called from an RNA-seq experiment. My study system is Rattus norvegicus.
I called SNPs separately in the case samples and the control samples and then merged the two VCF files using vcf-merge. After merging, I noticed that if a SNP is present in 'case' but absent in 'control', the control genotype comes out as 00 in the .ped file generated from the merged VCF. In some cases 00 is a truly missing value, and in others it stands for the reference allele. As a result, I have a LOT of 00 entries in my .ped file, which is messing up my association test.
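To make the problem concrete, here is a minimal Python sketch of how the 00 calls could be counted per SNP. It assumes the standard PLINK .ped layout (six leading columns, then two allele columns per SNP) and the default missing code 0. The function name and the toy rows are made up for illustration and are not taken from my data.

```python
def count_missing_per_snp(ped_lines, missing_code="0"):
    """Count, for each SNP, how many samples have both alleles set to the missing code.

    Assumes whitespace-delimited PLINK .ped rows: FID, IID, PAT, MAT, SEX, PHENO,
    followed by two allele columns per SNP.
    """
    counts = None
    for line in ped_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        alleles = fields[6:]
        if len(alleles) % 2 != 0:
            raise ValueError("odd number of allele columns in row: " + fields[1])
        n_snps = len(alleles) // 2
        if counts is None:
            counts = [0] * n_snps
        elif len(counts) != n_snps:
            raise ValueError("rows have different numbers of SNPs")
        for i in range(n_snps):
            a1, a2 = alleles[2 * i], alleles[2 * i + 1]
            if a1 == missing_code and a2 == missing_code:
                counts[i] += 1
    return counts if counts is not None else []


# Toy example (hypothetical rows): one case, one control, two SNPs.
toy_ped = [
    "F1 case1 0 0 1 2 A G C C",
    "F2 ctrl1 0 0 1 1 0 0 C C",
]
print(count_missing_per_snp(toy_ped))
```

With real data the rows would come from the .ped file itself, for example `count_missing_per_snp(open("merged.ped"))`, where `merged.ped` is a placeholder file name. SNPs where every control sample is counted as missing are the ones I am worried about.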
I understand that one can use --merge with --merge-mode 5 in plink, but for my study system I don't have a reference VCF with genotypes, which means I cannot impute using a reference panel. I tried imputation with Beagle, which does not require a reference panel, but my sample size (4 cases and 4 controls) is far too small for that to work properly.
Do you have any suggestions on how I could handle this imputation problem for my samples? Any help is appreciated.
Thank you