Hello, I'm trying to build a FASTA file that represents the genome of a sub-population. Currently, I have the initial reference genome (FASTA) and a VCF file with two individuals from the sub-population. How can I randomly substitute one of the genotype calls from the VCF into the reference genome at the corresponding position?
Thank you very much!
I know of GATK's FastaAlternateReferenceMaker, but I have been unable to find out how it decides which genotype to replace the reference with if multiple are listed. What I'm looking to do is randomly substitute one in (I'd also like to be able to just pick the most common one, but that is a separate question).
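For anyone landing here with the same question, the random substitution described above can be sketched in plain Python. This is only a rough illustration under several assumptions, not how GATK does it: it handles SNPs only (indels and symbolic alleles are skipped), expects the contig sequence to already be loaded as a string, and pools the called alleles of all samples at a site before drawing one uniformly at random. The function names `pick_allele` and `apply_snps` are made up for this example.

```python
import random

BASES = set("ACGTN")

def pick_allele(ref, alts, genotypes, rng):
    """Draw one allele at random from every called allele across samples."""
    alleles = [ref] + alts
    called = []
    for gt in genotypes:
        # GT looks like 0/1 or 1|1; "." marks a missing call
        for idx in gt.replace("|", "/").split("/"):
            if idx != ".":
                called.append(alleles[int(idx)])
    # With no usable calls, keep the reference base
    return rng.choice(called) if called else ref

def apply_snps(sequence, vcf_lines, chrom, seed=None):
    """Return a copy of `sequence` with one randomly chosen allele per SNP.

    Assumption: `sequence` is the reference for contig `chrom`, and
    `vcf_lines` are tab-separated VCF records with sample columns.
    """
    rng = random.Random(seed)
    seq = list(sequence)
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if fields[0] != chrom or len(fields) < 10:
            continue
        pos, ref, alts = int(fields[1]), fields[3].upper(), fields[4].upper().split(",")
        # Assumption: only simple SNPs are applied; everything else is skipped
        if ref not in BASES or any(a not in BASES for a in alts):
            continue
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_index = fmt.index("GT")
        genotypes = [sample.split(":")[gt_index] for sample in fields[9:]]
        # VCF positions are 1-based; guard against a mismatched reference
        if seq[pos - 1].upper() != ref:
            raise ValueError(f"REF mismatch at {chrom}:{pos}")
        seq[pos - 1] = pick_allele(ref, alts, genotypes, rng)
    return "".join(seq)
```

Passing a fixed `seed` makes the draw reproducible. Picking the most common allele instead would only require replacing the `rng.choice(called)` call with a count over `called`.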
Hello jautis,
I have the same task; could you explain how you did it? Specifically, how do I apply all the variants in a VCF file to the reference genome to create a sample genome?
I'm new to bioinformatics.
Thanks
I used the GATK method suggested by Ashutosh Pandey. It worked pretty well.