Hi members,
I have a VCF file (from GATK) containing variants for a total of 20 individuals, and I'm wondering how to get a consensus sequence for each individual that reflects its own polymorphisms. Some individuals may not show a polymorphism at a particular position in a contig whereas others may. I've checked the dedicated GATK tool (FastaAlternateReferenceMaker), but it doesn't answer my question because it generates only one consensus. What I need is as many output files (each containing one consensus sequence) as there are mapped individuals.
Has any of you faced a similar question?
Thanks for your reply, Best, C.
What do you mean by "consensus sequence"? How is it different from the REF/ALT columns? Can you show us a few rows of your VCF?
By consensus sequence, I mean that I would like to obtain one sequence per individual: the reference sequence used for mapping, with all of that individual's variant sites incorporated into it. Maybe the use of 'consensus' is confusing.
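To make the expected output concrete, here is a toy Python sketch of the idea, not a solution for real data. It assumes SNPs only, already-parsed records, and that a heterozygous call takes the ALT allele (an IUPAC ambiguity code would be another choice). The function name `per_sample_sequences` and the sample names `ind1` and `ind2` are made up for illustration.

```python
def per_sample_sequences(reference, variants, samples):
    """Return {sample: sequence}, substituting each sample's ALT alleles.

    reference: reference sequence of one contig (str)
    variants:  list of (pos, ref, alts, genotypes), where pos is 1-based,
               alts is a list of ALT alleles and genotypes maps a sample
               name to a GT string such as "0/1"
    Assumptions (illustration only): SNPs only; missing calls keep the
    reference base; heterozygous calls take the first non-reference allele.
    """
    seqs = {sample: list(reference) for sample in samples}
    for pos, ref, alts, genotypes in variants:
        for sample in samples:
            gt = genotypes.get(sample, "./.")
            # Split phased ("|") or unphased ("/") genotypes, skip missing "."
            called = [int(a) for a in gt.replace("|", "/").split("/") if a.isdigit()]
            alt_index = next((a for a in called if a > 0), None)
            if alt_index is None:
                continue  # homozygous reference or no call
            alt = alts[alt_index - 1]
            if len(ref) == 1 and len(alt) == 1:  # indels are ignored here
                seqs[sample][pos - 1] = alt
    return {sample: "".join(bases) for sample, bases in seqs.items()}


# Toy input: one 8 bp contig, two variant sites, two individuals
reference = "ACGTACGT"
variants = [
    (2, "C", ["T"], {"ind1": "0/0", "ind2": "0/1"}),
    (5, "A", ["G", "C"], {"ind1": "1/1", "ind2": "./."}),
]
result = per_sample_sequences(reference, variants, ["ind1", "ind2"])
for name, seq in result.items():
    print(f">{name}\n{seq}")
```

So each individual ends up with its own FASTA record, identical to the reference except at the sites where that individual carries a variant.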
Here is a subset of my VCF, showing the first two variant sites for 20 individuals: