Hi everyone,
I have two separate raw VCF datasets processed with GATK version 3.5 (one from a population of ~2600 and one from a population of ~160). Since the upstream data cleaning and processing were done elsewhere, I do not have access to the gVCF files. To combine these two populations, is it possible to simply merge the VCF files using existing tools instead of joint genotyping from gVCFs? Do you think this would introduce batch effects, and if so, how can I minimise them? I could rerun everything from scratch (from the BAM files), but that would take a lot of computational resources since these are whole-genome data. Feel free to contact me if anything in my question is unclear.
Sincerely,
I have edited your title to make it more specific about what you are asking, because "Genomics and VCF files" is meaningless.