I have DNA sequencing data from two highly inbred genetic lines of chicken, so I expect them to be almost entirely homozygous. I'm interested in getting a VCF file of the SNPs that differ between these two lines. It seems like there are two ways I can do this using GATK:
1) Run GATK twice, once on the data from each line, then look for SNPs relative to the reference genome that appear in one line but not the other, or SNPs that appear in both but have different alternate bases.
2) Run GATK once with both sets of data, then keep only the SNPs with an allele frequency of 0.5, or with an allele frequency of 1.0 and two different alternate alleles.
Are these two methods equivalent, or would one produce more accurate results than the other? Is there a more direct way to call SNPs between two genetic lines, where it doesn't call SNPs relative to the reference genome?
If you already have VCF files, vcf-compare may be worth a try. It reports the matches and mismatches per sample. You can then filter SNPs based on their overlap with the reference and with the other set. You can use vcftools to filter SNPs from VCF files.
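For the joint-calling route (option 2 in the question), the filter can also be written directly on genotypes rather than on allele frequency. Below is a minimal Python sketch, not a GATK or vcftools feature: it reads the lines of a two-sample VCF and keeps biallelic or multiallelic SNP sites where both samples are called homozygous but for different alleles. The function names, the sample names `lineA` and `lineB`, and the example records are all hypothetical and only there for illustration.

```python
def parse_gt(sample_field, fmt_keys):
    """Return the set of allele indices in the GT entry, or None if any is missing."""
    values = dict(zip(fmt_keys, sample_field.split(":")))
    alleles = values.get("GT", ".").replace("|", "/").split("/")
    if "." in alleles:
        return None
    return set(alleles)


def differing_snps(vcf_lines, sample_a, sample_b):
    """List (chrom, pos, allele_a, allele_b) for SNP sites where both samples
    are homozygous and carry different alleles."""
    bases = {"A", "C", "G", "T"}
    col = {}
    results = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Header line: remember which column each sample lives in.
            col = {name: i for i, name in enumerate(fields)}
            continue
        alleles = [fields[3]] + fields[4].split(",")
        if not all(a in bases for a in alleles):
            continue  # not a plain SNP (indel, '*', symbolic allele, no ALT)
        fmt_keys = fields[8].split(":")
        gt_a = parse_gt(fields[col[sample_a]], fmt_keys)
        gt_b = parse_gt(fields[col[sample_b]], fmt_keys)
        if gt_a is None or gt_b is None:
            continue  # no call in at least one line
        if len(gt_a) == 1 and len(gt_b) == 1 and gt_a != gt_b:
            idx_a, idx_b = int(next(iter(gt_a))), int(next(iter(gt_b)))
            results.append((fields[0], int(fields[1]), alleles[idx_a], alleles[idx_b]))
    return results


# Hypothetical two-sample records, built with explicit tabs.
HEADER = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT", "lineA", "lineB"]
RECORDS = [
    ["chr1", "100", ".", "A", "G", "50", "PASS", ".", "GT", "0/0", "1/1"],
    ["chr1", "200", ".", "C", "T", "50", "PASS", ".", "GT", "1/1", "1/1"],
    ["chr1", "300", ".", "G", "A,T", "50", "PASS", ".", "GT", "1/1", "2/2"],
    ["chr1", "400", ".", "T", "C", "50", "PASS", ".", "GT", "0/1", "1/1"],
    ["chr1", "500", ".", "A", "AT", "50", "PASS", ".", "GT", "0/0", "1/1"],
    ["chr1", "600", ".", "C", "G", "50", "PASS", ".", "GT", "./.", "1/1"],
]
EXAMPLE_VCF = ["##fileformat=VCFv4.2", "\t".join(HEADER)] + ["\t".join(r) for r in RECORDS]
```

Calling `differing_snps(EXAMPLE_VCF, "lineA", "lineB")` on the example keeps only the sites at positions 100 and 300. Comparing genotypes this way treats "reference in one line, alternate in the other" and "two different alternate alleles" as the same case, and it skips sites where either line is heterozygous or has no call, which may or may not be what you want for your data.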