Dear community, I am doing a whole-genome phylogenomic analysis of a diploid fungus. As a reference I used only the primary contigs of a phased assembly (treating them as a collapsed assembly), and I aligned whole-genome sequencing reads from more than 150 isolates to it. Even after extensive variant filtering, the number of variants across all 150 isolates is still too high to handle. I am therefore thinking of keeping only the variants in genic regions for further analysis, and I have the following questions:
Would it be okay to use only the variants in genic regions for further analysis?
If it is okay to use genic regions for population genetic analysis, could someone suggest how best to extract the genic variants (with the help of a GFF file, I guess) from the multi-sample VCF file that I generated with freebayes? Thanks
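
To make the extraction step concrete, here is a minimal sketch in Python of what I have in mind. It assumes a tab-separated GFF3 file whose gene records have `gene` in the third column, and an uncompressed VCF whose contig names match the GFF exactly. The function names (`load_gene_intervals`, `overlaps_gene`, `filter_vcf`) are my own and hypothetical, not taken from any library. Dedicated tools such as `bedtools intersect` or `bcftools view` with a regions file are probably the more robust route; this is only meant to illustrate the logic.

```python
import bisect
from collections import defaultdict


def load_gene_intervals(gff_lines, feature="gene"):
    """Return {contig: [(start, end), ...]} of merged, sorted gene spans.

    Coordinates are 1-based and inclusive, as in GFF3.
    Assumption: genes are labelled `gene` in column 3 of the GFF.
    """
    raw = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue  # keep only well-formed records of the wanted type
        raw[cols[0]].append((int(cols[3]), int(cols[4])))

    merged = {}
    for contig, spans in raw.items():
        spans.sort()
        out = [list(spans[0])]
        for start, end in spans[1:]:
            if start <= out[-1][1] + 1:  # overlapping or directly adjacent
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[contig] = [tuple(span) for span in out]
    return merged


def overlaps_gene(intervals, contig, pos, ref_len=1):
    """True if the variant span [pos, pos + ref_len - 1] touches a gene."""
    spans = intervals.get(contig)
    if not spans:
        return False
    var_end = pos + ref_len - 1
    # Index just past the last interval whose start is <= var_end.
    i = bisect.bisect_right(spans, (var_end, float("inf")))
    return i > 0 and spans[i - 1][1] >= pos


def filter_vcf(vcf_lines, intervals):
    """Yield header lines unchanged, plus records that overlap a gene."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        cols = line.split("\t", 4)  # CHROM, POS, ID, REF, rest of line
        if len(cols) < 5:
            continue  # skip malformed or empty lines
        if overlaps_gene(intervals, cols[0], int(cols[1]), len(cols[3])):
            yield line
```

To use it, I would open the GFF and pass the file handle to `load_gene_intervals`, then write every line yielded by `filter_vcf(open_vcf_handle, intervals)` to a new file. Because the REF allele length is taken into account, indels that start just upstream of a gene but extend into it are kept as well.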