I need to create a FASTA file containing only the CDS for each sample that I have sequenced (NGS) and genotyped with GATK. I've used GATK FastaAlternateReferenceMaker, then BEDTools with the .gff to pull out all the exons (or CDSs), but this does not join the coding sequences together for each gene. Also, FastaAlternateReferenceMaker outputs a FASTA with chromosome names listed as chr1, etc. (i.e. not matching the names in the .gff). My genome has many contigs, and it is time-consuming to change these by hand. Is there a better way to do this? Are there any tools that go from a VCF file to a per-sample FASTA file containing the CDS for each gene?
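To make the goal concrete, here is a minimal Python sketch of the two steps I'm missing: renaming the contigs and joining the CDS segments of each gene. It is only an illustration under several assumptions: the function names and the `rename` mapping are hypothetical, the .gff is assumed to be GFF3 with a `Parent=` attribute on each CDS line, and the per-sample FASTA is assumed to contain SNPs only, since indels would shift coordinates relative to the .gff.

```python
from collections import defaultdict

# Translation table for complementing DNA (anything else is left unchanged).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_fasta(lines, rename=None):
    """Read FASTA lines into {contig: sequence}.

    `rename` is a hypothetical mapping from FASTA header names
    (e.g. 'chr1') to the contig names used in the .gff.
    """
    rename = rename or {}
    chunks, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]       # first word of the header
            name = rename.get(name, name)    # rename if a mapping exists
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {contig: "".join(parts) for contig, parts in chunks.items()}


def cds_per_parent(gff_lines, genome):
    """Join all CDS segments that share a Parent into one coding sequence."""
    segments = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        parent = attrs.get("Parent")
        if parent is None:
            continue
        # GFF coordinates are 1-based and inclusive.
        segments[parent].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))

    joined = {}
    for parent, segs in segments.items():
        segs.sort(key=lambda s: s[1])  # ascending genomic order
        seq = "".join(genome[c][start - 1:end] for c, start, end, _ in segs)
        # For minus-strand genes, reverse-complement the joined sequence.
        joined[parent] = reverse_complement(seq) if segs[0][3] == "-" else seq
    return joined
```

Called once per sample, `parse_fasta(open(sample_fasta), rename=name_map)` followed by `cds_per_parent(open(gff), genome)` would give one joined coding sequence per transcript ID, which could then be written out as FASTA.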
I eventually need to feed this FASTA into PAML so that I can calculate dN/dS for each gene. If there's a better way to do that as well, please let me know.
Thanks!