Hello,
I am trying to write a new fasta of protein sequences which contains all of the SNVs I have identified from WGS for one strain. I know roughly how to get there, but not the exact tools available.
So far, I have:
- Created a new gene fasta using GATK's FastaAlternateReferenceMaker, with the -L option so that only the genes are written into the fasta.
I know I can use Biopython to convert this DNA fasta to an AA fasta, but all of the genes on the reverse strand are reverse complemented in the new gene fasta. Is there a way either to change the negative-strand genes to their reverse complement, or to tell a program which strand each gene is on when it translates the sequences? I could use the BED file as a reference/dictionary.
Thanks!
Does the fasta file made by FastaAlternateReferenceMaker provide any information about which strand each gene was extracted from? E.g. are there numbers in the '>' line that make it obvious which strand was used? Can you provide an example of this fasta file?
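If the strand is not in the header, the BED-as-dictionary idea from the question could look roughly like the sketch below. It is only a sketch under assumptions that should be checked against the real files: that the BED file has six tab-separated columns with the strand in column 6, and that each '>' line contains the interval as `chrom:start-end` with a 1-based start (BED starts are 0-based, so 1 is added before matching). The function names are made up for this example. It uses plain Python so it runs without extra packages; with Biopython, `Seq(dna).reverse_complement().translate()` should do the same job as the two helper functions.

```python
import re

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

# Assumption: the header contains the interval as chrom:start-end (1-based).
INTERVAL = re.compile(r"(\S+):(\d+)-(\d+)")


def reverse_complement(dna):
    """Complement every base, then reverse the string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate in frame 1; codons with ambiguous bases become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def read_bed_strands(bed_lines):
    """Map (chrom, 1-based start, end) to strand, assuming BED6 columns."""
    strands = {}
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        strands[(fields[0], int(fields[1]) + 1, int(fields[2]))] = fields[5]
    return strands


def read_fasta(fasta_lines):
    """Yield (header, sequence) pairs from FASTA lines."""
    header, chunks = None, []
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def translate_genes(fasta_lines, bed_lines):
    """Yield (header, protein), reverse complementing '-' strand genes first."""
    strands = read_bed_strands(bed_lines)
    for header, dna in read_fasta(fasta_lines):
        match = INTERVAL.search(header)
        if match is None:
            raise ValueError("no chrom:start-end interval in header: " + header)
        key = (match.group(1), int(match.group(2)), int(match.group(3)))
        if strands[key] == "-":
            dna = reverse_complement(dna)
        yield header, translate(dna)
```

Both arguments can be open file handles, e.g. `translate_genes(open("genes.fasta"), open("genes.bed"))` (file names are placeholders). A `KeyError` on the strand lookup means the header interval did not match any BED line, which is the first thing to check if the coordinate assumption above is wrong.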