I have a set of homologous gene pairs from two complete bacterial genomes. I also have the whole chromosome and GFF files of gene locations. I want to find the coordinates for non-synonymous SNPs in each gene relative to its homolog.
Is there a tool where I can feed the gene pairs or even better chromosome+GFF and get non-synonymous SNP positions back?
This is simple enough to script using blast or clustal but checking just in case there is something out there. I was not able to find a BioPerl module or tool for this but maybe I missed something obvious. Thanks much!
Thank you for the reply. Methods suggested work off a read mapping input file, like most tools available today. What I have are two complete genes that need to be compared for non-synonymous SNPs.
ok but what is the format of you data then? A fully assembled sequence?
Crazy idea: what you could do is simulate error and mutation free reads from one genome, align against the other genome. Generate long reads that will not be misaligned, now you have an alignment file, for read generator tools see
wgsim
for example: https://github.com/lh3/wgsimI have the complete nucleotide sequence of genes (avg 840 bp) for both genomes from Refseq. I also have the ids of orthologous gene pairs from both genomes, if that helps.
That crazy idea crossed my mind too :-) But thats a last resort. I was thinking of using a simple blast to compare the pair and pull out SNPs. But that I have to use codon modules to figure out the type of SNP so that's not straightforward too
I see, I don't know of a tool that would do the work unfortunately.
Thanks anyways. Other folks I have spoken to have been stumped too.