Hello,
I have posted several times about a genome assembly project, which we are now trying to improve. The ultimate goal of the project, however, is to look for evidence of selection in genes from our assembly by comparing them with orthologs from closely related species (e.g. the chicken genome). Although I am not at that point yet, I am trying to research how this is done, since I have little scripting experience.
In detail, I would like to align (using MUSCLE) each annotated gene in our genome with its corresponding ortholog from a reference genome, and then calculate a dN/dS ratio for each ortholog pair in PAML. I know how to do this by hand for individual pairs of orthologs, but I will most likely have thousands of annotated genes, each of which must be aligned with its corresponding chicken gene and the alignment then passed on for dN/dS calculation.
I assume one needs to write custom Perl or Python scripts that use for loops to automate this process, but I am wondering whether there is any existing software that would make this easier for someone with limited Perl/Python experience. I would be happy to learn Perl or Python, and I have played with Biopython to a limited degree, but I am having trouble finding good example scripts to get me started. Most of the comparative genomics literature I have read does not discuss the development of scripts, only the software used.
Thanks very much,
Zach Gayk
I don't know whether it would make things easier, and I don't use BioPerl much myself, but BioPerl has interfaces for running and processing alignments and for dN/dS calculation. Once you get past the initial BioPerl learning curve, it should be easy to continue.
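Since Python was mentioned as an option, here is a rough sketch of what the "for loop over thousands of ortholog pairs" could look like using only the standard library. It is just one possible layout, not a tested pipeline. The file names, the two-column tab-separated ortholog table, and all function names are hypothetical. The MUSCLE command assumes the v3-style `-in`/`-out` options (MUSCLE v5 uses `-align`/`-output` instead). The PAML step is left out, since how you set it up depends on your analysis.

```python
import subprocess
from pathlib import Path


def read_fasta(path):
    """Parse a FASTA file into {sequence_id: sequence} (ID = first word of the header)."""
    records, current = {}, None
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def read_ortholog_pairs(path):
    """Read a hypothetical table: our_gene_id <TAB> chicken_gene_id, '#' lines ignored."""
    pairs = []
    for line in Path(path).read_text().splitlines():
        if line.strip() and not line.startswith("#"):
            ours, ref = line.split("\t")[:2]
            pairs.append((ours.strip(), ref.strip()))
    return pairs


def write_pair_fastas(pairs, our_seqs, ref_seqs, out_dir):
    """Write one two-sequence FASTA per ortholog pair; skip pairs with a missing sequence."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for ours, ref in pairs:
        if ours not in our_seqs or ref not in ref_seqs:
            continue
        fasta = out_dir / f"{ours}__{ref}.fa"
        fasta.write_text(f">{ours}\n{our_seqs[ours]}\n>{ref}\n{ref_seqs[ref]}\n")
        written.append(fasta)
    return written


def muscle_command(in_fasta, out_fasta):
    """Build a MUSCLE v3-style command line (assumption; v5 uses -align/-output)."""
    return ["muscle", "-in", str(in_fasta), "-out", str(out_fasta)]


def align_all(pair_fastas):
    """Run MUSCLE once per pair file; requires the muscle executable on PATH."""
    aligned_files = []
    for fasta in pair_fastas:
        aligned = fasta.with_name(fasta.stem + ".aln.fa")
        subprocess.run(muscle_command(fasta, aligned), check=True)
        aligned_files.append(aligned)
    return aligned_files
```

With (hypothetical) inputs `our_genes.fa`, `chicken_genes.fa` and `orthologs.tsv`, you would call `read_fasta` on the two FASTA files, `read_ortholog_pairs` on the table, pass the results to `write_pair_fastas`, and hand the returned list to `align_all`. Each resulting `.aln.fa` file is then the input for whatever PAML setup you already use by hand. Gene IDs containing characters that are not valid in file names would need to be cleaned up first.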