9.5 years ago
biotech
I need to perform various analyses focused on a particular protein family, to reveal some of its evolutionary characteristics, and I would like to automate as much of this task as possible. The sequences are currently embedded in genomic data, so maybe I need to start by extracting them from the genomes?
I will work with 200 bacterial genomes. Also, since these target proteins contain repetitive sequences, some of them will have a gap. I hope this is not a big problem.
Thanks
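For the extraction step mentioned above, a minimal sketch might look like the following. It assumes each gene's coordinates are already known (for example from an annotation file), that coordinates are 1-based and inclusive, and that assembly gaps show up as runs of `N` in the nucleotide sequence. The function names `extract_region` and `gap_fraction` are hypothetical, chosen only for illustration.

```python
# Hypothetical helpers: pull a gene region out of a genome sequence and
# measure how much of it is unresolved (assumed to be encoded as 'N').

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def extract_region(genome_seq: str, start: int, end: int, strand: str = "+") -> str:
    """Return the region [start, end] (1-based, inclusive) on the given strand."""
    if not (1 <= start <= end <= len(genome_seq)):
        raise ValueError("coordinates fall outside the sequence")
    region = genome_seq[start - 1:end]
    if strand == "-":
        # Reverse-complement for genes annotated on the minus strand.
        region = region.translate(_COMPLEMENT)[::-1]
    return region


def gap_fraction(seq: str) -> float:
    """Fraction of positions that are 'N' (assumed gap marker)."""
    return seq.upper().count("N") / len(seq) if seq else 0.0
```

Regions with a high gap fraction could then be flagged or set aside before any downstream analysis, rather than silently dropped.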
What do you mean by "various analyses"?
Enough for a paper.
Domain annotations, dN/dS ratios, homology, phylogenies, and rearrangements, among others.
I thought about creating some kind of database of the target protein sequences and then running programs on it in a semi-automated fashion.
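As a rough illustration of that idea, here is a minimal sketch using only the Python standard library. It assumes the target proteins have already been exported as FASTA text, one file per genome, and that gapped sequences can be recognised by `X` or `-` characters. The table layout and the names `parse_fasta` and `load_proteins` are hypothetical.

```python
import sqlite3
from typing import Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Keep only the first whitespace-delimited token as the identifier.
            fields = line[1:].split()
            header, chunks = (fields[0] if fields else ""), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def load_proteins(conn: sqlite3.Connection, genome: str, fasta_text: str) -> int:
    """Store one genome's target proteins; return the number of records loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS proteins ("
        "genome TEXT, protein_id TEXT, sequence TEXT, has_gap INTEGER, "
        "PRIMARY KEY (genome, protein_id))"
    )
    rows = [
        # Assumption: 'X' or '-' marks an unresolved (gapped) stretch.
        (genome, pid, seq, int("X" in seq.upper() or "-" in seq))
        for pid, seq in parse_fasta(fasta_text)
    ]
    conn.executemany("INSERT OR REPLACE INTO proteins VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Each downstream program could then be driven from a query such as `SELECT protein_id, sequence FROM proteins WHERE has_gap = 0`, so the same set of sequences feeds every analysis.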