Hi all, I have BLAST output for many sequences, with only one alignment per query (BLAST options: numalignments 1, numdescriptions 1), run against a Paramecium protein database. I want to count the substitutions for every amino acid with respect to the query sequence, i.e. how many times Gly is replaced by Val in the subject, and so on. From these counts I want to calculate log-odds scores for protein substitution levels between these 2 species. Are there tools that do this or make it easier, or do I have to write scripts myself (in Perl, which I know)?
If anybody wants to improve the question for clarity or accuracy, feel free to do so!
There is one tool that does something like that, but not exactly: MatGAT, an application that generates similarity/identity matrices using protein or DNA sequences. http://www.ncbi.nlm.nih.gov/pubmed/12854978
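If you end up scripting it, the counting step from the question is short. Below is a minimal Python sketch (not Perl, but the logic ports directly). It assumes you have already pulled the aligned query and subject strings, gaps included, out of each BLAST hit; one possible route is BLAST+ tabular output with the `qseq` and `sseq` columns. The function name `count_substitutions` and the example sequences are made up for illustration.

```python
from collections import Counter


def count_substitutions(aligned_pairs):
    """Count (query_residue, subject_residue) pairs over aligned sequences.

    aligned_pairs: iterable of (qseq, sseq) strings of equal length,
    with '-' marking gaps, as they appear in a pairwise alignment.
    """
    counts = Counter()
    for qseq, sseq in aligned_pairs:
        if len(qseq) != len(sseq):
            raise ValueError("aligned sequences must have equal length")
        for q, s in zip(qseq.upper(), sseq.upper()):
            if q == "-" or s == "-":
                continue  # skip gap columns, they are not substitutions
            counts[(q, s)] += 1
    return counts


# Hypothetical aligned query/subject strings, for illustration only
pairs = [("MGAVL-K", "MVAVLTK"), ("GGA", "VGA")]
counts = count_substitutions(pairs)
print(counts[("G", "V")])  # how many times Gly in the query aligns to Val in the subject
```

Identical pairs such as `("A", "A")` are kept in the table on purpose, since they are needed later as the background when turning counts into scores.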
thank you raghul
I want to calculate the substitution rate between two closely related species only. For example, I have translated protein sequences (partial and full CDS) from the transcriptome of Tetrahymena malaccensis. I want to calculate the amino acid substitution rate between Tetrahymena malaccensis and Tetrahymena thermophila, to spot amino acid changes that could have contributed to the thermophilic lifestyle of T. thermophila. I hope my explanation is OK. Thanks for the reply raghul
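For the log-odds part, one simple option, given a table of (query residue, subject residue) pair counts, is to compare the observed pair frequency with what the two species' residue frequencies alone would predict. The sketch below is a directional variant of the usual BLOSUM-style idea, score(a, b) = log2(f(a, b) / (p_query(a) * p_subject(b))). That formula choice, the function name `log_odds` and the example counts are assumptions for illustration, not an established pipeline for this data.

```python
import math
from collections import Counter


def log_odds(counts):
    """Return log2(observed / expected) for every observed residue pair.

    counts: dict mapping (query_residue, subject_residue) -> count.
    Expected frequency is the product of the query-side and
    subject-side marginal frequencies (independence assumption).
    """
    total = sum(counts.values())
    q_marg = Counter()  # how often each residue occurs on the query side
    s_marg = Counter()  # how often each residue occurs on the subject side
    for (q, s), n in counts.items():
        q_marg[q] += n
        s_marg[s] += n

    scores = {}
    for (q, s), n in counts.items():
        observed = n / total
        expected = (q_marg[q] / total) * (s_marg[s] / total)
        scores[(q, s)] = math.log2(observed / expected)
    return scores


# Made-up pair counts, for illustration only
example_counts = {("G", "G"): 6, ("G", "V"): 2, ("V", "V"): 7, ("V", "G"): 1}
scores = log_odds(example_counts)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 2))
```

A positive score means the pair is seen more often than chance, a negative one less often. Pairs that never occur get no score here; with real data you would probably want pseudocounts, and a lot of aligned positions, before reading anything into individual values.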
The above still applies: just use BLAST to get the orthologs.
OK, thanks. Can you point me to a publication that explains this reasonably clearly, so I can learn more?