Question

Pairwise alignment and Ka/Ks computation

15 months ago

maxime.policarpo ▴ 200

Dear all,

I have a multi-fasta file which consists of 1,000 coding sequences. I would like to compute Ka/Ks between each pair of sequences.

For the moment, I am performing a multiple sequence alignment at the protein level using MUSCLE v5 followed by trimming and back-translating the protein alignment into a coding sequence alignment using trimAl. I can then use seqinr (R package) to compute all possible pairwise Ka and Ks values.

I would now like to perform the same kind of analysis, but with pairwise sequence alignments instead of a multiple sequence alignment. I could use a for or while loop in bash, calling MUSCLE and seqinr at each iteration, but with 1,000 sequences in the file this would mean 499,500 pairwise alignments, each followed by a Ka/Ks computation ...

Furthermore, I actually want to repeat this for many different genes, i.e. for many different multi-fasta files, each containing 1,000 sequences and sometimes even more. The largest multi-fasta file I have contains 9,000 sequences (which means 40,495,500 pairwise comparisons).
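
To make the scale concrete, the pair counts above follow from n(n-1)/2 unordered pairs for n sequences. Here is a minimal Python sketch of that arithmetic and of how the pairs could be enumerated once each; the function names `n_pairs` and `iter_pairs` are only illustrative and do not come from MUSCLE, trimAl or seqinr.

```python
from itertools import combinations
from math import comb


def n_pairs(n_sequences: int) -> int:
    """Number of unordered pairs among n sequences, i.e. n * (n - 1) / 2."""
    return comb(n_sequences, 2)


def iter_pairs(seq_ids):
    """Yield each unordered pair of sequence identifiers exactly once."""
    return combinations(seq_ids, 2)


if __name__ == "__main__":
    # Counts for the two file sizes mentioned in the post.
    print(n_pairs(1000))  # 499500
    print(n_pairs(9000))  # 40495500

    # Hypothetical identifiers, only to show the enumeration order.
    for a, b in iter_pairs(["seq1", "seq2", "seq3"]):
        print(a, b)
```

Since the number of pairs grows quadratically, going from 1,000 to 9,000 sequences multiplies the work by roughly 81, which is why a plain loop over all pairs becomes impractical.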

Does anyone have an idea of how I could achieve that, or of another method to perform such pairwise alignments and Ka/Ks calculations very rapidly?

Thanks for any help!

All the best,

Maxime Policarpo

Pairwise-Alignment Genomics Ka-Ks • 350 views

updated 15 months ago by Ram 45k • written 15 months ago by maxime.policarpo ▴ 200