Hi,
I work with non-model fish species and am trying to identify the genes under positive selection.
I assembled the transcripts de novo using Trinity,
removed redundant transcripts using CD-HIT,
and predicted ORFs using TransDecoder.
The longest ORFs were run through OrthoFinder to obtain the multiple sequence alignment (MSA) and the species tree (RAxML-NG).
Judging from the species tree and the MSA file, OrthoFinder seems to have aligned the orthologs shared among the nine non-model fish species.
To find the genes under positive selection, I converted the MSA.fa alignment file to PHYLIP format and ran it through codeml from the PAML package. It does not work for my data; I suppose this is because the dataset is large.
Error from codeml: 386273 nucleotides, not a multiple of 3!
I would appreciate some help. Is my approach correct?
phylip file:
9 386273
SRR363205 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLETSLAEH
SRR363207 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363206 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363205 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363202 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363201 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363204 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363203 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
SRR363205 MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
YRHHWFPHMP CKGSGYRCIR INHKMDPLIA RASNIIGLSS QQLFQLLPSE
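Looking at the file above, the residues are amino acids (for example MSPGVELIKM), while the codeml error counts nucleotides, and 386273 leaves a remainder of 2 when divided by 3. The name SRR363205 also appears on more than one row. A quick sanity check along these lines might help confirm this before running codeml. This is only a sketch: `inspect_alignment` and the character-set heuristic are my own illustration, not part of PAML or OrthoFinder, and the records are assumed to be already parsed into (name, sequence) pairs.

```python
from collections import Counter

# IUPAC nucleotide codes plus gap and missing-data symbols.
# Heuristic only: letters outside this set (E, F, I, L, P, Q, ...) suggest amino acids.
NUCLEOTIDE_CHARS = set("ACGTUNRYKMSWBDHV-?")


def inspect_alignment(records):
    """Summarise an alignment given as a list of (name, sequence) pairs.

    Hypothetical helper: reports duplicated names, sequence lengths,
    whether the residues look like protein, and whether every length
    is a multiple of 3 (as a codon alignment would require).
    """
    names = [name for name, _ in records]
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)

    lengths = set()
    non_nucleotide = set()
    for _, seq in records:
        cleaned = seq.replace(" ", "").upper()  # PHYLIP blocks are often space-separated
        lengths.add(len(cleaned))
        non_nucleotide |= set(cleaned) - NUCLEOTIDE_CHARS

    return {
        "duplicate_names": duplicates,
        "lengths": sorted(lengths),
        "looks_like_protein": bool(non_nucleotide),
        "multiple_of_3": all(length % 3 == 0 for length in lengths),
    }


if __name__ == "__main__":
    # First block of three rows copied from the PHYLIP file in this post
    rows = [
        ("SRR363205", "MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLETSLAEH"),
        ("SRR363207", "MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH"),
        ("SRR363205", "MSPGVELIKM KTEITTAVGF ITRLLRTTGL ISDEQLQHFS ESLEKSLAEH"),
    ]
    print(inspect_alignment(rows))
```

If `looks_like_protein` comes back true, the file is probably a protein alignment, which would explain why a codon-based run complains about the length.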
tree file:
((SRR363206:0.012416,SRR363205:0.069747):0.0036785,((SRR363205:0.013234,(SRR363205:0.00518,((SRR363203:0.00817,SRR363201:0.002449):0.000959(SRR363202:0.003255,SRR363204:0.003052):0.001105):0.002049):0.005519):0.005375,SRR363207:0.016243):0.0036785);
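Since codeml needs the tree labels to match the sequence names, a small check of the Newick string may also be worthwhile. The sketch below is a rough illustration under two assumptions: every leaf has a branch length (so its name is followed by `:`), and labels are unquoted. `check_newick` is a hypothetical helper, not a function from RAxML-NG or PAML.

```python
import re
from collections import Counter


def check_newick(newick):
    """Return (leaves, duplicated leaf names, positions of suspicious '(').

    Assumes unquoted labels and a branch length on every leaf.
    """
    text = newick.strip()
    # A leaf name directly follows '(' or ',' and is followed by ':'
    leaves = re.findall(r"[(,]\s*([^():,;\s]+)\s*:", text)
    duplicates = sorted(n for n, count in Counter(leaves).items() if count > 1)
    # In Newick, '(' may only open the tree or follow '(' or ',';
    # anything else (e.g. a digit) hints at a missing comma.
    bad_open = [m.start() for m in re.finditer(r"(?<=[^(,\s])\(", text)]
    return leaves, duplicates, bad_open


if __name__ == "__main__":
    # Toy tree with a repeated label and a missing comma before the second clade
    toy = "((A:0.1,B:0.2):0.05(A:0.1,C:0.3):0.02);"
    leaves, duplicates, bad_open = check_newick(toy)
    print("leaves:", leaves)
    print("duplicated:", duplicates)
    print("suspicious '(' at positions:", bad_open)
```

Any duplicated names or flagged positions would need to be sorted out in the tree file before codeml can use it.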
Suggestions please.
Thanks
Kevin
I realized belatedly that what I have is an MSA file of aligned orthologs. How do I convert it into PAML format for positive selection analysis?
Suggestions please.
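For the conversion itself, my understanding is that codeml's codon models need a codon (nucleotide) alignment rather than the protein MSA, so the aligned proteins would have to be mapped back onto their coding sequences. PAL2NAL is an existing tool for this. The sketch below only illustrates the idea, assuming each protein has a matching in-frame CDS (such as the CDS sequences TransDecoder writes alongside the peptides). `thread_codons` and `to_paml_phylip` are hypothetical helper names, not functions from any package.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def thread_codons(aligned_protein, cds):
    """Map an aligned protein (gaps as '-') back onto its unaligned CDS.

    Each residue consumes one codon; each gap becomes '---'.
    A single trailing stop codon in the CDS is dropped if present.
    """
    cds = cds.upper()
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) == 3 * n_residues + 3 and cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    if len(cds) != 3 * n_residues:
        raise ValueError(
            f"CDS length {len(cds)} does not match {n_residues} residues"
        )
    codons = []
    pos = 0
    for aa in aligned_protein:
        if aa == "-":
            codons.append("---")
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


def to_paml_phylip(codon_alignment):
    """Format {name: aligned codon sequence} as sequential PHYLIP text.

    Name and sequence are separated by two spaces.
    """
    lengths = {len(seq) for seq in codon_alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences differ in length")
    length = lengths.pop()
    if length % 3 != 0:
        raise ValueError("alignment length is not a multiple of 3")
    lines = [f"{len(codon_alignment)} {length}"]
    for name, seq in codon_alignment.items():
        lines.append(f"{name}  {seq}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up toy sequences, not taken from my data
    proteins = {"sp1": "M-SP", "sp2": "MGSP"}
    cds = {"sp1": "ATGTCACCGTAA", "sp2": "ATGGGGTCACCG"}
    codon_aln = {name: thread_codons(proteins[name], cds[name]) for name in proteins}
    print(to_paml_phylip(codon_aln), end="")
```

Because every residue becomes three nucleotides and every gap becomes `---`, the resulting alignment length is a multiple of 3 by construction.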