Hi everyone,
I am trying to calculate genome-wide dN/dS ratios between two closely related rat species using PAML. I have genome coordinates for the transcript orthologs in the two species. However, I cannot figure out how to choose the reading frame for translating the sequences into amino acid sequences. Currently, I translate in reading frame 1 for all genes on the forward strand and in reading frame 6 for all genes on the reverse strand. I know this is a gross approximation. Can anyone give me any ideas on how I could tackle this issue? Where can I find information on the transcript-specific reading frame? I am ignoring the problem of alternative splicing for now.
Without the reading-frame information, I am ending up with biased dN/dS ratios. Please help me. Thank you.
Do you just want the sequences of all the transcripts first?
I have the nucleotide sequences of all transcripts. I can't decide which reading frame to translate each one into. Should I select the longest ORF for each transcript?
It is typical to take the longest ORF. But here is a relevant study.
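For what it's worth, here is a minimal sketch of the longest-ORF approach in plain Python. It is only an illustration under some assumptions: the function names (`reverse_complement`, `longest_orf`) are made up for this example, an ORF is taken to mean ATG through the first in-frame stop codon of the standard genetic code, and the reported frame is a 0-based offset (0, 1 or 2). If your transcript sequences are already in the sense orientation, the three forward frames should be enough; `both_strands=True` also scans the reverse complement.

```python
# Illustrative sketch only: longest ATG..stop ORF over 3 (or 6) frames.
STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons (assumption)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq, both_strands=False):
    """Return (orf, strand, frame) for the longest ATG..stop ORF.

    frame is the 0-based offset (0, 1 or 2) on the scanned strand.
    Returns ("", None, None) if no complete ORF is found.
    """
    seq = seq.upper()
    candidates = [("+", seq)]
    if both_strands:
        candidates.append(("-", reverse_complement(seq)))

    best = ("", None, None)
    for strand, s in candidates:
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    orf = s[start:i + 3]
                    if len(orf) > len(best[0]):
                        best = (orf, strand, frame)
                    start = None
    return best


if __name__ == "__main__":
    orf, strand, frame = longest_orf("CCATGAAATTTGGGTAACC")
    print(orf, strand, frame)
```

Keep in mind the longest ORF is a heuristic and will sometimes pick the wrong frame, for example with partial transcripts that lack a start or stop codon. If a gene annotation is available for your species, the CDS features in a GTF/GFF file carry a frame/phase column, which is likely a more reliable source of the transcript-specific reading frame than guessing from the sequence.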