How do I choose a reference set of highly expressed genes for calculating the Codon Adaptation Index (CAI)? What if I do not have any experimental evidence of their expression levels, or what if I have a newly annotated genome? Is it still possible to compute CAI?
Why do you think that highly expressed genes in particular would make a good CAI reference? Personally, I would go for a highly conserved (between taxa) set of genes.
Thanks for the reply. Yes, that might be a good idea if I wanted to see how a coding sequence deviates between taxa. But in my case I want to see the similarities between coding sequences within the same organism/taxon, so as to find genes that are related in some way: for example, genes that are similarly expressed, belong to the same pathway, or work towards a defined biological objective.
The Codon Adaptation Index (CAI) was first introduced by Sharp and Li to measure synonymous codon usage bias for a DNA or RNA sequence. It measures the resemblance between the synonymous codon usage of a gene and the synonymous codon frequencies of a reference set. CAI was originally proposed to provide an estimate that can be compared across genes and species, and it ranges from 0 to 1.
If a gene always uses the most frequently used synonymous codon in the reference set, then CAI = 1. If a gene always uses the least frequently used synonymous codons, CAI takes its minimum value, which approaches 0 only when those codons are very rare in the reference set.
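To make the definition above concrete, here is a minimal Python sketch of the calculation: each codon gets a weight equal to its count in the reference set divided by the count of the most frequent synonymous codon, and CAI is the geometric mean of those weights over the gene. The function names (`relative_adaptiveness`, `cai`), the toy sequences, and the `pseudocount` parameter are my own illustrative assumptions, not part of any particular tool. It also assumes the standard genetic code and in-frame sequences.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (assumption: no alternative code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split an in-frame sequence into codons, ignoring a trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs, pseudocount=0.5):
    """Weight of each codon = its count / count of the top synonymous codon."""
    counts = Counter(
        c for s in reference_seqs for c in codons(s) if c in CODON_TABLE
    )
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    weights = {}
    for aa, family in families.items():
        # Stop codons and single-codon amino acids (Met, Trp) carry no bias.
        if aa == "*" or len(family) == 1:
            continue
        # Skip amino acids never seen in the reference set (illustrative choice).
        if sum(counts[c] for c in family) == 0:
            continue
        # Assumed workaround: replace zero counts so a single unseen codon
        # does not force CAI to exactly 0.
        fam_counts = {c: counts[c] or pseudocount for c in family}
        top = max(fam_counts.values())
        for c, n in fam_counts.items():
            weights[c] = n / top
    return weights


def cai(seq, weights):
    """Geometric mean of the codon weights over the scored codons of seq."""
    logs = [math.log(weights[c]) for c in codons(seq) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Toy reference set: Ala prefers GCT over GCC, Lys prefers AAA over AAG.
reference = ["GCTGCTGCTGCCAAAAAAAAG"]
w = relative_adaptiveness(reference)
best = cai("GCTAAA", w)   # only the preferred codons
worse = cai("GCCAAG", w)  # only less-preferred codons
```

In this toy case the gene made only of preferred codons scores 1, while the other scores below 1 but above 0, because even the less-preferred codons occur in the reference set.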
abdulwrs7 if you want to promote your website (given it has bioinformatics-related relevance) then please open a new question and choose type Blog. Adding the exact same answer to five old threads that already received (accepted) answers should not really be the way to go here.