Are there any online services or Biopython scripts that calculate codon usage averaged over all sequences in a file, rather than individually for each sequence? I only know Python, Unix, and a little Perl.
The EMBOSS program cusp takes one or more nucleotide sequences as input and outputs codon usage data, looking like this (first few lines):
#CdsCount: 1
#Coding GC 67.79%
#1st letter GC 67.88%
#2nd letter GC 46.89%
#3rd letter GC 88.60%
#Codon AA Fraction Frequency Number
GCA A 0.077 7.772 3
GCC A 0.462 46.632 18
GCG A 0.462 46.632 18
GCT A 0.000 0.000 0
....
There are a number of EMBOSS servers if you want to run the analysis online.
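If you would rather stay in Python, here is a minimal sketch that pools codon counts over every sequence in a FASTA file and reports the same kind of columns as cusp (amino acid, fraction within that amino acid, frequency per 1000 codons, count). It uses only the standard library. The function names `read_fasta` and `pooled_codon_usage` are my own, not from Biopython or EMBOSS, and it assumes each record is a coding sequence that starts in frame and uses the standard genetic code. I have not checked it against cusp output.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def read_fasta(handle):
    """Yield each sequence in a FASTA file as one upper-case string."""
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line.upper())
    if chunks:
        yield "".join(chunks)


def pooled_codon_usage(sequences):
    """Count codons over all sequences together, not per sequence."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper().replace("U", "T")
        # Assumption: the sequence starts in frame; a trailing partial codon is dropped.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons containing N or other ambiguity codes
                counts[codon] += 1

    total = sum(counts.values())
    aa_totals = Counter()
    for codon, n in counts.items():
        aa_totals[CODON_TABLE[codon]] += n

    usage = {}
    for codon, aa in CODON_TABLE.items():
        n = counts[codon]
        fraction = n / aa_totals[aa] if aa_totals[aa] else 0.0
        per_thousand = 1000 * n / total if total else 0.0
        usage[codon] = (aa, fraction, per_thousand, n)
    return usage
```

To use it, open your file and pass the handle through `read_fasta`, for example `pooled_codon_usage(read_fasta(handle))`. The result is a dictionary keyed by codon, so `usage["GCC"]` gives the tuple for that codon.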
Not online, but CodonW does this.
For web services, you can find some in the BioCatalogue.
You can then run them with Taverna.
You could try the codon usage tool at http://www.bioinformatics.org/sms2/codon_usage.html. The page also links to a way of running the tool offline. I have no experience with this tool, but you may need to concatenate your individual sequences into one.
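One caveat with concatenating: if any sequence has a length that is not a multiple of three, every sequence joined after it is read out of frame. A small sketch of a safer join, assuming each input is a coding sequence that starts in frame (the function name `concatenate_in_frame` is just something I made up for illustration):

```python
def concatenate_in_frame(sequences):
    """Join coding sequences, trimming each one to whole codons first."""
    trimmed = []
    for seq in sequences:
        seq = "".join(seq.split()).upper()  # drop whitespace and line breaks
        usable = len(seq) - len(seq) % 3    # length of the whole-codon prefix
        trimmed.append(seq[:usable])
    return "".join(trimmed)
```

The combined string can then be pasted into a tool that expects a single sequence.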
http://www.bioinformatics.fr/bioinformatics.php?subsection=Codon%20usage
Leaving this here for future reference: following up on the concatenation approach, you could use the Bioconductor packages in R. Concatenate the sequences using Biostrings, then analyse codon usage with coRdon.
https://bioconductor.org/packages/release/bioc/html/Biostrings.html
https://bioconductor.org/packages/release/bioc/html/coRdon.html
It is not clear what is meant by "codon usage average". Perhaps you mean "frequency"?
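To make the distinction concrete, here is a rough sketch of the two readings I can think of: a pooled frequency (all codons in the file counted together) and a mean of per-sequence frequencies (each sequence weighted equally, whatever its length). The function names are hypothetical, and the code assumes in-frame sequences with at least one full codon each.

```python
from collections import Counter


def codon_counts(seq):
    """Count in-frame codons in one sequence (trailing partial codon dropped)."""
    seq = seq.upper()
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))


def pooled_frequency(sequences, codon):
    """Occurrences of `codon` divided by all codons in the file."""
    counts = [codon_counts(s) for s in sequences]
    total = sum(sum(c.values()) for c in counts)
    return sum(c[codon] for c in counts) / total


def mean_per_sequence_frequency(sequences, codon):
    """Average of the per-sequence frequencies; every sequence weighs the same."""
    freqs = []
    for s in sequences:
        c = codon_counts(s)
        freqs.append(c[codon] / sum(c.values()))
    return sum(freqs) / len(freqs)


seqs = ["GCCGCC", "GCCGCAGCAGCAGCAGCA"]
pooled = pooled_frequency(seqs, "GCC")               # 3 of 8 codons
averaged = mean_per_sequence_frequency(seqs, "GCC")  # mean of 2/2 and 1/6
```

The two numbers differ whenever sequence lengths differ, so it matters which one the question is after.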