I have 220 nucleotide sequences. Each is around 7000 bp and should translate into a single protein of around 2100 amino acids. There are functions in R that convert a nucleotide sequence to an amino acid sequence, but the ones I have found, such as translate in seqinr or trans in ape, require a start and end position. My sequences don't share the same start position: for example, one sequence starts at position 650, another at 720, and so on. When I do this manually I use the Expasy online tool, which works pretty well. Is there any way to translate all these sequences programmatically in R? Is it possible to send my sequences to the Expasy web page from RStudio and retrieve the amino acid sequences? I am comfortable using R/Bioconductor, but I can use Biopython too if there is a way to do it with Biopython.
Thanks!
So, you have multiple sequences with different start points relative to one another? Do you need to perform a multiple sequence alignment before translation, or has this already been done?
I am going to do a selection pressure analysis, so I will have to do a multiple sequence alignment before I can run it.
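One way to avoid supplying a start position for each sequence is to look for the longest open reading frame (ORF) and translate that. Below is a minimal sketch in plain Python, under several assumptions that are not stated in the question: the protein is encoded on the forward strand, it starts with ATG and ends at a stop codon, the standard genetic code applies, and the longest such ORF is the one you want (for a ~2100 aa protein in a ~7000 bp sequence this is likely, but worth checking against a few sequences you translated by hand). The function names `read_fasta`, `translate` and `longest_orf` are made up for this example and are not taken from seqinr, ape or Biopython.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), in the canonical TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name, chunks = (line[1:].split() or [""])[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def translate(cds):
    """Translate a coding sequence codon by codon; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3))


def longest_orf(seq):
    """Return (start, end, protein) for the longest ATG...stop ORF on the forward strand.

    start is 0-based, end is exclusive and does not include the stop codon.
    ORFs that run off the end of the sequence without a stop codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    best = (0, 0, "")
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i  # first ATG after the previous stop opens an ORF
            elif CODON_TABLE.get(codon) == "*":
                if i - start > best[1] - best[0]:
                    best = (start, i, translate(seq[start:i]))
                start = None
    return best


if __name__ == "__main__":
    demo = "CCATGAAATTTGGGTAACC"
    print(longest_orf(demo))  # (2, 14, 'MKFG')
```

To run this over a file, something like `for name, seq in read_fasta(open("sequences.fasta")): print(name, longest_orf(seq))` should work, where `sequences.fasta` is a placeholder file name. Since the goal is a selection pressure analysis, the nucleotide slice `seq[start:end]` may be as useful as the protein itself, because a codon-aware alignment needs the coding sequences in frame.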