3.3 years ago
Gautier
Hi,
I'm trying to plot sequences of uneven length using ggseqlogo in RStudio. However, the data argument requires all sequences to have the same width:
Error in letterMatrix(seqs) : Sequences in alignment must have identical lengths
How can I use ggseqlogo and add a blank at positions where a sequence has no more amino acids?
Here is some of my data:
CASSLRGQGVEKLFF
CASLSQGTEAFF
CASSVGPGQTEAFF
CATSLGQSTDTQYF
CASSQDRGNSPLHF
CASSLDLRVNTEAFF
CASSQDLRVATEAFF
CASSPDREQYF
CASFGGPRTTEAFF
CASSVFYDSGANVLTF
CSARIPGTSGAYGYTF
CASSLRGQGVEKLFF
Thank you for your help!
What about introducing a gap character?
I have never tried making a logo from an alignment, so it's just a thought.
Like adding gap characters until the maximum length is reached? I thought about that, but it would get in the way of my downstream analysis...
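The padding idea discussed above could be sketched as follows. This is a minimal illustration, not a recommendation over a real alignment: the variable names `seqs` and `padded` are made up for the example, only a few of the sequences from the question are used, and it assumes that ggseqlogo with `seq_type = "aa"` simply does not draw characters outside the amino-acid alphabet, such as `-`. Keeping the padded copy separate from `seqs` leaves the original sequences untouched for further analysis.

```r
# A few of the sequences from the question (uneven lengths)
seqs <- c("CASSLRGQGVEKLFF", "CASLSQGTEAFF", "CASSVGPGQTEAFF",
          "CASSPDREQYF", "CASSVFYDSGANVLTF")

# Right-pad every sequence with "-" up to the length of the longest one
max_len <- max(nchar(seqs))
padded  <- paste0(seqs, strrep("-", max_len - nchar(seqs)))

# Plot only if ggseqlogo is installed; `seqs` itself is left unchanged
if (requireNamespace("ggseqlogo", quietly = TRUE)) {
  p <- ggseqlogo::ggseqlogo(padded, seq_type = "aa")
  print(p)
}
```

Note that right-padding only lines up the N-terminal ends; positions near the C-terminus will not correspond across sequences.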
You could add an X (undetermined/any AA) at the end to make these the same length and then plot a logo.
Hmm, instead of adding continuous gaps up to the maximum length, why not perform an alignment (say, with Clustal Omega)? That way you get an optimal alignment with optimally placed gap characters.
Just to cross-check, I ran a multiple sequence alignment on the sequences provided in the question, copied the MSA into R, and used ggseqlogo to create the logo. I guess it worked perfectly fine.
[Images in the original post: Input, MSA, Logo]
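For anyone who prefers to stay inside R rather than copying an alignment from an external tool, the same workflow might look roughly like this. It is only a sketch under assumptions: it uses the Bioconductor package msa (not mentioned in this thread) to run Clustal Omega, takes just a few of the sequences from the question, and the names `seqs`, `aln` and `aligned` are illustrative. It is not the exact procedure used for the logo above.

```r
# A few of the sequences from the question
seqs <- c("CASSLRGQGVEKLFF", "CASLSQGTEAFF", "CASSVGPGQTEAFF",
          "CASSPDREQYF", "CASSVFYDSGANVLTF")

# Run only if both packages are installed (msa comes from Bioconductor)
if (requireNamespace("msa", quietly = TRUE) &&
    requireNamespace("ggseqlogo", quietly = TRUE)) {
  library(msa)

  # Align with Clustal Omega; gaps are inserted as "-"
  aln <- msa(Biostrings::AAStringSet(seqs), method = "ClustalOmega")

  # Convert the alignment to a plain character vector of equal-width rows
  aligned <- as.character(aln)

  # Plot the logo from the aligned sequences
  print(ggseqlogo::ggseqlogo(unname(aligned), seq_type = "aa"))
}
```

Because the gaps live only in `aligned`, the original `seqs` vector stays available for any later analysis.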