To put the recommended solution first: in R it is usually best to rely on existing packages and functions. The Biostrings package provides the function alphabetFrequency, which counts the residues for you:
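library(Biostrings)
# stringsAsFactors=FALSE keeps the sequences as character strings (before R 4.0 the default was factors)
mydata=read.table("C:...", header=TRUE, sep=",", stringsAsFactors=FALSE)[,8]
sapply(mydata, function (seq) {
  AAs=alphabetFrequency(AAString(seq))
  pie(AAs[AAs>0]) # drop all zero counts, otherwise the pie gets ugly
})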
Let's start by making your code work as it is, because you are close to a working solution. After that, let's remove the beginner mistakes.
AAs=list() # list instead of vector
BLAST_AA_seqs=c()
for(i in 1:nrow(mydata)){
  print(i)
  BLAST_AA_seqs[i]=mydata[i,8]
  AAs[[i]]=table(strsplit(BLAST_AA_seqs[i],"", useBytes=TRUE)) ## lists use [[]] instead of [] for indexing
  pie(AAs[[i]], col=rainbow(length(AAs[[i]])), main="Residue abundance") # one colour per residue of this sequence
}
Look at the output of table for a single sequence:
A L M T U X
2 1 3 1 1 3
A list, not a vector, is the right data type for storing a collection of other data objects in R, because all elements of a vector must have the same type. AAs[i] refers to a single vector element of length 1, whereas the table holds several counts, so it cannot replace that one element. A list, by contrast, can contain objects of any type and length. Note also that pie can only plot one vector at a time.
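A minimal sketch of that difference, assuming a made-up sequence (MAMTLXAUXMX is purely illustrative and was chosen only because it gives the counts shown above; the variable names are hypothetical too):

```r
# Hypothetical sequence, not taken from the real data
example_seq <- "MAMTLXAUXMX"
counts <- table(strsplit(example_seq, "")[[1]])  # one count per distinct residue

# Assigning the whole table to a single vector element raises a warning,
# because six counts do not fit into one slot
as_vector <- c()
warned <- tryCatch({ as_vector[1] <- counts; FALSE }, warning = function(w) TRUE)

# A list slot accepts the complete table unchanged
as_list <- list()
as_list[[1]] <- counts

print(warned)
print(as_list[[1]])
```

The tryCatch is only there to make the warning visible as a value; in the loop you would simply use the list.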
Reduced solution with a loop. There is a lot of redundancy in the first version of the code; in principle it is a one-liner.
mydata=read.table("C:...", header=TRUE, sep=",", stringsAsFactors=FALSE)[,8]
# keep only column 8, we don't need to keep the data we do not need
# mydata=as.matrix(mydata) # not necessary, a dataframe can be indexed just fine
# AAs=c()
# BLAST_AA_seqs=c() # not needed, as you do not want to store the data for later
for(i in seq_along(mydata)){ # seq_along is safe even if mydata is empty
  seq = mydata[i] # use a temporary variable instead
  AAs = table(strsplit(seq,"", useBytes=TRUE))
  pie(AAs, col=rainbow(length(AAs)), main="Residue abundance")
}
Idiomatic R usually avoids explicit for loops in favour of apply, sapply, etc.:
mydata=read.table("C:...", header=TRUE, sep=",", stringsAsFactors=FALSE)[,8]
pdf() # you will get a lot of plots; this writes them to a pdf (Rplots.pdf by default) with one plot per page
sapply(mydata, function (seq) {
  AAs = table(strsplit(seq, "", useBytes=TRUE))
  pie(AAs, col=rainbow(length(AAs)), main="Residue abundance")
})
dev.off()
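Since the CSV path is elided here, the following is a small self-contained sketch of the same pattern that can be run without the file. The two sequences, the temporary output file and the variable names are assumptions for illustration only; lapply is used instead of sapply so that the counts come back as a plain list.

```r
# Hypothetical sequences standing in for column 8 of the CSV
seqs <- c("MAMTLXAUXMX", "MKVLAAGIVGL")

out_file <- tempfile(fileext = ".pdf")  # temporary file, one plot per page
pdf(out_file)
counts <- lapply(seqs, function(s) {
  AAs <- table(strsplit(s, "")[[1]])    # residue counts for one sequence
  pie(AAs, col = rainbow(length(AAs)), main = "Residue abundance")
  AAs                                   # return the counts as well
})
dev.off()

print(counts[[1]])
```

Returning the table from the function keeps the counts available for later use, in case you want more than the plots.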
http://stackoverflow.com/questions/27245246/a-for-loop-with-strsplit-in-r-error
Note that the first answer there is wrong.