4.4 years ago
kbaitsi
I have a TSV file with 61 columns and 18703 lines (genes). I want to convert it into an appropriate data frame in order to perform an ANOVA. The TSV file contains 6 conditions (WT, TG, A, B, C, D). I have written the following code
f<-read.table(file = "GeneExpressionDataset_normalized.tsv", sep="\t", header=TRUE)
data.frame(Expression=as.numeric(f[1,2:61]), Condition = c(rep("WT", 10), rep("TG", 10), rep("A", 10), rep("B",10), rep("C", 10), rep("D", 10)))
for the first line but I am not sure how to loop this in order to get a dataframe for all the lines.
I tried
ff<-sapply(1:nrow(f),function(i){
x<-as.numeric(f[i,2:61])
data.frame(Expression=x, Condition = c(rep("WT", 10), rep("TG", 10), rep("A", 10), rep("B",10), rep("C", 10), rep("D", 10)))
})
and
a <- for (i in 1:nrow(f)){
data.frame(Expression=as.numeric(f[i,2:61]), Condition = c(rep("WT", 10), rep("TG", 10), rep("A", 10), rep("B",10), rep("C", 10), rep("D", 10)))
}
but it's not working. Any suggestions?
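A minimal sketch of one way the loop could be written, using simulated data in place of the real file. It assumes the 60 expression columns are ordered WT, TG, A, B, C, D with 10 samples each, as in the code above; the names `per_gene` and `p_values` are hypothetical. `lapply` is used because it always returns a list, whereas `sapply` tries to simplify the result and a `for` loop itself returns `NULL`.

```r
# Simulated stand-in for the real table: one gene ID column plus 60
# expression columns (the real file would be read with read.table).
set.seed(1)
n_genes <- 5
f <- data.frame(Gene = paste0("gene", seq_len(n_genes)),
                matrix(rnorm(n_genes * 60), nrow = n_genes))

# Condition labels; assumes columns 2:61 are ordered WT, TG, A, B, C, D
# with 10 samples per condition.
groups <- c("WT", "TG", "A", "B", "C", "D")
condition <- factor(rep(groups, each = 10), levels = groups)

# One data frame per gene, collected in a list.
per_gene <- lapply(seq_len(nrow(f)), function(i) {
  data.frame(Expression = as.numeric(unlist(f[i, 2:61])),
             Condition = condition)
})
names(per_gene) <- f$Gene

# One-way ANOVA per gene; keep the p-value of the Condition term.
p_values <- sapply(per_gene, function(d) {
  summary(aov(Expression ~ Condition, data = d))[[1]][["Pr(>F)"]][1]
})
```

With 18703 genes this runs one model per gene, and the resulting p-values would still need a multiple-testing correction, for example with `p.adjust(p_values, method = "BH")`.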
What is the final goal? Differential expression?
Yes, that's right...
Then why not use established, well-tested and specialised software such as limma? Please go through its very extensive vignette. Other options for DE are DESeq2 or edgeR, but these strictly require raw counts. You seem to have normalized counts, so the limma-trend pipeline could be an option.
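As a rough illustration of the limma-trend suggestion, the sketch below fits a six-group model and runs a moderated F-test across conditions, which plays the role of a per-gene one-way ANOVA. It uses simulated values, and it assumes the real data are normalized log-scale expression values and that samples are ordered WT, TG, A, B, C, D with 10 per group; `expr` and `res` are hypothetical names.

```r
library(limma)

# Simulated log-expression matrix: genes in rows, 60 samples in columns.
# With the real file this could be as.matrix(f[, 2:61]) with gene IDs as row names.
set.seed(1)
n_genes <- 200
expr <- matrix(rnorm(n_genes * 60, mean = 8), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("s", 1:60)))

# Assumed sample layout: 10 samples per condition, WT as the reference level.
groups <- c("WT", "TG", "A", "B", "C", "D")
condition <- factor(rep(groups, each = 10), levels = groups)
design <- model.matrix(~ condition)

# Linear model per gene, then empirical Bayes moderation with a
# mean-variance trend (the "limma-trend" approach).
fit <- lmFit(expr, design)
fit <- eBayes(fit, trend = TRUE)

# Moderated F-test over all non-intercept coefficients, i.e. any
# difference between the six conditions.
res <- topTable(fit, coef = 2:6, number = Inf)
```

The `adj.P.Val` column of `res` holds Benjamini-Hochberg adjusted p-values; specific pairwise comparisons would need contrasts, which the limma user's guide describes.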