TSV file to dataframe in R for ANOVA

4.4 years ago

kbaitsi • 0

I have a TSV file with 61 columns and 18703 lines (genes). I want to convert it into an appropriate dataframe in order to perform an ANOVA. The TSV file contains 6 conditions (WT, TG, A, B, C, D). I have written the following code

f<-read.table(file = "GeneExpressionDataset_normalized.tsv", sep="\t", header=TRUE) 
data.frame(Expression=as.numeric(f[1,2:61]), Condition = c(rep("WT", 10), rep("TG", 10), rep("A", 10), rep("B",10), rep("C", 10), rep("D", 10)))

for the first line but I am not sure how to loop this in order to get a dataframe for all the lines.

I tried

ff<-sapply(1:nrow(f),function(i){
  x<-as.numeric(f[i,2:61])
  data.frame(Expression=x, Condition = c(rep("WT", 10), rep("TG", 10), rep("A", 10), rep("B",10), rep("C", 10), rep("D", 10)))  
})

and

a <- for (i in 1:nrow(f)){
  data.frame(Expression=as.numeric(f[i,2:61]), Condition = c(rep("WT", 10), rep("TG", 10), rep("A", 10), rep("B",10), rep("C", 10), rep("D", 10)))
}

but it's not working. Any suggestions?
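For reference, a minimal sketch of one way to get a per-gene result. It assumes that columns 2:61 are ordered as ten replicates each of WT, TG, A, B, C and D (as the `rep()` calls above imply), and it uses a small simulated table in place of the real file; the names `condition`, `per_gene` and `p_values` are illustrative only. The two attempts above fail for different reasons: a `for` loop returns `NULL` unless each result is stored explicitly, and `sapply` tries to simplify the data frames into a matrix, whereas `lapply` keeps one data frame per gene in a list.

```r
# Simulated stand-in for the real file: 5 genes x 60 samples (hypothetical data).
set.seed(1)
f <- data.frame(Gene = paste0("gene", 1:5),
                matrix(rnorm(5 * 60), nrow = 5))

# Condition labels; ASSUMES columns 2:61 are ordered WT, TG, A, B, C, D,
# with 10 replicates each.
groups <- c("WT", "TG", "A", "B", "C", "D")
condition <- factor(rep(groups, each = 10), levels = groups)

# lapply returns a list with one long-format data frame per gene.
per_gene <- lapply(seq_len(nrow(f)), function(i) {
  data.frame(Expression = as.numeric(unlist(f[i, 2:61])),
             Condition = condition)
})
names(per_gene) <- f$Gene

# One-way ANOVA per gene; keep only the p-value of the Condition term.
p_values <- vapply(per_gene, function(d) {
  summary(aov(Expression ~ Condition, data = d))[[1]][["Pr(>F)"]][1]
}, numeric(1))

# With thousands of genes, the p-values need multiple-testing correction.
p_adjusted <- p.adjust(p_values, method = "BH")
```

With 18703 genes this runs one model per gene, so it may be slow, and the adjusted p-values rather than the raw ones should be used to call genes.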

tsv dataframe r anova loop • 1.1k views


What is the final goal? Differential expression?

4.4 years ago by ATpoint 88k

Yes, that's right...

4.4 years ago by kbaitsi • 0

Then why not use established, well-tested and specialised software such as limma? Please go through its very extensive vignette. Other options for DE are DESeq2 or edgeR, but these strictly require raw counts. You seem to have normalized counts, so the limma-trend pipeline could be an option.
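A minimal sketch of what a limma-trend run could look like, under several assumptions that are not confirmed in this thread: the normalized values are on a log2 scale, the 60 sample columns are ordered as ten replicates each of WT, TG, A, B, C and D, and the matrix `expr` below is simulated data standing in for the real table. The limma calls are guarded so the design-matrix part runs even where limma is not installed.

```r
# Simulated stand-in: 100 genes x 60 samples of log2-scale expression
# (hypothetical data, not the real file).
set.seed(1)
expr <- matrix(rnorm(100 * 60), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("s", 1:60)))

# ASSUMED sample order: 10 replicates per condition, WT first (reference level).
groups <- c("WT", "TG", "A", "B", "C", "D")
condition <- factor(rep(groups, each = 10), levels = groups)

# Design with an intercept (WT) and one coefficient per remaining condition.
design <- model.matrix(~ condition)

if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::lmFit(expr, design)
  # trend = TRUE lets the prior variance depend on mean expression (limma-trend).
  fit <- limma::eBayes(fit, trend = TRUE)
  # Testing coefficients 2:6 together gives a moderated F-test per gene,
  # the analogue of a one-way ANOVA across the six conditions.
  res <- limma::topTable(fit, coef = 2:6, number = Inf)
  head(res)
}
```

Testing all non-intercept coefficients at once answers "does this gene differ between any of the conditions"; for specific pairwise comparisons, the vignette's sections on contrasts would be the place to look.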

updated 4.4 years ago by Ram 45k • written 4.4 years ago by ATpoint 88k
