Hello everyone,
I want to find differentially expressed genes between sensitive and non-sensitive cells using a simple T-test and FDR control.
To generate the expression data for me:
library(affy)
library(qvalue)
cell.lines<-read.csv("sensitivity.csv",row.names=1,header=TRUE,sep=",")
pheno<-as.data.frame(cell.lines[,2],row.names=row.names(cell.lines))
names(pheno)<-c("sensitivity")
pheno<-new("AnnotatedDataFrame", data = pheno)
wangData<-ReadAffy(phenoData=pheno)
wangExpr<-rma(wangData)
sampleNames(wangExpr)
featureNames(wangExpr)
description(wangExpr)
annotation(wangData)
pData(phenoData(wangData))
dim(exprs(wangExpr))
head(exprs(wangExpr))
x<-exprs(wangExpr)
then I got as an example the following records:
1. cell.lines:
celline drugsens
GSM243480 PC3 s
GSM243481 DU145 s
GSM243482 LNCaP s
GSM243483 22Rv r
where s = sensitive and r = recurrent
2. x:
GSM243480.CEL GSM243481.CEL GSM243482.CEL GSM243483.CEL
1007_s_at 8.947219 9.126816 9.061964 9.571040
1053_at 8.892565 8.663903 8.610118 10.069438
117_at 4.657418 4.589023 4.437430 4.574685
121_at 7.232597 7.349899 7.597990 8.310655
1255_g_at 3.335694 3.359144 3.351804 3.201163
the columns are the cells and the lines the genes.
I've been thinking that I can write me a function that computes me the test:
A wrapper function to perform a t-test for all genes
my.t.test <- function(x, group, alternative = "two.sided"){
x1 <- t(x[group[,1]])
x2 <- t(x[group[,2]])
tres <- t.test(x = x1, y = x2, alternative = alternative)
fc <- mean(x1) - mean(x2)
stat <- tres$statistic
if(alternative == "two.sided"){
stat <- abs(stat)
}
res<-matrix(c(stat, tres$p.value, fc), ncol=3)
colnames(res)<-c("statistic","p.value","fc")
return(res)
}
test sensitiv versus non sensitiv genes:
sVr<-cbind(cell.lines$drugsens=="s",cell.lines$drugsens=="r")
colnames(sVr)<-c("sensitiv", "rezesiv")
SensitivVersusRezesiv <- apply(X=x, MARGIN=2, FUN=my.t.test, group=sVr)
SensitivVersusRezesiv <- as.data.frame(t(SensitivVersusRezesiv))
colnames(SensitivVersusRezesiv)<-c("statistic","p.value","fc")
SensitivVersusRezesiv <- cbind(t(x[0,]),SensitivVersusRezesiv)
The problem is that I now only get one record, with the statistics and p-values for one cell type, but not for the genes. At the moment I can not go any further. Do you happen to have an idea how to solve this better?
Thank you for your help!
please tell me you've got more than one resistant cell line; and please learn
limma
https://bioconductor.org/packages/release/bioc/html/limma.htmlI don't think this is a good idea. There are advanced statistical frameworks for differential expression analysis.
Yes, but that should be our job. Do you know how to get it out with the t-test?
https://ibb.co/eO9BVe
Please use
ADD COMMENT
orADD REPLY
to answer to previous reactions, as such this thread remains logically structured and easy to follow. I have now moved your reaction but as you can see it's not optimal. Adding an answer should only be used for providing a solution to the question asked.Right. Sorry, I just did not pay attention. I will pay attention to it now.
You will also find this useful: How to add images to a Biostars post
Using a straight t-test is a fools errand. Learn how to do it using moderated t-tests in limma
But I can not use that. In the picture I have attached, it says that I should do it only with a t-test first. For me it does not work at the moment. Is "limma" the same?
So this is an assignment?
Yes, I have more than one. This should only be an excerpt.