Mutation effect on expression (RNA-Seq) using limma
1
0
Entering edit mode
8.2 years ago

Hi,

I'm investigating the potential effect of several mutations on gene expression. I've RNA-seq from ~30 tumors and 4 control samples. I screened for these 30 tumors about 20 genes for mutations. I translated these mutation data into a binary matrix (rows = samples, columns = genes ; 0 = not mutated, 1 = mutated).

For the RNA-Seq I aligned with STAR and counted the number of reads per gene with featurecounts. Then I merged everything into a big read count matrix (column = sample ; rows=gene).

Now I want to detect genes which expression is affected by these mutations. So I used limma as follow:

The design matrix harbors a column for each gene (~20), and an additional column indicating if the samples is a tumor or not. I also add an offset column. For example :

mut dataframe :

         offset geneA   geneB   geneC   isTumor
sample1      1      0       1       1         1
sample2      1      1       1       0         1  
sample3      1      0       0       1         0

So :

design <- mut
design <- design[,colSums(design)>2] # remove gene that is mutated in less than 3 samples
design <- cbind(offset=1,design) # add offset
# egdeR
dge <- DGEList(counts=counts)
dge <- calcNormFactors(dge)
cpm <- cpm(dge,log=T)
keep.exprs <- rowSums(cpm>1)>=3 # min 3 samples with CPM > 1
dge <- dge[keep.exprs,]
# limma voom
geneExpr <- voom(dge,design)
glm = lmFit(geneExpr$E, design = design) 
glm = eBayes(glm)

testResults <- decideTests(glm, method="global",adjust.method="BH", p.value=0.05)[,-1]

Now in testResults I've a matrix containing for each gene and each covariate (the 20 screened genes for mutation) if it's up (=1) or down regulated (=-1) ; (no significant change = 0). Is that correct for you ?

Thanks

edit : I posted the question to bioconductor forum as adviced by Goutham Atla. https://support.bioconductor.org/p/86092/

limma RNA-Seq • 2.5k views
ADD COMMENT
0
Entering edit mode

You would get much quicker response on https://support.bioconductor.org for questions related to statistics and bioconductor packages. Most of the authors are very active there.

ADD REPLY
0
Entering edit mode
8.2 years ago

What you're doing doesn't make any sense. You're applying the same model matrix to every gene. What you want is a different model matrix for each gene. That is, for each gene you want a design of the form ~tumour + mutant, where each of those is a factor with two levels. How to do this has been covered elsewhere.

ADD COMMENT
0
Entering edit mode

Thanks Devon. FYI I based my analysis on this paper : http://www.nature.com/ncomms/2015/150109/ncomms6901/full/ncomms6901.html . Supplementary data 2 contains the original R code I used to wrote the few lines above. So I'm confused now because their experimental design is quiet similar..

ADD REPLY
0
Entering edit mode

Perhaps I'm misunderstanding the question you're trying to answer with the test then. I assumed you were trying to see if, in your tumor and normal samples, the presence of a mutation in gene X had an effect on the expression of gene X. If, instead, you want to see if mutations in gene X might be affecting gene Y or Z then this method makes sense.

ADD REPLY
1
Entering edit mode

Yes I want to see the effect of mutation in gene X on all the genes in the count table. And that for all of the 20 genes I screened.

ADD REPLY
0
Entering edit mode

OK, then it totally makes sense to me, for whatever that's worth :)

ADD REPLY

Login before adding your answer.

Traffic: 1646 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6