Question

Mutation effect on expression (RNA-Seq) using limma

0

Entering edit mode

9.0 years ago

Nicolas Rosewick 11k

Hi,

I'm investigating the potential effect of several mutations on gene expression. I've RNA-seq from ~30 tumors and 4 control samples. I screened for these 30 tumors about 20 genes for mutations. I translated these mutation data into a binary matrix (rows = samples, columns = genes ; 0 = not mutated, 1 = mutated).

For the RNA-Seq I aligned with STAR and counted the number of reads per gene with featurecounts. Then I merged everything into a big read count matrix (column = sample ; rows=gene).

Now I want to detect genes which expression is affected by these mutations. So I used limma as follow:

The design matrix harbors a column for each gene (~20), and an additional column indicating if the samples is a tumor or not. I also add an offset column. For example :

mut dataframe :

         offset geneA   geneB   geneC   isTumor
sample1      1      0       1       1         1
sample2      1      1       1       0         1  
sample3      1      0       0       1         0

So :

design <- mut
design <- design[,colSums(design)>2] # remove gene that is mutated in less than 3 samples
design <- cbind(offset=1,design) # add offset
# egdeR
dge <- DGEList(counts=counts)
dge <- calcNormFactors(dge)
cpm <- cpm(dge,log=T)
keep.exprs <- rowSums(cpm>1)>=3 # min 3 samples with CPM > 1
dge <- dge[keep.exprs,]
# limma voom
geneExpr <- voom(dge,design)
glm = lmFit(geneExpr$E, design = design) 
glm = eBayes(glm)

testResults <- decideTests(glm, method="global",adjust.method="BH", p.value=0.05)[,-1]

Now in testResults I've a matrix containing for each gene and each covariate (the 20 screened genes for mutation) if it's up (=1) or down regulated (=-1) ; (no significant change = 0). Is that correct for you ?

Thanks

edit : I posted the question to bioconductor forum as adviced by Goutham Atla. https://support.bioconductor.org/p/86092/

limma RNA-Seq • 2.8k views

ADD COMMENT • link 9.0 years ago by Nicolas Rosewick 11k

0

Entering edit mode

You would get much quicker response on https://support.bioconductor.org for questions related to statistics and bioconductor packages. Most of the authors are very active there.

ADD REPLY • link 9.0 years ago by GouthamAtla 12k

score 0 · Answer 1 · 2016-08-11

0

Entering edit mode

9.0 years ago

Devon Ryan 105k

What you're doing doesn't make any sense. You're applying the same model matrix to every gene. What you want is a different model matrix for each gene. That is, for each gene you want a design of the form ~tumour + mutant, where each of those is a factor with two levels. How to do this has been covered elsewhere.

ADD COMMENT • link 9.0 years ago by Devon Ryan 105k

0

Entering edit mode

Thanks Devon. FYI I based my analysis on this paper : http://www.nature.com/ncomms/2015/150109/ncomms6901/full/ncomms6901.html . Supplementary data 2 contains the original R code I used to wrote the few lines above. So I'm confused now because their experimental design is quiet similar..

ADD REPLY • link 9.0 years ago by Nicolas Rosewick 11k

0

Entering edit mode

Perhaps I'm misunderstanding the question you're trying to answer with the test then. I assumed you were trying to see if, in your tumor and normal samples, the presence of a mutation in gene X had an effect on the expression of gene X. If, instead, you want to see if mutations in gene X might be affecting gene Y or Z then this method makes sense.

ADD REPLY • link 9.0 years ago by Devon Ryan 105k

1

Entering edit mode

Yes I want to see the effect of mutation in gene X on all the genes in the count table. And that for all of the 20 genes I screened.