Is it valid to compare the expression of pre-defined candidate genes from RNA-seq data?
7.0 years ago

Hi,

In addition to a classic differential expression analysis, I'd like to investigate the expression of pre-defined 'candidate genes' from my RNA-seq data.

What I've done: from a TMM-normalised transcript quantification matrix (the same kind of matrix that is used for differential expression analysis with edgeR, voom, DESeq2, etc.), I pulled out the genes I was interested in and scaled and log2-transformed the TMM values to produce heatmaps. These do show condition-specific expression patterns, but the same genes didn't show up in the differential expression analysis. I frame it as an exploratory analysis, not as a definitive differential expression analysis.

However, I cannot find any published workflow like this. I tried various ways of googling this question and could not find a single comment on such an approach, which I find very surprising. Can anyone help by providing sources or insights on this?
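For readers who want to see the transformation spelled out, here is a minimal sketch in plain Python of the subsetting and scaling step described above. It is not the exact code from the post. The gene names, the toy values, the pseudocount of 1 and the choice to z-score each gene across samples are all assumptions made for illustration.

```python
import math
import statistics


def log2_zscore_rows(tmm, candidates, pseudocount=1.0):
    """Return {gene: [z-scores across samples]} for candidates found in tmm.

    tmm maps gene id -> list of TMM-normalised values, one per sample.
    The pseudocount (an assumption) avoids taking log2(0).
    """
    scaled = {}
    for gene in candidates:
        if gene not in tmm:
            continue  # candidate absent from the quantification matrix
        logged = [math.log2(v + pseudocount) for v in tmm[gene]]
        mean = statistics.fmean(logged)
        sd = statistics.stdev(logged)  # sample SD (n - 1), as R's scale() uses
        if sd == 0:
            # Flat gene: no variation to display, so centre it at zero
            scaled[gene] = [0.0] * len(logged)
        else:
            scaled[gene] = [(x - mean) / sd for x in logged]
    return scaled


# Hypothetical toy matrix: three genes, four samples
tmm = {
    "geneA": [3.0, 7.0, 15.0, 31.0],
    "geneB": [10.0, 12.0, 11.0, 9.0],
    "geneC": [5.0, 5.0, 5.0, 5.0],
}
heatmap_rows = log2_zscore_rows(tmm, ["geneA", "geneC", "geneX"])
```

Scaling per gene puts every row on the same colour scale, so a heatmap shows relative patterns across samples rather than absolute expression levels.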

Thanks, Antoine

RNA-Seq • 1.7k views
7.0 years ago
theobroma22 ★ 1.2k

Just because you observe differences in your candidate genes doesn't mean they are DE genes. Typically a pre-defined p-value is used as a cut-off to define the DE genes. If your candidates don't fall at or below a p-value of 0.5, in your next experiment you could assume a greater p-value like 0.8, since you are currently establishing an a priori. Anyhow, how did you come up with your candidates? Have they been shown biologically to control the phenotype? Perhaps those are false positive candidates?

7.0 years ago

To be specific, I'm trying to investigate whether immune genes show condition-specific expression profiles. I first produced heatmaps with a set of immune genes that do show consistent differences between conditions across replicates. Then, I fitted a simple glm(scale(log2(TMMs))~condition), which showed significant estimates for condition. I understand that it can only be exploratory and has to be taken with caution, as there are likely false positives in there. But I'm also surprised not to find similar approaches anywhere I looked. Is it that flawed to interrogate a TMM matrix from RNA-seq data with good sample replication?
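To make the per-gene test concrete: for a two-level condition, the condition estimate from a Gaussian glm on scaled log2 values is the difference between the two group means. The sketch below computes that difference in plain Python and, as a swapped-in technique rather than the glm's own test, attaches an exact permutation p-value. The function name, the sample labels, the toy values and the pseudocount of 1 are assumptions for illustration.

```python
import math
import statistics
from itertools import combinations


def condition_effect(values, labels, pseudocount=1.0):
    """Return (effect, p) for one gene with a two-level condition.

    effect: mean z-scored log2 expression of the second level (sorted
            alphabetically) minus that of the first level.
    p:      exact two-sided permutation p-value over all relabellings.
    """
    levels = sorted(set(labels))
    if len(levels) != 2:
        raise ValueError("expected exactly two condition levels")

    logged = [math.log2(v + pseudocount) for v in values]
    mean, sd = statistics.fmean(logged), statistics.stdev(logged)
    if sd == 0:
        return 0.0, 1.0  # flat gene: nothing to test
    z = [(x - mean) / sd for x in logged]

    def mean_diff(group_b):
        b = [z[i] for i in range(len(z)) if i in group_b]
        a = [z[i] for i in range(len(z)) if i not in group_b]
        return statistics.fmean(b) - statistics.fmean(a)

    observed_b = {i for i, lab in enumerate(labels) if lab == levels[1]}
    observed = mean_diff(observed_b)

    # Enumerate every way of assigning the same number of samples to level 2
    diffs = [mean_diff(set(c))
             for c in combinations(range(len(z)), len(observed_b))]
    # Small tolerance so floating-point ties count as "as extreme"
    extreme = sum(abs(d) >= abs(observed) - 1e-12 for d in diffs)
    return observed, extreme / len(diffs)


# Hypothetical gene: three control and three treated replicates
values = [3.0, 7.0, 5.0, 15.0, 31.0, 23.0]
labels = ["control"] * 3 + ["treated"] * 3
effect, p_value = condition_effect(values, labels)
```

With three replicates per group there are only 20 possible relabellings, so the smallest two-sided p-value this test can return is 2/20 = 0.1, even for perfectly separated groups. When many candidate genes are tested this way, the per-gene p-values would still need adjusting for multiple testing before being interpreted.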


A bit of background: I work on a non-model organism, the Argentine ant, which has a sequenced genome but limited annotation and no actual functional characterisation. That is why I try to look at the big picture as much as I can: not much is known about the genes that turned out to be differentially expressed in the actual DE analysis.
