Using genes that are not significantly differentially expressed
8.1 years ago
Lluís R. ★ 1.2k

Most studies with microarrays and other technologies select differentially expressed (DE) genes by comparing the study cases with the controls. However, genes that are not significant or don't show a large change in expression may still play an important role through their interactions with other proteins and genes (which may themselves be DE).

Of course, it could be hard to separate the genes that don't change because they have no role in the disease/study case from those that don't change but do have a role in the process.

Has this not been done because it has been proven that only the DE genes are involved in these processes? Is there any paper that uses such an approach? Am I missing something with this idea?

DE theory • 6.3k views
ADD COMMENT
0

It could be that genes that are not significantly differentially expressed between disease and control still have a role in disease. What I mean is that the proteins they encode may be differentially expressed (translated) or modified (e.g., phosphorylated) between disease and control. You'll need proteomics data to see if this is true.

ADD REPLY
0

Of course, having another source of information could confirm that such a gene is important. But I was thinking about other approaches to microarray data that could also benefit these studies.

ADD REPLY
2
8.1 years ago

I think it is a matter of what you want to learn and (more importantly) what you are able to measure. An increase in a gene's RNA does not necessarily mean that the protein product is related to the condition (e.g. disease). As an example, b.nota already mentioned the influence of phosphorylation.

Then it also depends on the type of analysis you carry out. For example, you can analyze your DEGs in a transcription-regulation network to check whether they share a regulatory gene that could be responsible for the change in transcription (e.g., check out IPA Upstream Regulator Analysis for that). The regulator you identify may not be a DEG itself but may be a differentially activated protein, e.g. by phosphorylation.

To sum up, analysis of DEGs may not give you the answer for treatment, but neither does any other large-scale screening analysis. However, given the constraint that you only have expression data, DEGs are your best chance to start your search for genes associated with your condition. ... :-)

ADD COMMENT
0

With expression data one could (or should be able to) do more than just DEG analysis. But DE analysis is the easiest way to analyse expression data, and should not be dismissed.

ADD REPLY
0

Hmm... can you give an example of how to associate non-DEGs with your condition/experiment?

I think it may be hard to differentiate between non-DEGs associated with your condition and non-DEGs not associated with it. It is similar to statistical tests, where it is "easy" to prove that two samples are distinct (e.g., by means of a particular p-value) but much harder to prove that two samples are equal!

PS: What comes to my mind is using the data in non-parametric methods or without setting explicit thresholds. But even for these methods, the genes need to show some association with the experiment...
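
To make the point about "proving equality" a bit more concrete: one common way to frame it is an equivalence test (two one-sided tests, TOST), where you must first choose a margin within which you would call two groups "the same". The sketch below is only an illustration under assumptions: the expression values and margins are invented, and it uses a normal approximation from the Python standard library, whereas a t-distribution would be more appropriate for so few replicates.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost_equivalence(a, b, margin):
    """Two one-sided tests (TOST) using a normal approximation.

    Returns the TOST p-value. A small value supports the claim that
    |mean(a) - mean(b)| is smaller than `margin`.
    """
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist()
    p_lower = 1.0 - z.cdf((diff + margin) / se)  # H0: diff <= -margin
    p_upper = z.cdf((diff - margin) / se)        # H0: diff >= +margin
    return max(p_lower, p_upper)


# Hypothetical log2 expression values for one gene (invented for illustration)
control = [7.9, 8.1, 8.0, 8.2, 7.8, 8.0]
case = [8.0, 8.1, 7.9, 8.1, 7.9, 8.0]

# A generous margin lets us claim equivalence, a strict one does not
print("margin 0.5 :", tost_equivalence(control, case, 0.5))
print("margin 0.05:", tost_equivalence(control, case, 0.05))
```

The same data can be "equivalent" or not depending on the margin, which is one reason why showing that a gene is unchanged is harder than showing that it is changed.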

ADD REPLY
0

For example, the network-based methods that vakul.mohanty posted below, or others like DiffCoEx or WGCNA. In those tools, the association between genes and a particular condition/variable is usually measured with a correlation.
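
As a toy illustration of the correlation idea (not the actual WGCNA or DiffCoEx algorithms, which build whole networks and modules), the sketch below shows a gene whose mean expression is identical in both conditions, so it would not be called DE, while its correlation with a partner gene flips sign. All gene names and values are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(sxx * syy)


# Hypothetical log-expression values, invented for illustration
control = {"geneA": [5.0, 6.0, 7.0, 8.0, 9.0], "geneB": [2.0, 3.0, 4.0, 5.0, 6.0]}
disease = {"geneA": [5.0, 6.0, 7.0, 8.0, 9.0], "geneB": [6.0, 5.0, 4.0, 3.0, 2.0]}

# geneB has the same mean in both conditions: no differential expression
mean_shift = mean(disease["geneB"]) - mean(control["geneB"])

# ...but its relationship with geneA is reversed between conditions
r_control = pearson(control["geneA"], control["geneB"])
r_disease = pearson(disease["geneA"], disease["geneB"])

print("mean shift of geneB:", mean_shift)
print("correlation in control:", r_control)
print("correlation in disease:", r_disease)
print("change in correlation:", r_disease - r_control)
```

A real analysis would need many more samples and a proper significance assessment of the change in correlation.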

ADD REPLY
1
8.1 years ago
Farbod ★ 3.4k

Dear Lluís R, Hi

It is a very interesting topic you have started, and thank you for that.

I think the design of sampling has a key role here, too.

For example, when searching for sex-biased genes, I compared male and female gonads of fish (with biological replicates), and in the PCA nothing was very clearly separated. Maybe one reason is that the majority of the genes expressed in the gonads of the two sexes are autosomal (and maybe we should not expect such a separation in the PCA at all in these situations)!

Again, with regard to DEG analysis, FDR and fold change, the genes with the sharpest FC and expression were related to stress response (maybe the stress the fish had encountered), not to sex differentiation. So the peak of the volcano plot did not contain very interesting news for us!

In another experience, we used several DEG analysis packages (e.g. DESeq, DESeq2, voom, edgeR, ...), and some of them gave DEGs with a stronger biological meaning than the others. So which package you use may affect the next steps of the analysis.

Finally, if there is no DEG analysis pipeline, which threshold should we consider for capturing the genes that dictate a trait/situation, and what would be our "starting point"?

And please have a look at this beautiful paper, especially the part below:

"This analysis revealed that a factor even more important than mouse genotype was the experimenter performing the test, and that nociception can be affected by many additional laboratory factors including season/humidity, cage density, time of day, sex and within-cage order of testing."

~ Best

ADD COMMENT
1

I would recommend using multidimensional scaling (MDS) plots instead of PCA to grasp more information about the similarity of the samples. Sometimes PCA and MDS plots aren't similar, and MDS is more precise at revealing the differences between samples.

If most genes were autosomal, maybe you could check the model.matrix used for the DE analysis, and consider using surrogate variables (sva) or batch-effect estimates (together with all the proper normalizations done with voom, DESeq2, ...) to take into account the factors described in the paper you linked. At least that worked for me in one analysis.

About the topic under discussion here: the starting point can be the existing literature, information already known from other papers or databases, or the gene expression network (or any new method proven by anyone).

ADD REPLY
1
8.1 years ago
vakul.mohanty ▴ 270

There are methods that take into account the whole dataset instead of just using differentially expressed genes to identify functional relationships. Here's a paper you could use as a reference and maybe find a tool that best suits you http://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-13-282.

ADD COMMENT
0

Hi vakul.mohanty,

Very nice reference you have mentioned, thank you.

Can you please describe how NetWalker works and what its input data is (in the case of RNA-seq projects), if you have any personal experience with it?

ADD REPLY
1

From the tutorial, it seems one should provide a tab-delimited file as input; I would say RPKM or raw counts.

ADD REPLY
0

Thanks for the paper. Is there any R package for NetWalker?

ADD REPLY
0

I think it provides a standalone GUI. I think it can interact with R, but I'm not really sure how. https://www.ncbi.nlm.nih.gov/pubmed/20808879 should have the algorithmic details.

ADD REPLY
0
8.1 years ago
Satyajeet Khare ★ 1.6k

Hi Lluis,

Genes can also undergo differential processing. They can use alternate promoters/exons/termination sites. Such RNA processing events may or may not lead to differential expression. Moreover, such events are not picked up in routine microarray analysis. Differential processing can however significantly change the gene function. I have come across one onco-protein which is more potent in one splice form than the other.

Best

ADD COMMENT
0

That's precisely why I am surprised that there are so few methods for studying such genes. I am sure it is more common than I think, and could explain more about the disease than DEGs alone.

ADD REPLY
0

Such analysis isn't really possible with microarrays. However, if you have RNA-seq data you can address some of these limitations. RNA-seq allows you to look at differential usage of elements (exons, introns, etc.) with tools such as MISO (http://miso.readthedocs.io/en/fastmiso/#installing-fastmiso) and MAJIQ (http://majiq.biociphers.org/), while kma models the actual retention of introns in transcripts (https://github.com/pachterlab/kma/blob/master/vignettes/kma.Rmd). I'm pretty sure there's much more out there.

With the right sort of data and analytical approach there's much you can infer in addition to differential expression.
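
To illustrate what "differential usage" means without any change in expression, here is a naive sketch of the percent-spliced-in (PSI) ratio for a single exon. This is not how MISO or MAJIQ estimate it (they use probabilistic models); the read counts are hypothetical and chosen so that the total is the same in both conditions.

```python
def psi(inclusion_reads, exclusion_reads):
    """Naive percent-spliced-in: fraction of reads supporting exon inclusion."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover this splicing event")
    return inclusion_reads / total


# Hypothetical junction read counts for one exon (invented for illustration)
control = {"inclusion": 80, "exclusion": 20}
disease = {"inclusion": 30, "exclusion": 70}

# Same total signal in both conditions, so a gene-level test sees no change
total_control = control["inclusion"] + control["exclusion"]
total_disease = disease["inclusion"] + disease["exclusion"]

psi_control = psi(control["inclusion"], control["exclusion"])
psi_disease = psi(disease["inclusion"], disease["exclusion"])
delta_psi = psi_disease - psi_control

print("total reads:", total_control, "vs", total_disease)
print("PSI control:", psi_control)
print("PSI disease:", psi_disease)
print("delta PSI:", round(delta_psi, 3))
```

Here the gene would look unchanged at the expression level even though the dominant isoform switches.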

ADD REPLY
1

Thanks for the RNA-seq analysis tools! With microarray data it is possible to do studies other than DEG analysis; see my comment above for examples of tools that take other approaches to microarray expression data.

ADD REPLY
0
7.3 years ago
shania90.lk ▴ 30

Hi,

Not sure whether this was mentioned before in the posts, but when I was reading your explanation, off the top of my head I thought of gene co-expression networks. What you are saying is absolutely true: the genes that are differentially expressed may not be the only important genes in a particular situation. A transporter, for example, might not be a DE gene but may still very well be active during a stress condition. What I have found throughout my not-so-long PhD journey (I'm no expert at all and may well be missing something here) is that using gene co-expression networks to identify those genes helps in a context like yours. I would like to refer to this paper, which describes how the shift from differential gene expression to differential networking (specifically weighted gene co-expression networks) affects the study of the genetics of diseases.

Cheers,
Shani

ADD COMMENT
0

I was aware of the network point of view, but you brought a nice paper; it points to software (and thus new methodology) I hadn't heard of before.

ADD REPLY
0
7.3 years ago
theobroma22 ★ 1.2k

SPIA (Signalling Pathway Impact Analysis), a Bioconductor package, considers both non-DE and DE genes to find significant pathways. Researchers usually default to Gene Ontology and KEGG Orthology, but in contrast to these types of enrichment analyses, SPIA is a topology-based analysis, and so is considered more powerful in terms of the result.
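
For contrast, the classical enrichment analysis mentioned above is usually an over-representation test on the DE list alone. The sketch below is a plain hypergeometric tail probability and ignores pathway topology entirely, so it is not SPIA, which as far as I understand combines this kind of evidence with a perturbation score computed on the pathway graph. The gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_de, n_overlap):
    """P(X >= n_overlap) when n_de genes are drawn at random from the universe.

    n_universe: all genes measured
    n_pathway:  genes annotated to the pathway
    n_de:       genes called differentially expressed
    n_overlap:  DE genes that belong to the pathway
    """
    upper = min(n_pathway, n_de)
    total = comb(n_universe, n_de)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 genes measured, 50 in the pathway, 100 DE genes.
# By chance we would expect about 5 DE genes in the pathway.
p_enriched = hypergeom_enrichment_p(1000, 50, 100, 15)
p_expected = hypergeom_enrichment_p(1000, 50, 100, 5)

print("15 DE genes in pathway:", p_enriched)
print(" 5 DE genes in pathway:", p_expected)
```

Note that the only information entering this test is membership of the DE list; where a gene sits in the pathway plays no role.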

ADD COMMENT
0

Thanks, I am aware of SPIA (although I have never used it). All GSEA methods use both DE and non-DE genes.
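
To show why GSEA-style methods do not need a DE cutoff, here is a simplified sketch of the running-sum enrichment score over a full ranked gene list. It is the unweighted variant and leaves out the permutation step used to assess significance; the gene names and gene sets are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score over a full ranked gene list.

    Walk down the ranking, stepping up at genes in the set and down at
    all other genes, and return the largest deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking of all measured genes, most up-regulated first.
# No gene is removed for failing a significance threshold.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9", "g10"]

es_top = enrichment_score(ranked, {"g1", "g2", "g3"})      # clustered at the top
es_spread = enrichment_score(ranked, {"g1", "g5", "g10"})  # scattered

print("set at the top of the ranking:", es_top)
print("set spread over the ranking:  ", es_spread)
```

Every gene contributes a step to the running sum, including those that would never pass a DE threshold on their own.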

ADD REPLY
