Dear Biostars, Hi.
I have used DESeq2 for my RNA-seq DEG analysis (de novo assembly using Trinity, then RSEM, then DESeq2), with FDR = 0.05 as my threshold.
Now I have some genes that were up-regulated in one condition (e.g. in males) in similar studies, but in my results the same gene has log2FC = 5 and an FDR of 0.1 or more.
What is the best way to proceed?
I appreciate any helpful guidance.
~ Best
So the log2FC is rather high, but so is the FDR. To me this suggests that there is an effect for that gene, but some heterogeneity in your groups has made the overall effect insignificant. You could make a violin plot, boxplot or dot plot of normalised expression per sample for those genes (assuming the number is low enough for semi-manual investigation). That way you can judge whether the heterogeneity per group is high, whether you have outliers, and so on. Maybe there is a technical effect or covariate you didn't model yet.
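Before plotting, it can help to look at the raw per-sample numbers. Below is a minimal sketch of that check in plain Python. The expression values, sample names and the helper functions `summarise` and `text_dot_plot` are all made up for illustration; in practice you would load your own normalised matrix and most likely use a proper plotting library.

```python
from statistics import mean, stdev

# Hypothetical normalised expression values for ONE gene (made-up numbers).
# Note the single male sample that behaves like the females.
expression = {
    "male": {"M1": 310.0, "M2": 295.0, "M3": 4.0},
    "female": {"F1": 8.0, "F2": 11.0, "F3": 9.0},
}


def summarise(groups):
    """Return mean, standard deviation and coefficient of variation per group."""
    summary = {}
    for group, samples in groups.items():
        values = list(samples.values())
        m = mean(values)
        sd = stdev(values)
        summary[group] = {"mean": m, "sd": sd, "cv": sd / m}
    return summary


def text_dot_plot(groups, width=40):
    """Render one line per sample, with an 'o' placed proportionally to its value."""
    top = max(v for samples in groups.values() for v in samples.values())
    lines = []
    for group, samples in groups.items():
        for name, value in samples.items():
            pos = round(value / top * width)
            lines.append(f"{group:>6} {name:>3} |" + " " * pos + "o" + f" {value:g}")
    return lines


summary = summarise(expression)
plot_lines = text_dot_plot(expression)

for group, stats in summary.items():
    print(group, {k: round(v, 2) for k, v in stats.items()})
for line in plot_lines:
    print(line)
```

A group whose coefficient of variation is much larger than the other group's, or a single point sitting far away from its group in the plot, is the kind of heterogeneity worth investigating.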
Hi and thanks,
This gene has about 11 isoforms. Most of them (7) have an FC from 0.5 to 5 in males, but two of them are up-regulated in females, with an FC from -0.6 to -1.2.
The FDRs range from 0.1 to 1.
Does this extra information change your advice?
So you are performing differential expression on the isoform level? What if you collapse these isoforms and compare on the gene level?
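To illustrate what collapsing means here, this is a rough sketch that sums isoform counts per gene. It assumes Trinity-style identifiers where the trailing `_i<N>` marks the isoform; the IDs, the counts and the helper names `gene_id` and `collapse_to_genes` are invented for the example. RSEM normally writes gene-level results next to the isoform-level ones, so in a real analysis you would probably start from those rather than summing by hand.

```python
# Hypothetical isoform-level counts: isoform ID -> counts for 6 samples
# (3 males followed by 3 females). All numbers are made up.
isoform_counts = {
    "TRINITY_DN100_c0_g1_i1": [120, 130, 110, 10, 12, 9],
    "TRINITY_DN100_c0_g1_i2": [30, 25, 35, 40, 38, 45],
    "TRINITY_DN200_c0_g1_i1": [5, 7, 6, 6, 5, 8],
}


def gene_id(isoform_id):
    """Strip the trailing '_i<N>' isoform suffix (assumed Trinity-style naming)."""
    return isoform_id.rsplit("_i", 1)[0]


def collapse_to_genes(counts):
    """Sum the per-sample counts of all isoforms belonging to the same gene."""
    genes = {}
    for isoform, values in counts.items():
        gene = gene_id(isoform)
        if gene not in genes:
            genes[gene] = [0] * len(values)
        genes[gene] = [a + b for a, b in zip(genes[gene], values)]
    return genes


gene_counts = collapse_to_genes(isoform_counts)
for gene, values in gene_counts.items():
    print(gene, values)
```

Note that when isoforms go in opposite directions between the sexes, the gene-level sum can hide both effects, which is one reason to look at both levels.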
Upregulated but negative FC?
Good question. ;-)
In Trinity's gene-level DEG analysis pipeline, the FDR for this gene (all 11 isoforms combined) is 0.8 and the log2FC is -0.6, i.e. toward the females. This does not seem correct to me, because (1) most isoforms are up-regulated in males and (2) other studies showed that this gene is up-regulated in males.
What do you mean by "heterogeneity" here?
Would you kindly explain it in more detail, please?
I don't know how big your groups are, but let's use the following hypothetical, oversimplified example: you have 6 treated and 6 control samples for a certain gene X. When treated, most individuals will heavily upregulate gene X, well above the control expression. So there is differential expression. But 2 out of 6 treated individuals are a bit different (for example, they have a different gender, a mutation, a different diet, ...), which causes their expression of gene X to be a bit lower than in control individuals.
So overall, the treatment group has a higher expression of X than the controls, but due to heterogeneity in the treatment group this effect is masked/insignificant. A boxplot or violin plot would show you those outliers. If you identify the cause of such heterogeneity (if it exists in your study!), you could specify it in the design formula/matrix for DESeq2.
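To put rough numbers on that hypothetical example, here is a small sketch with invented expression values. It uses a plain Welch t-statistic as a stand-in; this is not the negative binomial test DESeq2 performs, it only shows how two atypical samples inflate the within-group variance and weaken the evidence even though the fold change stays large.

```python
from math import log2, sqrt
from statistics import mean, variance

# Made-up expression values for gene X.
control = [100, 105, 95, 98, 102, 100]
# Four treated samples respond strongly, two sit below the control level.
treated = [400, 420, 380, 410, 60, 70]
responders = treated[:4]


def log2_fold_change(a, b):
    """log2 of the ratio of group means (a over b)."""
    return log2(mean(a) / mean(b))


def welch_t(a, b):
    """Welch t-statistic: difference in means over its unpooled standard error."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


lfc_all = log2_fold_change(treated, control)
t_all = welch_t(treated, control)
t_responders = welch_t(responders, control)

print(f"log2FC, all treated samples:  {lfc_all:.2f}")
print(f"t-statistic, all treated:     {t_all:.2f}")
print(f"t-statistic, responders only: {t_responders:.2f}")
```

The fold change over all treated samples is still well above 1 on the log2 scale, but the test statistic is far smaller than when only the responders are considered. With thousands of genes and multiple-testing correction, that difference can easily decide whether a gene passes the FDR threshold.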
1- Thank you for your complete explanation.
I have 3 biological replicates for males and 3 for females (so 6 samples in total, and I am comparing DEGs between the sexes).
I have TMM values, which Trinity used to create the matrix and heatmap. Can I use the TMM values to draw my boxplot?
This gene has 11 isoforms, so I have 11*6 values (one per isoform per biological replicate). Do I have to use all of them for my boxplot?
2- Do you have any thoughts on alternative splicing and a different biological effect of these 11 isoforms between the sexes? Maybe 3 of them act in the testis but most of the others act in the ovary? Is that a defensible argument?
Which tissue are you investigating, and why do you expect a gender-effect there?
Using the TMM values would be fine for making boxplots or similar (I'm a big fan of violin plots).
Are all those 11 isoforms expressed in your tissue of interest? Because that's quite a lot... Surprises me (but of course it's not impossible).
Dear WouterDeCoster,
I am using the gonads of each sex.
Trinity usually reports several isoforms for each gene.
Sounds like a suitable tissue for gender-specific genes, although you then cannot separate a tissue effect (testis vs ovary) from a gender effect.