Hi,
Because I lack bioinformatics expertise, we paid a company to perform RNA sequencing for us. I received pre-analyzed RNA-seq data that was generated with DESeq2, with the Benjamini-Hochberg procedure used for the FDR calculation.
I have 4 groups: one control, one disease, and two disease groups treated with different drugs.
Only for the control vs. disease comparison did I get results files with both significant p and padj values. For one of the disease-drug comparisons, all the padj values are the same (0.99978212639514). Are there alternative ways to look at differential gene expression that do not rely on padj values?
Also, are there easy-to-use tools for visualizing this data? So far I have come across this website: http://www.webgestalt.org/
You might want to have a look at this post and its related comments: https://support.bioconductor.org/p/51916/#51952. As you will see there, getting the same padj value for multiple genes can be a direct consequence of the BH method, which is the one used in your case.
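To make the point about BH concrete, here is a minimal sketch of the Benjamini-Hochberg adjustment in plain Python. The function name and the example p-values are made up for illustration, and this is a simplified stand-in, not what DESeq2 runs internally (DESeq2 also sets some genes to NA before adjusting). BH multiplies each p-value by (number of tests / rank) and then forces the results to be monotone, so when no gene has a small p-value, every gene can end up with the adjusted value of the largest p-value.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping a running minimum so
    # that the adjusted values never increase as the raw p-value drops.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from a comparison with no real signal.
pvals = [0.2, 0.5, 0.7, 0.9, 0.95, 0.99]
padj = benjamini_hochberg(pvals)
print(padj)  # all six adjusted values are identical here
```

In this toy case every raw p-value times (m / rank) is larger than 0.99, so the running minimum stays at the value set by the largest p-value and all genes share it. That is the same pattern as a results file where every padj is 0.9997...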
Thank you.
If you paid them, then you should ask them. Seriously, the job of an analyst is not just to run some commands but to explain the output to the user. Did they do any QC? A lack of DEGs could indicate an underpowered study or batch effects.
Hi, yes, they did QC. I went through the report they provided and the individual steps with a bioinformatician from our university (only available 1h per week...). He said all the criteria for good-quality data are fulfilled and there seems to be no batch effect. In his view, this could also just mean that the drugs did not have an effect strong enough to be detected by bulk sequencing.
Yes, a true biological lack of DEGs is of course an option. I would visualize the fold changes; something like a volcano or MA plot comes to mind.
If you have the data, redo the x scale (log2 fold change). It seems there are genes with no or nearly no fold change but with high statistical significance. In my opinion, this is odd. Please look at the MA plots and at the thresholds used for filtering out the low-count genes.
The problem is that I have some raw data but no scripts or documentation on how the figures were plotted. Also, this figure is not editable. How would you want to redo the x-scale? Sorry if I am slow, but this is my first time looking at such data.
In most figures, the scale on the x-axis is centered on 0 (for fold changes). In the image you posted, it just runs from left to right (3 units between). I would prefer an x-axis scale that is symmetric around zero (and that is the whole point of the log scale).
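If the results table is available, the plot can be rebuilt without the company's scripts. Below is a rough sketch, not their method: it assumes a CSV with DESeq2's usual `log2FoldChange` and `pvalue` columns (your file may use different headers), and the gene rows, function names and the p-value floor are invented for illustration. It computes volcano plot coordinates and x-axis limits that are symmetric around zero.

```python
import csv
import io
import math

P_FLOOR = 1e-300  # assumed floor so that p = 0 does not break log10


def volcano_points(rows):
    """Return (log2FC, -log10(p)) pairs, skipping genes with missing values."""
    points = []
    for row in rows:
        try:
            lfc = float(row["log2FoldChange"])
            p = float(row["pvalue"])
        except (KeyError, ValueError):
            continue  # "NA" entries (e.g. filtered genes) land here
        if math.isnan(lfc) or math.isnan(p):
            continue
        points.append((lfc, -math.log10(max(p, P_FLOOR))))
    return points


def symmetric_xlim(points, pad=0.5):
    """Return x-axis limits centered on zero that cover every point."""
    biggest = max(abs(x) for x, _ in points)
    return (-(biggest + pad), biggest + pad)


# Made-up rows standing in for a real results file.
example_csv = """gene,log2FoldChange,pvalue,padj
geneA,2.0,0.001,0.01
geneB,-3.5,0.0001,0.002
geneC,0.1,0.8,0.95
geneD,NA,NA,NA
"""
rows = csv.DictReader(io.StringIO(example_csv))
points = volcano_points(rows)
xlim = symmetric_xlim(points)
print(points)
print(xlim)
```

The points and limits can then be handed to whatever plotting tool you have at hand. With limits centered on zero, up- and down-regulated genes sit at equal distances from the middle of the plot.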
I think the p-values have reached the lowest values that the software/platform can print, and the adjusted p-values are converted accordingly. Reach out to the service provider.
Thank you for the possible explanation. I will try to contact them, but these big companies don't have great customer service and it takes them forever to reply...
I think @Marco Pannone's explanation for the identical corrected p-values is correct.