I am learning to analyze RNA-seq data from GEO. I am aware that raw counts can be processed with edgeR or DESeq2 to obtain DEGs. However, while looking at the supplementary data for GSE130883, I found an expression table that looks like this:
ID e1 e2 e3
ENSMUSG00000069049 3.9853 3.98668 3.98668
ENSMUSG00000069045 2.86804 2.83166 2.80527
ENSMUSG00000068457 1.96894 1.99508 1.87452
ENSMUSG00000056673 2.2292 2.14263 2.02953
ENSMUSG00000025332 2.54212 2.56631 2.56794
Is there a way to obtain DEGs from these values? Is it possible to use limma or a t-test, or are there dedicated routines?
Thank you.
It is just a matrix of numbers - can you read the related manuscript to find out to what these numbers relate, exactly? Then, we can better advise.
Manuscript: Sex-Dependent Sensory Phenotypes and Related Transcriptomic Expression Profiles Are Differentially Affected by Angelman Syndrome.
Thank you for your response. First, I did not include all columns of the data in my original post. The column headings (12 in total) are as follows:
where
The study examines the effect of sex on the transcriptomic expression profiles of Angelman syndrome rats.
I hope I got the point of your question.
Okay, it is good that they have 3 replicates per group. I assume that these expression values are the normalised + transformed counts? In that case, they should be suitable for any downstream analysis that you want to perform, e.g., clustering, 'machine learning' stuff, etc. You can also justify the use of ANOVA, t-test, limma, etc. Just try to confirm how this data was produced, though - it must be stated somewhere in the Methods or Supplementary Methods.
I would also check the distribution of the data via hist() and boxplot()
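For anyone who wants to see the numbers behind such a boxplot, here is a minimal sketch in Python (the hist() and boxplot() calls above are R). It uses only the five rows quoted in the question, so it shows the idea and is not a real check of the full table. The names `rows`, `samples`, `five_number_summary` and `summaries` are made up for this example.

```python
import statistics

# The five rows quoted in the question (columns e1-e3 only).
rows = {
    "ENSMUSG00000069049": [3.9853, 3.98668, 3.98668],
    "ENSMUSG00000069045": [2.86804, 2.83166, 2.80527],
    "ENSMUSG00000068457": [1.96894, 1.99508, 1.87452],
    "ENSMUSG00000056673": [2.2292, 2.14263, 2.02953],
    "ENSMUSG00000025332": [2.54212, 2.56631, 2.56794],
}
samples = ["e1", "e2", "e3"]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a boxplot draws."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, med, q3, max(values))


# One summary per sample (column), so the samples can be compared side by side.
summaries = {}
for j, name in enumerate(samples):
    column = [vals[j] for vals in rows.values()]
    summaries[name] = five_number_summary(column)
    print(name, " ".join(f"{x:.3f}" for x in summaries[name]))
```

If the data really are normalised, the per-sample medians and quartiles should look broadly similar across all 12 columns; a sample whose summary is clearly shifted would be worth a closer look.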
Thank you for your response. Is there a "preferred" method of obtaining DEGs from data that are normalised + transformed?
Not that I am aware of. Once the main program (DESeq2, edgeR, etc.) normalises and transforms the data, it is basically saying: 'Do whatever you want with this data'. If you still want to err on the side of caution, then use non-parametric tests (Kruskal-Wallis ANOVA, Mann-Whitney U test, Wilcoxon signed-rank test, Spearman correlation, etc.).
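One caveat about the non-parametric route with 3 replicates per group: an exact rank test has very few possible outcomes at that sample size. The sketch below, in plain Python, computes an exact two-sided Mann-Whitney U p-value by enumerating every way of splitting the pooled values into two groups. The function `mann_whitney_exact` is written for this example and is not a library routine, and the expression values are hypothetical, chosen so that the two groups do not overlap at all.

```python
from itertools import combinations


def mann_whitney_exact(x, y):
    """Exact two-sided Mann-Whitney U p-value by full enumeration.

    Only practical for small groups; ties count as 0.5.
    """
    def u_stat(a, b):
        # Number of (a_i, b_j) pairs with a_i > b_j, ties counting 0.5.
        return sum((ai > bj) + 0.5 * (ai == bj) for ai in a for bj in b)

    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    centre = n * m / 2  # expected U when the groups do not differ
    observed = abs(u_stat(x, y) - centre)

    count = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(n + m) if i not in chosen]
        total += 1
        # Count relabelings at least as extreme as the observed split.
        if abs(u_stat(a, b) - centre) >= observed - 1e-12:
            count += 1
    return count / total


# Hypothetical values for one gene, 3 replicates per group, fully separated.
group_a = [2.1, 2.2, 2.3]
group_b = [3.1, 3.2, 3.3]
p = mann_whitney_exact(group_a, group_b)
print(p)  # 0.1
```

There are only 20 ways to split 6 values into two groups of 3, and 2 of them are as extreme as complete separation, so the smallest attainable two-sided p-value is 2/20 = 0.1. With 3 vs 3, then, no gene can reach p < 0.05 from this test, even before multiple-testing correction. That is one reason a moderated parametric approach such as limma may be the more practical choice here.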
If you are aiming to perform differential expression comparisons, then could you not obtain the original raw data and re-process it (and perform the comparisons within DESeq2, edgeR, etc.)?
Thanks for your time and response. Raw data is available in my case (from SRA), but I don't know how to analyze it.
I see. In that case, please use the supplementary data.