Differentially expressed genes from expression table (RNAseq)
5.5 years ago

I am learning to analyze RNA-seq data from GEO. I am aware that raw counts can be processed with edgeR or DESeq2 to obtain DEGs. However, while looking at the supplementary data for GSE130883, I found an expression table that looks like this:

    ID                  e1      e2      e3        
    ENSMUSG00000069049  3.9853  3.98668 3.98668
    ENSMUSG00000069045  2.86804 2.83166 2.80527
    ENSMUSG00000068457  1.96894 1.99508 1.87452
    ENSMUSG00000056673  2.2292  2.14263 2.02953
    ENSMUSG00000025332  2.54212 2.56631 2.56794

Is there a way to obtain DEGs from these values? Is it possible to use limma or a t-test, or are there dedicated routines?

Thank you.

DEG expression table RNA-seq GEO • 1.7k views

It is just a matrix of numbers - can you read the related manuscript to find out to what these numbers relate, exactly? Then, we can better advise.

Manuscript: Sex-Dependent Sensory Phenotypes and Related Transcriptomic Expression Profiles Are Differentially Affected by Angelman Syndrome.


Thank you for your response. First, I did not show all columns of the data in my original post. There are 12 column headings in total:

WT_M_6GCCAATL007SAll_PE
WT_M_5ACAGTGL007SAll_PE
WT_M_10TAGCTTL007SAll_PE
WT_F_8ACTTGAL007SAll_PE
WT_F_7CAGATCL007SAll_PE
WT_F_2CGATGTL007SAll_PE
AS_M_9GATCAGL007SAll_PE
AS_M_3TTAGGCL007SAll_PE
AS_M_12CTTGTAL007SAll_PE
AS_F_4TGACCAL007SAll_PE
AS_F_1ATCACGL007SAll_PE
AS_F_11GGCTACL007SAll_PE

where

WT = wild type
AS = Angelman syndrome
M/F = male/female
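
The naming scheme above can be turned into a sample sheet programmatically. The following is a minimal sketch, assuming every column heading starts with `<genotype>_<sex>_` as in the list above; the function names `parse_sample` and `group_samples` are illustrative and not part of any package.

```python
import re

# Column headings exactly as listed in the post.
SAMPLES = [
    "WT_M_6GCCAATL007SAll_PE", "WT_M_5ACAGTGL007SAll_PE", "WT_M_10TAGCTTL007SAll_PE",
    "WT_F_8ACTTGAL007SAll_PE", "WT_F_7CAGATCL007SAll_PE", "WT_F_2CGATGTL007SAll_PE",
    "AS_M_9GATCAGL007SAll_PE", "AS_M_3TTAGGCL007SAll_PE", "AS_M_12CTTGTAL007SAll_PE",
    "AS_F_4TGACCAL007SAll_PE", "AS_F_1ATCACGL007SAll_PE", "AS_F_11GGCTACL007SAll_PE",
]

# Assumed pattern: genotype (WT or AS), then sex (M or F), separated by underscores.
NAME_RE = re.compile(r"^(WT|AS)_(M|F)_")


def parse_sample(name):
    """Return (genotype, sex) for one column heading."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected sample name: {name}")
    return match.group(1), match.group(2)


def group_samples(names):
    """Map 'genotype_sex' labels to the sample names that belong to each group."""
    groups = {}
    for name in names:
        genotype, sex = parse_sample(name)
        groups.setdefault(f"{genotype}_{sex}", []).append(name)
    return groups


if __name__ == "__main__":
    for label, members in group_samples(SAMPLES).items():
        print(label, len(members))
```

Grouping the columns this way makes the design explicit: four groups (genotype by sex) with three samples each.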

The study examines the effect of sex on the transcriptomic expression profiles of Angelman syndrome mice.

I hope I understood your question.


Okay, it is good that they have 3 replicates per group. I assume that these expression values are normalised and transformed counts? In that case, they should be suitable for any downstream analysis that you want to perform, e.g., clustering, 'machine learning' stuff, etc. You can also justify the use of ANOVA, t-tests, limma, etc.

Just try to confirm how this data was produced, though; it should be stated somewhere in the Methods or Supplementary Methods.

I would also check the distribution of the data via hist() and boxplot().
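
For readers without R at hand, the same sanity check can be sketched in plain Python. This is only an illustration on the five rows quoted in the question; `looks_log_scaled` and its threshold of 25 are assumptions chosen for the example, not an established rule, and the real check should be run on the full table.

```python
import statistics

# The five rows quoted in the question (columns e1, e2, e3).
TABLE = {
    "ENSMUSG00000069049": [3.9853, 3.98668, 3.98668],
    "ENSMUSG00000069045": [2.86804, 2.83166, 2.80527],
    "ENSMUSG00000068457": [1.96894, 1.99508, 1.87452],
    "ENSMUSG00000056673": [2.2292, 2.14263, 2.02953],
    "ENSMUSG00000025332": [2.54212, 2.56631, 2.56794],
}


def column(table, j):
    """Extract column j (one sample) from the gene -> values mapping."""
    return [row[j] for row in table.values()]


def five_number_summary(values):
    """Min, lower quartile, median, upper quartile, max: what a boxplot displays."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


def looks_log_scaled(values, max_plausible=25.0):
    """Crude heuristic (an assumption): log-scale expression values stay small,
    whereas raw counts routinely reach the thousands."""
    return max(values) <= max_plausible


if __name__ == "__main__":
    for j, name in enumerate(["e1", "e2", "e3"]):
        values = column(TABLE, j)
        print(name, five_number_summary(values), looks_log_scaled(values))
```

If the per-sample summaries are similar across all 12 columns and the values sit in a narrow, small range, that is consistent with normalised and transformed data, but the Methods section remains the authoritative source.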


Thank you for your response. Is there a "preferred" method of obtaining DEGs from data that are normalised+transformed?


Not that I am aware of. Once the main program (DESeq2, edgeR, etc.) normalises and transforms the data, it is basically saying: 'Do whatever you want with this data'. If you still want to err on the side of caution, then use non-parametric tests (Kruskal-Wallis ANOVA, Mann-Whitney U test, Wilcoxon signed-rank test, Spearman correlation, etc.).
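
As an illustration of the non-parametric route, here is a minimal sketch of an exact two-sided rank-sum (Mann-Whitney) test for one gene, written in plain Python by enumerating every possible assignment of samples to groups. The function names are illustrative, and the approach is only practical for small groups such as those in this dataset.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def exact_rank_sum_test(group_a, group_b):
    """Two-sided exact p-value: the fraction of group assignments whose rank sum
    is at least as far from its null expectation as the observed one."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n = len(group_a), len(ranks)
    expected = n_a * (n + 1) / 2  # mean rank sum of group A under the null
    observed = abs(sum(ranks[:n_a]) - expected)
    hits = total = 0
    for idx in combinations(range(n), n_a):
        total += 1
        if abs(sum(ranks[i] for i in idx) - expected) >= observed - 1e-12:
            hits += 1
    return hits / total
```

One caveat follows directly from the enumeration: with 3 versus 3 samples there are only 20 possible assignments, so the smallest attainable two-sided p-value is 2/20 = 0.1, even for perfectly separated groups. Pooling sexes to compare 6 AS against 6 WT samples lowers that floor to 2/924, at the cost of ignoring the sex effect the study is about. Any per-gene p-values would still need multiple-testing correction.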

If you are aiming to perform differential expression comparisons, then could you not obtain the original raw data and re-process it (and perform the comparisons within DESeq2, edgeR, etc.)?


Thanks for your time and response. Raw data are available in my case (from SRA), but I don't know how to analyze them.


I see. In that case, please use the supplementary data.

5.5 years ago
ATpoint 85k

Not answering your question directly, but you can always download the raw data from the ENA (see "Fast download of FASTQ files from the European Nucleotide Archive (ENA)"), run a computationally inexpensive quantification pipeline such as salmon/tximport on them, and then proceed with raw counts using any of the established tools. Both salmon (for quantification of FASTQ files against a reference transcriptome in FASTA format) and tximport (to read the transcript counts into R and summarize them to the gene level) have very good documentation.
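
To give an idea of how little is involved, here is a minimal sketch that only assembles the two salmon command lines (it does not run them). It assumes paired-end reads, which the `_PE` suffix in the sample names suggests but does not prove; every file and directory name below is a hypothetical placeholder.

```python
import shlex


def salmon_index_cmd(transcriptome_fasta, index_dir):
    """Command to build a salmon index from a reference transcriptome FASTA."""
    return ["salmon", "index", "-t", transcriptome_fasta, "-i", index_dir]


def salmon_quant_cmd(index_dir, reads_1, reads_2, out_dir, threads=2):
    """Command to quantify one paired-end sample; '-l A' asks salmon to infer
    the library type automatically."""
    return [
        "salmon", "quant",
        "-i", index_dir,
        "-l", "A",
        "-1", reads_1,
        "-2", reads_2,
        "-p", str(threads),
        "-o", out_dir,
    ]


if __name__ == "__main__":
    # Hypothetical placeholder names; substitute the real reference and FASTQ files.
    print(shlex.join(salmon_index_cmd("transcripts.fa.gz", "salmon_index")))
    print(shlex.join(salmon_quant_cmd(
        "salmon_index", "sample_1.fastq.gz", "sample_2.fastq.gz", "quant/sample")))
```

Each output directory then contains a `quant.sf` file, which is what tximport reads in R before the gene-level counts are handed to DESeq2 or edgeR.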


Thank you for your response. Can salmon/tximport analyze human RNA-seq data on a laptop with 4 GB of RAM?


I think that should be OK, but I haven't tested exporting the quantifications to a genome alignment .bam file for visualization (and you may need a genome alignment if testing re-analysis with other programs).

However, checking the robustness of gene-level (or transcript-level) assignments between biological replicates is already a good start (and something that you can do on your laptop).
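
One simple way to look at replicate agreement is a pairwise correlation of expression values. The sketch below computes a Pearson correlation in plain Python on the five rows quoted in the question; treating columns e1 and e2 as replicates of the same group is an assumption made purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long value lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / math.sqrt(sxx * syy)


# Columns e1 and e2 from the table in the question (assumed to be replicates).
E1 = [3.9853, 2.86804, 1.96894, 2.2292, 2.54212]
E2 = [3.98668, 2.83166, 1.99508, 2.14263, 2.56631]

if __name__ == "__main__":
    print(round(pearson(E1, E2), 4))
```

On a full table, replicates within a group would be expected to correlate more strongly with each other than with samples from other groups; a replicate that does not is worth inspecting before any testing.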
