Heatmap: edgeR counts and DEG
6.4 years ago
RiNG ▴ 10

I have used edgeR in Galaxy to perform differential expression analysis. As output I have 3 files with a list of differentially expressed genes (comparison between 3 different groups of samples) with log(FC), log(CPM), FDR, etc. I also have a separate file containing the normalized counts, but for all the different samples within each group.

To make a heatmap of the differentially expressed genes, how can I cross-reference the counts file with the DEG files so that only the genes of interest are selected from the counts file?

Thanks in advance.

rna-seq heatmap edgeR • 4.3k views

Take your list of differentially expressed genes and extract the count data for those genes. Convert the counts to logCPM and use those values for the heatmap.
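A minimal sketch of the log-CPM step in base R, assuming a raw count matrix with genes as rows and samples as columns. The `counts` matrix and its values are invented for illustration. Note that edgeR's own `cpm(y, log = TRUE)` applies a prior count scaled by library size, so its values will differ slightly from this simplified version; if the counts file already holds normalized log-CPM values, this step can be skipped.

```r
# Hypothetical raw counts: 3 genes x 2 samples (values invented for illustration)
counts <- matrix(c(10, 0, 90,
                   20, 5, 175),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("sample1", "sample2")))

# Counts per million: divide each column by its library size, then scale by 1e6
lib_sizes <- colSums(counts)
cpm <- sweep(counts, 2, lib_sizes, "/") * 1e6

# Simple log2 transform with a pseudocount of 1 to avoid log2(0)
# (a simplification, not edgeR's exact prior-count method)
log_cpm <- log2(cpm + 1)
```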


It would really help us if you pasted samples of the data that you have. Otherwise, we can only speculate as to their formatting / structure.


How can I get the counts data for specific genes in an Excel file with >20,000 genes? Using the "Find" function would take ages.


How did your data end up in Excel? It is better to export the Excel data in TSV or CSV format and then read that into R.
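Reading a tab-separated file into R could look like the sketch below. The file contents and the names `tsv_path` and `counts` are made up so that the example runs on its own; in practice, `read.delim()` would point at the file downloaded from Galaxy.

```r
# Write a tiny stand-in TSV so the example is self-contained
# (hypothetical gene and sample names, invented values)
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c("GeneID\tsample1\tsample2",
             "geneA\t10.5\t20.1",
             "geneB\t0.0\t5.2"), tsv_path)

# read.delim() reads tab-separated text with a header row;
# row.names = 1 uses the first column (gene IDs) as row names;
# check.names = FALSE keeps the sample names exactly as written
counts <- read.delim(tsv_path, row.names = 1, check.names = FALSE)

dim(counts)   # number of genes and samples
head(counts)  # top-left corner of the table
```

Setting `check.names = FALSE` matters when sample names start with a digit, because R would otherwise prepend an `X` to make them syntactically valid column names.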


It was a TSV file I got as output in Galaxy after running edgeR.

If I open this file in RStudio, how can I then find the genes I am interested in?


Can you paste the top-left corner of the data? I assume that it is genes as rows and samples as columns?


GeneID 19_c2.trimmed.fastq.sorted.bam 24_c3.trimmed.fastq.sorted.bam .....

ENSSSCG000000...

ENSSSCG0000...

That is it; so yes, rows are genes and columns are samples (8 samples).


These are Ensembl gene IDs for Sus scrofa (pig). You can likely convert these using the biomaRt package in R. Galaxy should also have a gene conversion tool, no?


That is not a problem. My problem is that I have >20,000 genes in a counts file and I only want to select a few.

Is there a function in R that can find and return only the rows I am interested in, reducing the counts matrix from 20,000 to 100 genes of interest?


Yes, if these Ensembl IDs are the row names of your object, then just do this:

genesOfInterest <- c("ENSSSCG0000001", "ENSSSCG0000056", "ENSSSCG005555", "ENSSSCG000009", "ENSSSCG003332")

MyData[which(rownames(MyData) %in% genesOfInterest),]

There are a few ways of doing it, though.
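One of those other ways is to pull the gene list straight from a DEG table instead of typing the IDs by hand. The sketch below is only an illustration: the objects `counts` and `deg`, the column names `GeneID`, `logFC` and `FDR`, and all values are assumptions, so the names should be checked against the actual Galaxy output before use.

```r
# Hypothetical stand-ins for the two Galaxy outputs (all values invented)
counts <- data.frame(
  s1 = c(5.1, 2.0, 7.3, 3.3),
  s2 = c(5.0, 2.2, 7.1, 3.1),
  s3 = c(1.2, 2.1, 9.8, 3.2),
  s4 = c(1.0, 2.3, 9.9, 3.0),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)
deg <- data.frame(
  GeneID = c("geneA", "geneB", "geneC", "geneD"),
  logFC  = c(-2.1, 0.1, 1.4, 0.0),
  FDR    = c(0.001, 0.80, 0.01, 0.95)
)

# Keep genes passing an FDR cutoff (0.05 is a common but arbitrary choice)
sig_genes <- deg$GeneID[deg$FDR < 0.05]

# Subset the count table to those genes; intersect() guards against
# IDs that are missing from the counts file
keep <- intersect(sig_genes, rownames(counts))
deg_counts <- as.matrix(counts[keep, , drop = FALSE])

# Base R heatmap, scaling each gene (row) to mean 0 and SD 1
heatmap(deg_counts, scale = "row")
```

With three comparisons, the significant IDs from each DEG file could be combined first, for example with `unique(c(sig1, sig2, sig3))`, where `sig1` to `sig3` are hypothetical vectors built the same way as `sig_genes`.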


It works! Thank you for your time.
